Class CacheNode
- All Implemented Interfaces:
MessageHandler
Replication Model (v1): Full Replication
All nodes participate in a SINGLE Raft group. Every node holds ALL data. This is similar to etcd and ZooKeeper — simple, strong consistency, at the cost of storage efficiency for very large datasets.Writes: go through Raft consensus on the leader, replicated to all followers. Reads: served locally on any node (leader uses lease-based quorum reads).
Cluster Isolation
Every node is bound to aClusterConfig that carries
a cluster name/id. Nodes with different cluster ids ignore each other's JOIN
requests, which enables parallel integration tests with isolated clusters.
Lifecycle
1. start() — binds TCP port, starts accept loop 2. connectToSeeds — connects to seeds, exchanges cluster ID 3. heartbeat — pings peers every 2s 4. failure detect — marks peers dead after 6s silence 5. stop() — graceful shutdown-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic final recordstatic final record -
Field Summary
Fields -
Constructor Summary
ConstructorsConstructorDescriptionCacheNode(ClusterConfig config) Primary constructor — uses ClusterConfig.CacheNode(ClusterConfig config, @Nullable io.micrometer.core.instrument.MeterRegistry meterRegistry) Primary constructor with custom MeterRegistry for metrics collection.Legacy constructor — wraps into ClusterConfig with a random cluster UUID. -
Method Summary
Modifier and TypeMethodDescriptionvoidaddMigrationListener(MigrationListener listener) Register an operator listener for partition migration lifecycle events.voidaddSecurityInterceptor(SecurityInterceptor interceptor) Register a global security interceptor invoked around external client operations.voidaddSocketInterceptor(SocketInterceptor interceptor) Register a layer-4 socket interceptor for accepted inbound TCP connections.booleanawaitStarted(long timeout, TimeUnit unit) changeClusterVersion(LoomVersion newVersion) Change the cluster version gate and return the previous version.intReturns the number of active TCP connections to peer nodes.booleanforceCloseCpSession(String sessionId) Force-closes a CP session and releases its tracked resources.booleanForce the local Raft leader to step down for operator-driven failover or maintenance.Return the cluster version gate for rolling-upgrade operations.Return the current cluster operational state.Narrow view ofraftNodefor callers that only need the query/submitRaftNodeApi.@Nullable MessagehandleMessage(Message message, ConnectionContext sender) Handle an incoming message and optionally return a response.booleanLists instantiated CP/Raft groups for operator inspection.Lists active CP sessions for operator inspection.voidonPeerDisconnect(String peerId) Called by TcpServer when a peer connection is closed.static CacheNode.ParsedSeedParse a seed address supporting IPv4 (host:port) and bracketed IPv6 ([2001:db8::1]:port) forms.raftGroupForKey(String key) Resolve theRaftNodethat should propose mutations for the given key.voidRe-attempt connections to configured seed nodes.booleanremoveMigrationListener(MigrationListener listener) Remove a previously registered partition migration listener.booleanremoveSecurityInterceptor(SecurityInterceptor interceptor) Remove a previously registered security interceptor.booleanremoveSocketInterceptor(SocketInterceptor interceptor) Remove a previously registered layer-4 socket interceptor.Reset local CP subsystem state without touching AP data structures.voidStop a test-owned node without broadcasting a cluster leave.voidshutdownGracefully(Duration drainTimeout) Gracefully shut down the cache node with a configurable drain timeout.voidstart()voidStart the Raft subsystem after the node has already bootstrapped its networking stack.voidStart networking, discovery, and background services without arming the Raft election timer yet.voidstop()Gracefully shut down the cache node with a default drain timeout of 5 seconds.voidTransition this node's view of the cluster operational state.Trigger an operator Hot Backup intoClusterConfig.persistenceBackupDir().intReturns the number of verified cluster-peer connections.Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface MessageHandler
onPeerDisconnect
-
Field Details
-
PRODUCTION_ALLOW_STANDALONE_PROPERTY
- See Also:
-
WAL_IN_MEMORY_FALLBACK_PROPERTY
- See Also:
-
-
Constructor Details
-
CacheNode
Primary constructor — uses ClusterConfig. -
CacheNode
public CacheNode(ClusterConfig config, @Nullable io.micrometer.core.instrument.MeterRegistry meterRegistry) Primary constructor with custom MeterRegistry for metrics collection.- Parameters:
config- the cluster configurationmeterRegistry- the MeterRegistry for metrics (or null to use SimpleMeterRegistry)
-
CacheNode
-
-
Method Details
-
getRaftNodeApi
Narrow view ofraftNodefor callers that only need the query/submitRaftNodeApi. Used primarily so tests can depend on the interface (and supplymock(RaftNodeApi.class)) instead of the concrete Raft implementation. Prefer this overwhen concrete lifecycle methods are not needed.invalid reference
#getRaftNode() -
forceLeaderStepDown
public boolean forceLeaderStepDown()Force the local Raft leader to step down for operator-driven failover or maintenance.- Returns:
- true when this node was the leader and step-down was requested
-
triggerHotBackup
Trigger an operator Hot Backup intoClusterConfig.persistenceBackupDir(). The live WAL and snapshot stores are not modified or compacted.- Returns:
- metadata describing the backup manifest and group snapshot files
- Throws:
IOException- if writing the backup failsIllegalStateException- if no backup directory is configured
-
start
- Throws:
IOException
-
startWithoutRaft
Start networking, discovery, and background services without arming the Raft election timer yet. Intended for tests that need to form the TCP mesh before starting consensus.- Throws:
IOException
-
startRaft
public void startRaft()Start the Raft subsystem after the node has already bootstrapped its networking stack. Safe to call multiple times. -
shutdownGracefully
Gracefully shut down the cache node with a configurable drain timeout.Shutdown sequence:
- Set DRAINING state to reject new connections/requests
- Wait for in-flight requests to complete (up to drain timeout)
- Stop health checker and discovery mechanisms
- Step down from Raft leadership (if leader)
- Stop Raft consensus engine
- Stop other components (metrics, audit logger)
- Broadcast LEAVE message to cluster peers
- Close all connections and TCP server
- Parameters:
drainTimeout- the timeout to wait for in-flight requests to complete
-
shutdownForTest
public void shutdownForTest()Stop a test-owned node without broadcasting a cluster leave.Integration tests create many short-lived clusters in parallel. Treating every teardown as a production graceful leave makes surviving peers run membership-change and Raft-removal paths after the test already ended, which creates background churn that interferes with neighboring test clusters. Test shutdown is intentionally local: stop accepting work, stop Raft and background components, then close sockets.
-
stop
public void stop()Gracefully shut down the cache node with a default drain timeout of 5 seconds.This is a convenience method that calls
shutdownGracefully(Duration)with a 5-second drain timeout. It's suitable for most rolling restart scenarios. -
awaitStarted
- Throws:
InterruptedException
-
raftGroupForKey
Resolve theRaftNodethat should propose mutations for the given key.Routing semantics:
- When sharding is disabled (the default,
ClusterConfig.shardingEnabled()==false), returnsraftNode— the single v1.x consensus group. Callers that route through this helper see byte-for-byte identical behavior to rawraftNode.propose(...). - When sharding is enabled, delegates to
PartitionRouter.getRaftNodeForKey(String)which hashes the key into one ofnumGroupspartitions and returns the owningRaftGroupManagergroup (lazily creating it on first access).
This is the single chokepoint for key-based Raft-group routing. Handler code that needs to propose mutations keyed by a data-structure key should route through this method instead of touching
raftNodedirectly. Reads remain local to the registry and do not go through this helper.Per-partition snapshots: snapshots remain at the group level — each Raft group snapshots its own partitions as a single blob via
DataStructureRegistry.snapshotAllData(). Per-partition snapshots within a group (which would enable faster rebalancing of individual partitions) are deferred future work.- Parameters:
key- the data-structure key to route (must not be null)- Returns:
- the RaftNode that owns this key — never null
- Since:
- 2.0
- When sharding is disabled (the default,
-
addSecurityInterceptor
Register a global security interceptor invoked around external client operations. -
removeSecurityInterceptor
Remove a previously registered security interceptor. -
addSocketInterceptor
Register a layer-4 socket interceptor for accepted inbound TCP connections. -
removeSocketInterceptor
Remove a previously registered layer-4 socket interceptor. -
addMigrationListener
Register an operator listener for partition migration lifecycle events. -
removeMigrationListener
Remove a previously registered partition migration listener. -
getOperationalState
Return the current cluster operational state. -
getClusterVersion
Return the cluster version gate for rolling-upgrade operations. -
changeClusterVersion
Change the cluster version gate and return the previous version. -
resetCpSubsystem
Reset local CP subsystem state without touching AP data structures. -
listCpGroups
Lists instantiated CP/Raft groups for operator inspection. -
listCpSessions
Lists active CP sessions for operator inspection. -
forceCloseCpSession
Force-closes a CP session and releases its tracked resources. -
transitionOperationalState
Transition this node's view of the cluster operational state. -
handleMessage
Description copied from interface:MessageHandlerHandle an incoming message and optionally return a response.- Specified by:
handleMessagein interfaceMessageHandler- Parameters:
message- the incoming messagesender- the connection context of the sender- Returns:
- response message, or null if no response needed
-
onPeerDisconnect
Called by TcpServer when a peer connection is closed. Cleans up subscriptions and listener registrations for the disconnected peer.- Specified by:
onPeerDisconnectin interfaceMessageHandler- Parameters:
peerId- the peer ID that disconnected
-
parseSeed
Parse a seed address supporting IPv4 (host:port) and bracketed IPv6 ([2001:db8::1]:port) forms. Raw colon-bearing hosts are rejected to avoid misinterpreting IPv6 literals ashost:port. -
reconnectToSeeds
public void reconnectToSeeds()Re-attempt connections to configured seed nodes. Useful for tests that start nodes sequentially — earlier nodes will have failed to connect to later nodes that were not yet running. -
isRunning
public boolean isRunning() -
connectionCount
public int connectionCount()Returns the number of active TCP connections to peer nodes. Useful for test harnesses that need to verify mesh connectivity before starting Raft consensus. -
verifiedClusterConnectionCount
public int verifiedClusterConnectionCount()Returns the number of verified cluster-peer connections. Excludes temporary seed/pending identities that are still converging. -
aliveMemberIds
-