Class ReplicationManager
When a write operation arrives at the primary owner of a key: 1. Apply the write locally 2. Send REPL_SYNC to the backup node (next clockwise on the hash ring) 3. Wait for REPL_ACK from backup (with timeout) 4. Only then acknowledge the client
This guarantees that committed writes survive a single node failure, because both the primary and backup have the data before the client gets a response.
If the backup is unreachable, the write still succeeds (availability over strict consistency for single-replica failure). The inconsistency is logged and can trigger a re-sync when the backup recovers.
Threading: REPL_SYNC is sent on the caller's virtual thread and blocks until REPL_ACK arrives or the timeout expires.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic final recordHolds parsed replication data from a REPL_SYNC message. -
Constructor Summary
ConstructorsConstructorDescriptionReplicationManager(String nodeId, int instanceNumber, @Nullable TcpServer tcpServer, ConsistentHashRing hashRing) Initialize the ReplicationManager. -
Method Summary
Modifier and TypeMethodDescriptionbuildAckResponse(int correlationId, boolean success) Build a REPL_ACK response message.@Nullable StringgetPrimaryFor(String key) Get the primary owner of a key.voidHandle an incoming REPL_ACK message (primary side).booleanisBackupFor(String key) Check if this node is the backup owner of a key.booleanisPrimaryFor(String key) Check if this node is the primary owner of a key.@Nullable ReplicationManager.ReplicationDataHandle an incoming REPL_SYNC message (backup side).booleanreplicateSync(String key, String mapName, String operation, byte[] keyBytes, byte @Nullable [] valueBytes) Replicate a write operation to the backup node synchronously.voidshutdown()Shut down the background cleanup executor.
-
Constructor Details
-
ReplicationManager
public ReplicationManager(String nodeId, int instanceNumber, @Nullable TcpServer tcpServer, ConsistentHashRing hashRing) Initialize the ReplicationManager.- Parameters:
nodeId- The node identifier (must not be null)instanceNumber- The instance number for loggingtcpServer- The TCP server for replication messages (must not be null)hashRing- The hash ring for key partitioning (must not be null)
-
-
Method Details
-
shutdown
public void shutdown()Shut down the background cleanup executor. Should be called when the manager is no longer needed (e.g., during node shutdown). -
replicateSync
public boolean replicateSync(String key, String mapName, String operation, byte[] keyBytes, byte @Nullable [] valueBytes) Replicate a write operation to the backup node synchronously.- Parameters:
key- the cache key being written (must not be null)mapName- the name of the distributed map (must not be null)operation- the operation type (PUT, DELETE, etc., must not be null)keyBytes- serialized key (must not be null)valueBytes- serialized value (null for DELETE)- Returns:
- true if backup acknowledged, false if backup unreachable/timeout
-
parseReplicationData
Handle an incoming REPL_SYNC message (backup side). Parse the operation and return a callback-ready ReplicationData.- Parameters:
msg- the replication message (must not be null)Idempotency: if the correlationId was already processed (e.g. primary retried after a lost ACK), this method returns
nullso the caller skips the duplicate write but can still send an ACK.- Returns:
- ReplicationData if parsed successfully and not a duplicate, null otherwise
-
buildAckResponse
Build a REPL_ACK response message. -
handleAck
Handle an incoming REPL_ACK message (primary side). Completes the pending future. -
isPrimaryFor
Check if this node is the primary owner of a key. -
isBackupFor
Check if this node is the backup owner of a key. -
getPrimaryFor
-