Class ReplicationManager

java.lang.Object
com.loomcache.server.replication.ReplicationManager

public class ReplicationManager extends Object
Synchronous Primary-Backup Replication Manager.

When a write operation arrives at the primary owner of a key: 1. Apply the write locally 2. Send REPL_SYNC to the backup node (next clockwise on the hash ring) 3. Wait for REPL_ACK from backup (with timeout) 4. Only then acknowledge the client

This guarantees that committed writes survive a single node failure, because both the primary and backup have the data before the client gets a response.

If the backup is unreachable, the write still succeeds (availability over strict consistency for single-replica failure). The inconsistency is logged and can trigger a re-sync when the backup recovers.

Threading: REPL_SYNC is sent on the caller's virtual thread and blocks until REPL_ACK arrives or the timeout expires.

  • Constructor Details

    • ReplicationManager

      public ReplicationManager(String nodeId, int instanceNumber, @Nullable TcpServer tcpServer, ConsistentHashRing hashRing)
      Initialize the ReplicationManager.
      Parameters:
      nodeId - The node identifier (must not be null)
      instanceNumber - The instance number for logging
      tcpServer - The TCP server for replication messages (must not be null)
      hashRing - The hash ring for key partitioning (must not be null)
  • Method Details

    • shutdown

      public void shutdown()
      Shut down the background cleanup executor. Should be called when the manager is no longer needed (e.g., during node shutdown).
    • replicateSync

      public boolean replicateSync(String key, String mapName, String operation, byte[] keyBytes, byte @Nullable [] valueBytes)
      Replicate a write operation to the backup node synchronously.
      Parameters:
      key - the cache key being written (must not be null)
      mapName - the name of the distributed map (must not be null)
      operation - the operation type (PUT, DELETE, etc., must not be null)
      keyBytes - serialized key (must not be null)
      valueBytes - serialized value (null for DELETE)
      Returns:
      true if backup acknowledged, false if backup unreachable/timeout
    • parseReplicationData

      public @Nullable ReplicationManager.ReplicationData parseReplicationData(Message msg)
      Handle an incoming REPL_SYNC message (backup side). Parse the operation and return a callback-ready ReplicationData.
      Parameters:
      msg - the replication message (must not be null)

      Idempotency: if the correlationId was already processed (e.g. primary retried after a lost ACK), this method returns null so the caller skips the duplicate write but can still send an ACK.

      Returns:
      ReplicationData if parsed successfully and not a duplicate, null otherwise
    • buildAckResponse

      public Message buildAckResponse(int correlationId, boolean success)
      Build a REPL_ACK response message.
    • handleAck

      public void handleAck(Message msg)
      Handle an incoming REPL_ACK message (primary side). Completes the pending future.
    • isPrimaryFor

      public boolean isPrimaryFor(String key)
      Check if this node is the primary owner of a key.
    • isBackupFor

      public boolean isBackupFor(String key)
      Check if this node is the backup owner of a key.
    • getPrimaryFor

      public @Nullable String getPrimaryFor(String key)
      Get the primary owner of a key.