# Raft Clustering
LoomCache is fundamentally a distributed system. A single node dying should never result in lost data. To govern this distributed state, LoomCache implements the Raft Consensus Algorithm — the same protocol used by etcd, CockroachDB, and TiKV.
## Raft Consensus Flow

*(Interactive diagram of the Raft consensus flow omitted.)*
## The Leader Role

At any given time, one node in the LoomCache cluster is the Leader. The Leader is responsible for receiving write requests (e.g., PUT, DELETE). It writes each request to its local log and immediately broadcasts an AppendEntries RPC to the Followers.
Clients automatically discover the leader via LeaderTracker. If a write hits a follower, the client receives a RESPONSE_REDIRECT and transparently reroutes to the current leader.
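The redirect loop can be sketched as follows. `RedirectClient`, `Response`, and the transport function are hypothetical stand-ins for LoomCache's client internals, shown only to illustrate the RESPONSE_REDIRECT flow:

```java
// Hypothetical sketch of follower-redirect handling. The transport function
// stands in for the real RPC layer: (nodeAddress, command) -> Response.
import java.util.function.BiFunction;

class RedirectClient {
    record Response(boolean redirect, String leaderAddress, String value) {}

    private String leader;
    private final BiFunction<String, String, Response> transport;

    RedirectClient(String initialLeader, BiFunction<String, String, Response> transport) {
        this.leader = initialLeader;
        this.transport = transport;
    }

    // Sends a write, following redirects until the current leader answers.
    String write(String command) {
        for (int attempt = 0; attempt < 3; attempt++) {
            Response r = transport.apply(leader, command);
            if (!r.redirect()) {
                return r.value();
            }
            leader = r.leaderAddress(); // LeaderTracker-style update, then retry
        }
        throw new IllegalStateException("leader not found after 3 redirects");
    }
}
```

The retry bound is arbitrary here; the point is that redirect handling is a loop over a cached leader address, not a broadcast.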
## Log Replication & Quorum

A write is not considered “committed” until the Leader receives acknowledging responses from a strict majority of the cluster (the Quorum).
> [!CAUTION]
> If a 5-node cluster suffers a 2-node failure, the remaining 3 nodes maintain quorum and continue serving writes. If 3 nodes fail, the remaining 2 lose quorum and will block writes to prevent “Split-Brain” data corruption.
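The arithmetic behind these scenarios is simple majority math; this helper class is illustrative, not part of the LoomCache API:

```java
// Majority arithmetic for Raft quorums; illustrative helper only.
class Quorum {
    // Strict majority of a cluster of the given size.
    static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    // Whether the surviving nodes can still commit writes.
    static boolean hasQuorum(int clusterSize, int aliveNodes) {
        return aliveNodes >= majority(clusterSize);
    }
}
```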
## Single Raft Group

LoomCache uses a single Raft group where every node holds all data — the same model as etcd and ZooKeeper. All 7 data structures (Map, Queue, Set, SortedSet, Topic, Lock, Counter) replicate through the same Raft log. Three-node clusters are the recommended minimum (quorum = 2).
## Leader Election

LoomCache hardens standard Raft with several safety mechanisms:
- Pre-vote phase: Before starting a real election, candidates run a pre-vote round. This prevents partitioned nodes from incrementing the global term number and disrupting stable clusters.
- Randomized timeouts: Election timeouts are randomized between 300–600ms (configurable) to prevent simultaneous elections across the cluster.
- Leader lease: The leader maintains a lease based on recent heartbeat acknowledgments from a majority. This enables fast linearizable reads without quorum round-trips on every GET.
- TimeoutNow RPC: Explicit leadership transfer for graceful shutdowns and rolling upgrades.
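The randomized-timeout mechanism above can be sketched like this; the 300–600 ms bounds match the documented defaults, while the class itself is hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of a randomized election timer; the 300-600 ms bounds
// match the documented defaults, but the class itself is hypothetical.
class ElectionTimer {
    private final long minMs;
    private final long maxMs;

    ElectionTimer(long minMs, long maxMs) {
        this.minMs = minMs;
        this.maxMs = maxMs;
    }

    // Picks a fresh timeout uniformly in [minMs, maxMs) so that followers
    // rarely time out simultaneously and split the vote.
    long nextTimeoutMs() {
        return ThreadLocalRandom.current().nextLong(minMs, maxMs);
    }
}
```

Because each follower draws a fresh value per election cycle, one node almost always times out first, becomes a candidate, and wins before its peers start competing elections.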
## Timing Configuration

| Parameter | Default | Purpose |
|---|---|---|
| Heartbeat interval | 100ms | Leader → Follower keepalive |
| Election timeout (min) | 300ms | Minimum wait before election |
| Election timeout (max) | 600ms | Maximum wait (randomized) |
| Replication interval | 50ms | Log replication check frequency |
| Max entries per append | 100 | Batch size for AppendEntries RPC |
## Write-Ahead Log (WAL)

Every committed entry is persisted to disk before acknowledgment:

- `WalWriter` appends the entry: 4-byte length + 8-byte term + 8-byte index + command + 4-byte CRC32
- `fsync()` is enforced after each commit (configurable)
- Snapshots are created every 10,000 entries (Kryo-serialized state of all data structures)
- On restart: load snapshot → replay WAL → rejoin cluster
> [!NOTE]
> Place the WAL on SSD/NVMe storage for optimal write latency. `fsync` typically adds 1–10ms on SSD.
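The documented entry framing could be encoded as below. This `WalEntryCodec` is an illustrative stand-in, not LoomCache's `WalWriter`, and the assumption that the trailing CRC32 covers the length, term, and index fields as well as the command bytes is ours:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative codec for the documented WAL entry layout:
// 4-byte length + 8-byte term + 8-byte index + command + 4-byte CRC32.
// Assumption: the CRC covers the header fields plus the command bytes.
class WalEntryCodec {
    static byte[] encode(long term, long index, byte[] command) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 8 + command.length + 4);
        buf.putInt(command.length);
        buf.putLong(term);
        buf.putLong(index);
        buf.put(command);
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 0, buf.position()); // checksum header + command
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    // Verifies the trailing CRC32 on read-back (e.g. during restart replay).
    static boolean verify(byte[] frame) {
        CRC32 crc = new CRC32();
        crc.update(frame, 0, frame.length - 4);
        int stored = ByteBuffer.wrap(frame, frame.length - 4, 4).getInt();
        return stored == (int) crc.getValue();
    }
}
```

On restart, any entry whose checksum fails marks the point where replay stops and the tail of the log is truncated.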
## Linearizability

LoomCache provides strong consistency (linearizability): once a write is acknowledged to a client, all subsequent reads across the cluster — regardless of leader elections or network partitions — are guaranteed to reflect that write.
Read consistency options:
- Linearizable: Route to leader with valid lease — fast, no quorum round-trip
- Eventual: Route to any follower — stale by ≤100ms, enables read scaling
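A minimal sketch of routing a GET by consistency level; `ReadRouter` and its string-address inputs are hypothetical, not LoomCache's API:

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of read routing by consistency level.
class ReadRouter {
    enum Consistency { LINEARIZABLE, EVENTUAL }

    // Picks the node that should serve a read at the requested level.
    static String route(Consistency level, String leader, List<String> followers) {
        if (level == Consistency.LINEARIZABLE) {
            // Leader with a valid lease answers without a quorum round-trip.
            return leader;
        }
        // EVENTUAL: any follower may answer, stale by up to one heartbeat.
        return followers.get(ThreadLocalRandom.current().nextInt(followers.size()));
    }
}
```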
We actively enforce and test linearizability using custom chaos testing harnesses in our CI pipelines, including network partition injection, leader kills, and concurrent write verification across 11,600+ test methods.
## Dynamic Membership

LoomCache supports live cluster scaling via Raft configuration changes:
- AddServer: New node joins, receives snapshot + WAL catch-up, then participates in quorum
- RemoveServer: Node is removed from voting membership; partitions are migrated first
- No downtime required — the cluster continues serving reads and writes during membership changes
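A toy model of how the voting set, and therefore the quorum size, changes under AddServer/RemoveServer; snapshot catch-up and joint-consensus details are elided, and the class is illustrative only:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of voting-membership changes; catch-up and joint-consensus
// phases are elided, and the class is illustrative only.
class Membership {
    private final Set<String> voters = new HashSet<>();

    Membership(Set<String> initial) {
        voters.addAll(initial);
    }

    // AddServer: node becomes a voter once snapshot + WAL catch-up complete.
    void addServer(String node) {
        voters.add(node); // catch-up phase elided in this sketch
    }

    // RemoveServer: node leaves the voting set.
    void removeServer(String node) {
        voters.remove(node);
    }

    // Quorum size tracks the current voting membership.
    int quorum() {
        return voters.size() / 2 + 1;
    }
}
```

Note how growing from 3 to 5 voters raises the quorum from 2 to 3, which is why membership changes must go through the Raft log rather than being applied independently on each node.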