
Raft Clustering

LoomCache is fundamentally a distributed system. A single node dying should never result in lost data. To govern this distributed state, LoomCache implements the Raft Consensus Algorithm — the same protocol used by etcd, CockroachDB, and TiKV.

Raft Consensus Flow

[Interactive diagram: a Client sends a write to the Leader Node, which appends it to its WAL (with the current term) and replicates to Follower 1 and Follower 2.]

At any given time, exactly one node in the LoomCache cluster is the Leader. The Leader is responsible for receiving write requests (e.g., PUT, DELETE). It appends each request to its local log and immediately broadcasts an AppendEntries RPC to the Followers.
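On the Follower side, Raft's log-matching check decides whether an AppendEntries batch can be accepted. A minimal sketch of that check, modeling the log as an array of per-index terms (class and method names are illustrative, not LoomCache types):

```java
class FollowerLog {
    // A follower accepts an AppendEntries batch only if its log contains an
    // entry at prevLogIndex whose term equals prevLogTerm (the consistency
    // check from the Raft paper). termsByIndex[i] is the term of entry i.
    static boolean accepts(long[] termsByIndex, long prevLogIndex, long prevLogTerm) {
        if (prevLogIndex < 0) return true;                     // batch starts at the log head
        if (prevLogIndex >= termsByIndex.length) return false; // gap: follower is behind
        return termsByIndex[(int) prevLogIndex] == prevLogTerm;
    }
}
```

When the check fails, the leader retries with an earlier `prevLogIndex` until the logs agree, then overwrites the follower's divergent suffix.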

Clients automatically discover the leader via LeaderTracker. If a write hits a follower, the client receives a RESPONSE_REDIRECT and transparently reroutes to the current leader.
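The redirect handling can be pictured as a small hint-following loop. This is a hypothetical simplification of what LeaderTracker does: each node reports who it believes the leader is, and the client hops until a node claims leadership or the hop budget runs out.

```java
class LeaderResolver {
    // leaderHint[i] is node i's current belief about the leader's id.
    // Follows at most maxHops redirects; returns -1 if hints never converge
    // (e.g., mid-election), in which case a real client would retry later.
    static int resolveLeader(int[] leaderHint, int start, int maxHops) {
        int node = start;
        for (int hop = 0; hop <= maxHops; hop++) {
            if (leaderHint[node] == node) return node; // node says: "I am the leader"
            node = leaderHint[node];                   // follow the redirect
        }
        return -1;
    }
}
```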

A write is not considered “committed” until the Leader receives acknowledging responses from a strict majority of the cluster (the Quorum).

[!CAUTION] If a 5-node cluster suffers a 2-node failure, the remaining 3 nodes maintain quorum and continue serving writes. If 3 nodes fail, the remaining 2 lose quorum and will block writes to prevent “Split-Brain” data corruption.
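The arithmetic behind both examples is simply "strict majority": quorum for an n-node cluster is ⌊n/2⌋ + 1. A sketch:

```java
class Quorum {
    // Strict majority of the full voting membership.
    static int size(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    // Writes proceed only while a majority of voters is reachable.
    static boolean hasQuorum(int clusterSize, int aliveNodes) {
        return aliveNodes >= size(clusterSize);
    }
}
```

Note that quorum is computed against the configured cluster size, not the number of live nodes; this is what prevents two minority partitions from each electing a leader.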

LoomCache uses a single Raft group where every node holds all data — the same model as etcd and ZooKeeper. All 7 data structures (Map, Queue, Set, SortedSet, Topic, Lock, Counter) replicate through the same Raft log. Three-node clusters are the recommended minimum (quorum = 2).

LoomCache hardens standard Raft with several safety mechanisms:

  • Pre-vote phase: Before starting a real election, candidates run a pre-vote round. This prevents partitioned nodes from incrementing the global term number and disrupting stable clusters.
  • Randomized timeouts: Election timeouts are randomized between 300–600ms (configurable) to prevent simultaneous elections across the cluster.
  • Leader lease: The leader maintains a lease based on recent heartbeat acknowledgments from a majority. This enables fast linearizable reads without quorum round-trips on every GET.
  • TimeoutNow RPC: Explicit leadership transfer for graceful shutdowns and rolling upgrades.
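The randomized-timeout mechanism above can be sketched in a few lines; the 300–600ms bounds are the documented defaults, and the class name is illustrative:

```java
import java.util.concurrent.ThreadLocalRandom;

class ElectionTimer {
    // Draws a fresh timeout uniformly in [minMs, maxMs) each time a follower
    // arms its election timer, so followers rarely time out simultaneously
    // and split the vote.
    static long nextTimeoutMs(long minMs, long maxMs) {
        return ThreadLocalRandom.current().nextLong(minMs, maxMs);
    }
}
```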
| Parameter | Default | Purpose |
| --- | --- | --- |
| Heartbeat interval | 100ms | Leader → Follower keepalive |
| Election timeout (min) | 300ms | Minimum wait before election |
| Election timeout (max) | 600ms | Maximum wait (randomized) |
| Replication interval | 50ms | Log replication check frequency |
| Max entries per append | 100 | Batch size for AppendEntries RPC |

Every committed entry is persisted to disk before acknowledgment:

  1. WalWriter appends the entry: 4-byte length + 8-byte term + 8-byte index + command + 4-byte CRC32
  2. fsync() is enforced after each commit (configurable)
  3. Snapshots are created every 10,000 entries (Kryo-serialized state of all data structures)
  4. On restart: load snapshot → replay WAL → rejoin cluster
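The record layout from step 1 can be sketched directly. Two details here are assumptions rather than confirmed LoomCache internals: big-endian field order, and that the CRC32 covers term + index + command (but not the length prefix).

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

class WalRecord {
    // Encodes one WAL entry per the documented layout:
    // 4-byte length + 8-byte term + 8-byte index + command + 4-byte CRC32.
    static byte[] encode(long term, long index, byte[] command) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 8 + command.length + 4);
        buf.putInt(command.length);
        buf.putLong(term);
        buf.putLong(index);
        buf.put(command);
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 4, 8 + 8 + command.length); // skip length prefix
        buf.putInt((int) crc.getValue());
        return buf.array();
    }
}
```

On replay, the reader recomputes the CRC over the same span and discards any trailing record that fails the check (a torn write from a crash mid-append).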

[!NOTE] Place WAL on SSD/NVMe storage for optimal write latency. fsync typically adds 1–10ms on SSD.

LoomCache provides strong consistency (linearizability). This means that once a write is acknowledged to a client, all subsequent reads across the cluster — regardless of leader elections or network partitions — are guaranteed to reflect that write.

Read consistency options:

  • Linearizable: Route to leader with valid lease — fast, no quorum round-trip
  • Eventual: Route to any follower — stale by ≤100ms, enables read scaling
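Choosing between the two read paths might look like the sketch below. The fallback when the leader's lease has expired (refusing the fast path so the caller can do a full quorum read) is an assumption about LoomCache's behavior, not documented API.

```java
enum ReadConsistency { LINEARIZABLE, EVENTUAL }

class ReadRouter {
    // Returns the node id to read from, or -1 when a linearizable read
    // cannot use the lease fast path and must fall back to a quorum check.
    static int route(ReadConsistency level, int leaderId,
                     boolean leaseValid, int anyFollowerId) {
        if (level == ReadConsistency.LINEARIZABLE) {
            return leaseValid ? leaderId : -1;
        }
        return anyFollowerId; // eventual: any replica, stale by at most one heartbeat
    }
}
```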

We actively enforce and test linearizability using custom chaos testing harnesses in our CI pipelines, including network partition injection, leader kills, and concurrent write verification across 11,600+ test methods.

LoomCache supports live cluster scaling via Raft configuration changes:

  • AddServer: New node joins, receives snapshot + WAL catch-up, then participates in quorum
  • RemoveServer: Node is removed from voting membership; partitions are migrated first
  • No downtime required — the cluster continues serving reads and writes during membership changes
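Raft keeps these changes safe by applying one membership change at a time, so any old-configuration quorum and new-configuration quorum must overlap in at least one node. A sketch of that rule as pure functions on the voter set (names illustrative):

```java
import java.util.ArrayList;
import java.util.List;

class Membership {
    // Single-server change: add exactly one voter per configuration entry.
    static List<String> addServer(List<String> voters, String node) {
        List<String> next = new ArrayList<>(voters);
        if (!next.contains(node)) next.add(node);
        return next;
    }

    // Single-server change: remove exactly one voter per configuration entry.
    static List<String> removeServer(List<String> voters, String node) {
        List<String> next = new ArrayList<>(voters);
        next.remove(node);
        return next;
    }
}
```

Because each step changes the voter set by one, quorums of consecutive configurations always intersect, which is what rules out two disjoint majorities electing rival leaders mid-change.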