Raft Design
Raft is the safety boundary for LoomCache writes. Mutating commands are appended to a group log, replicated to a majority, committed, and applied before the client receives success.
Components
- RaftNode owns roles, elections, AppendEntries, commit-index advancement, snapshot install, and membership changes.
- RaftLog keeps the in-memory index and delegates durable storage to PersistentRaftLog.
- LeaderLease lets a leader serve linearizable reads after lease validation.
- RaftGroupManager owns one or more RaftNode instances.
- RaftInvariantChecker, RaftHealthCheck, RaftMetrics, ElectionStats, ReplicationStats, and LogStats provide assertions and observability.
Write path
- CacheNode resolves the target Raft group.
- The leader wraps the message in a LogEntry.
- The entry is appended locally and replicated to followers.
- The leader advances commitIndex after majority acknowledgement.
- The applier mutates the state machine and releases the client response.
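The commit step can be illustrated with a minimal sketch. It is not LoomCache's code: the function name, the `match_index` map, and the tuple-free `LogEntry` shape are assumptions made for illustration. It follows the standard Raft rule that a leader counts replicas only for entries from its own term.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    term: int
    command: str


def advance_commit_index(log, match_index, current_term, commit_index):
    """Return the new commit index for a leader (hypothetical helper).

    log          -- list of LogEntry; Raft index 1 is stored at log[0]
    match_index  -- voter id -> highest index known to be replicated there
                    (the leader's own entry is included)
    """
    majority = len(match_index) // 2 + 1
    # After sorting in descending order, the value at position majority-1
    # is replicated on at least `majority` voters.
    candidate = sorted(match_index.values(), reverse=True)[majority - 1]
    # Only entries from the leader's current term are committed by counting
    # replicas; earlier entries become committed indirectly.
    if candidate > commit_index and log[candidate - 1].term == current_term:
        return candidate
    return commit_index
```

Because terms never decrease along the log, checking only the majority-replicated candidate is enough: if that entry is from an older term, every lower index is too.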
Read path
Linearizable reads route to the leader. The leader captures the current commit index, validates its lease, waits for local apply to catch up, and serves the read without another quorum round trip.
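The four read steps can be sketched as follows. This is an illustrative model, not the LeaderLease API: the class, its fields, and the `LeaseExpired` exception are hypothetical, the clock is injected so the lease check is testable, and the applier runs inline rather than on its own thread.

```python
import time


class LeaseExpired(Exception):
    """Raised when the leader can no longer prove its lease (hypothetical)."""


class LeaderReadPath:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.lease_expiry = 0.0   # monotonic deadline, renewed elsewhere
        self.commit_index = 0
        self.last_applied = 0
        self.pending = {}         # index -> (key, value), committed but unapplied
        self.state = {}           # the state machine

    def apply_next(self):
        # Apply exactly one committed entry to the state machine.
        self.last_applied += 1
        key, value = self.pending.pop(self.last_applied)
        self.state[key] = value

    def read(self, key):
        read_index = self.commit_index           # 1. capture the commit index
        if self.clock() >= self.lease_expiry:    # 2. validate the lease
            raise LeaseExpired("lease expired; the read must not be served")
        while self.last_applied < read_index:    # 3. wait for apply to catch up
            self.apply_next()
        return self.state.get(key)               # 4. serve from local state
```

A monotonic clock is used here because wall-clock adjustments could otherwise make an expired lease appear valid.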
Invariants
- Client success requires a committed log entry.
- Followers cannot acknowledge conflicting entries at the same term/index.
- Leaders cannot serve linearizable reads after lease expiry.
- Membership changes go through joint-consensus entries.
- Snapshot install preserves term, vote, commit index, and state-machine data.
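The follower-side invariant is usually enforced by the AppendEntries consistency check. The sketch below shows the standard Raft rule under simplified assumptions: the log is a plain list of `(term, command)` tuples and the function name and signature are hypothetical, not RaftNode's.

```python
def append_entries(log, prev_index, prev_term, entries):
    """Apply the AppendEntries consistency check to a follower log.

    log holds (term, command) tuples; Raft index 1 is stored at log[0].
    Returns True if the entries were accepted, False if the follower must
    reject so the leader retries with an earlier prev_index.
    """
    # Reject when the follower has no matching entry at prev_index.
    if prev_index > len(log):
        return False
    if prev_index > 0 and log[prev_index - 1][0] != prev_term:
        return False
    for offset, entry in enumerate(entries):
        index = prev_index + 1 + offset
        if index <= len(log):
            if log[index - 1][0] == entry[0]:
                continue          # same index and term: already stored
            del log[index - 1:]   # conflict: drop this entry and all after it
        log.append(entry)
    return True
```

A follower therefore acknowledges only after its log agrees with the leader's up to the last new entry, so two different entries can never both be acknowledged at one term and index.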
Failure behavior
Minority partitions cannot elect a leader or commit writes. A failed leader stalls writes until a majority elects a replacement. Lagging followers catch up through AppendEntries or snapshot install. Corrupt durable state fails closed during startup validation.
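Two of these behaviors reduce to small checks, sketched here with hypothetical helper names. The quorum test is the usual strict-majority rule. The catch-up rule assumes the common Raft convention that a leader falls back to a snapshot once it has compacted away the entries a follower still needs; the source does not say how LoomCache makes that choice.

```python
def has_quorum(reachable, voters):
    """True when the reachable nodes include a strict majority of voters."""
    voters = set(voters)
    return len(voters & set(reachable)) > len(voters) // 2


def catch_up_action(next_index, first_retained_index):
    """Pick how a leader brings a lagging follower up to date.

    next_index           -- next log index the follower needs
    first_retained_index -- oldest index still present in the leader's log
    """
    if next_index < first_retained_index:
        return "install_snapshot"   # needed entries were compacted away
    return "append_entries"


voters = ["a", "b", "c", "d", "e"]
minority_can_elect = has_quorum(["a", "b"], voters)        # False
majority_can_elect = has_quorum(["c", "d", "e"], voters)   # True
```

With an even voter count, a symmetric split leaves neither side with a quorum, which is why both halves stall rather than diverge.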
Verification
Raft safety, replication, snapshot, log consistency, linearizability, split-brain, network partition, and failover suites exercise this layer. Operators watch leader count, term churn, commit latency, append latency, replication lag, log size, and snapshot-install signals.