Persistence Design

Persistence makes committed Raft and state-machine data survive crashes, node replacement, and controlled backup/restore operations. It is local to each member and depends on a stable nodeId and durable dataDir.

Components

  • WalWriter, WalReader, and WalCompactor own append-only records, CRC32 checksums, rotation, fsync, and compaction.
  • PersistentRaftLog stores Raft log entries when persistence is enabled.
  • RaftMetadataStore persists term, voted-for, and commit-index metadata.
  • SnapshotManager, SnapshotStore, SnapshotScheduler, DeltaSnapshot, and SnapshotChain manage snapshots.
  • StateMachineSnapshotManager captures registered data-structure state.
  • HotBackupManager and HotBackupScheduler produce group snapshots and manifests under the backup directory.
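
To make the append-only, CRC32-checked record idea concrete, here is a minimal sketch in Python. The frame layout (4-byte length, 4-byte CRC32, payload) and the names `encode_record` and `decode_records` are assumptions for illustration; they are not taken from WalWriter or WalReader.

```python
import struct
import zlib

# Hypothetical frame layout: big-endian payload length, CRC32 of the payload,
# then the payload bytes. The real on-disk format may differ.
HEADER = struct.Struct(">II")


def encode_record(payload: bytes) -> bytes:
    """Frame one payload so a reader can detect truncation and bit rot."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def decode_records(buf: bytes):
    """Yield payloads in order; raise ValueError on the first bad record."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        length, expected_crc = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        payload = buf[start:start + length]
        # A short read or a checksum mismatch both mean the record is unusable.
        if len(payload) != length or zlib.crc32(payload) != expected_crc:
            raise ValueError(f"corrupt record at offset {offset}")
        yield payload
        offset = start + length
    if offset != len(buf):
        # Leftover bytes too short to hold a header: a torn final write.
        raise ValueError(f"truncated header at offset {offset}")
```

This sketch raises on any damaged record. Whether a torn tail is truncated or rejected is a policy decision that it does not try to model.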

Write path

  1. The Raft leader appends a command to its log.
  2. Durable log and metadata updates use the configured fsync mode.
  3. The committed command applies to the in-memory state machine.
  4. WAL compaction waits until snapshots cover the compacted range.
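
Two of the steps above can be sketched in a few lines: an append that honors an fsync mode, and a compaction bound that never passes the snapshot index. The mode names `ALWAYS` and `NEVER`, the segment file name, and both function names are hypothetical; the source only says that an fsync mode is configured.

```python
import os
import tempfile
from enum import Enum


class FsyncMode(Enum):
    # Hypothetical mode names; the configured modes are not listed in the source.
    ALWAYS = "always"
    NEVER = "never"


def append_durably(path: str, record: bytes, mode: FsyncMode) -> None:
    """Append one record and, if the mode asks for it, force it to disk."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()  # move bytes from the Python buffer to the OS
        if mode is FsyncMode.ALWAYS:
            os.fsync(f.fileno())  # ask the OS to persist them


def compactable_segments(segments, snapshot_index):
    """Return the (first_index, last_index) segments fully covered by a snapshot."""
    return [seg for seg in segments if seg[1] <= snapshot_index]


# Usage: two appends to a throwaway segment file, then read it back.
with tempfile.TemporaryDirectory() as data_dir:
    segment = os.path.join(data_dir, "segment-000001.wal")
    append_durably(segment, b"entry-1\n", FsyncMode.ALWAYS)
    append_durably(segment, b"entry-2\n", FsyncMode.NEVER)
    with open(segment, "rb") as f:
        written = f.read()
```

A segment that straddles the snapshot index is kept whole in this sketch, which matches the rule that compaction waits until snapshots cover the compacted range.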

Startup recovery

  1. Validate metadata, WAL segments, sidecar checksums, and snapshot metadata.
  2. Load the newest valid snapshot chain.
  3. Replay records newer than the snapshot index.
  4. Reject startup when the selected recovery policy forbids the local data shape.
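
The load-then-replay part of this sequence can be modeled in memory as follows. This is a simplified sketch: it treats each snapshot as a full state rather than a delta chain, assumes validation has already produced a `valid` flag, and uses a key/value map as a stand-in state machine. `Snapshot` and `recover` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    last_index: int  # last log index whose effect the snapshot includes
    state: dict      # stand-in for the captured state-machine data
    valid: bool      # outcome of checksum and metadata validation


def recover(snapshots, wal_entries):
    """Rebuild state from the newest valid snapshot plus newer WAL entries.

    wal_entries is a list of (index, key, value) tuples that already passed
    checksum validation. Returns (state, last_applied_index).
    """
    valid = [s for s in snapshots if s.valid]
    base = max(valid, key=lambda s: s.last_index, default=None)
    state = dict(base.state) if base else {}
    applied = base.last_index if base else 0

    for index, key, value in sorted(wal_entries, key=lambda e: e[0]):
        if index <= applied:
            continue  # already reflected in the snapshot
        if index != applied + 1:
            # A hole in the log means replay would skip committed commands.
            raise RuntimeError(f"gap in WAL: expected {applied + 1}, found {index}")
        state[key] = value
        applied = index
    return state, applied
```

Refusing to replay across a gap is one way to read step 4. The actual decision belongs to the selected recovery policy.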

Invariants

  • CRC and sidecar checksum validation run before replay.
  • Snapshot install cannot silently skip registered data structures.
  • WAL compaction cannot remove records not covered by a durable snapshot.
  • Quorum-loss restore must be explicit.
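
As an illustration of the second invariant, the sketch below refuses to install a snapshot that lacks a section for any registered data structure, and it checks this before restoring anything. The registry shape (name mapped to a restore callable) and the function name `install_snapshot` are assumptions, not the StateMachineSnapshotManager API.

```python
def install_snapshot(registry, sections):
    """Restore every registered structure or fail without touching any.

    registry maps a structure name to a restore callable taking bytes.
    sections maps a structure name to that structure's snapshot bytes.
    """
    missing = sorted(set(registry) - set(sections))
    if missing:
        # Fail before the first restore so the install is all-or-nothing.
        raise ValueError(f"snapshot has no data for: {', '.join(missing)}")
    for name in sorted(registry):
        registry[name](sections[name])


# Usage with two stand-in structures that record what they were given.
restored = {}
registry = {
    "orders-map": lambda data: restored.__setitem__("orders-map", data),
    "jobs-queue": lambda data: restored.__setitem__("jobs-queue", data),
}
install_snapshot(registry, {"orders-map": b"m", "jobs-queue": b"q"})
```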

A single failed machine can be replaced from its durable dataDir or documented backup path if quorum survived. Corrupt local files fail closed unless the selected recovery policy allows partial local recovery. Hot Backup is point-in-time; it does not replace per-node WAL durability.
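
The fail-closed rule can be pictured as a filter over validation results. In this sketch the policy names `STRICT` and `ALLOW_PARTIAL`, the `CorruptDataError` type, and `select_replayable` are all hypothetical. Under the permissive policy it keeps only the valid prefix and never skips over a corrupt record.

```python
from enum import Enum


class RecoveryPolicy(Enum):
    # Hypothetical policy names, chosen for illustration only.
    STRICT = "strict"                # any corruption aborts startup
    ALLOW_PARTIAL = "allow-partial"  # keep the valid prefix, drop the rest


class CorruptDataError(Exception):
    """Raised when local data fails validation and the policy forbids recovery."""


def select_replayable(checks, policy):
    """Return the log indices that are safe to replay.

    checks is a list of (index, is_valid) pairs in log order.
    """
    replayable = []
    for index, is_valid in checks:
        if is_valid:
            replayable.append(index)
            continue
        if policy is RecoveryPolicy.STRICT:
            raise CorruptDataError(f"record {index} failed validation")
        break  # partial recovery stops at the first bad record
    return replayable
```

Stopping at the first bad record keeps the replayed log contiguous. The sketch assumes, without confirmation from the source, that a member recovered this way would obtain the dropped suffix from the rest of the group.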

WAL durability, crash recovery, CRC validation, compaction, disk-fault, graceful restart, snapshot store, and Hot Backup tests cover this layer. Operators watch fsync latency, segment age, snapshot duration, validation errors, backup age, and startup recovery logs.