Class WalWriter
- All Implemented Interfaces:
AutoCloseable
Format per entry: [length(4 bytes)][term(8)][index(8)][type(1)][payload(variable)][checksum(4 CRC32)] New entries set the sign bit on the stored length to mark the explicit-type layout; legacy entries keep a positive length and are decoded with the older payload-only format.
Features: - Thread-safe with ReentrantLock for write ordering - FileChannel for sequential writes - Batch write support via appendBatch() - Configurable batch coalescing (size/timeout) - Comprehensive statistics tracking - File rotation when maxFileSize exceeded - Truncation support for compaction - SLF4J logging at DEBUG/INFO/WARN levels
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionvoidAppend a log entry to the Write-Ahead Log.voidappendBatch(List<LogEntry> entries) Append multiple log entries in a single batch operation.voidForce an fsync and record the current checkpoint position.voidClear the WAL error state (typically after manual intervention/recovery).voidclose()Close the WAL writer and release resources.voidDisable GZip compression for WAL entries.voidDisable fsync batching.voidEnable GZip compression for WAL entries.voidenableFsyncBatching(long batchWindowMs, int batchThreshold) Enable fsync batching with configurable batch window and threshold.longEstimate the current WAL file size in bytes.voidFlush all pending entries in the fsync batch immediately.Get a list of all recorded checkpoint positions.longGet the number of entries written to this WAL.@Nullable StringGet the last WAL error message.longGet the timestamp when the last WAL error occurred (System.nanoTime()).getStats()Get current statistics tracked by this WAL writer.longGet the current size of the WAL file in bytes.Get current writer statistics including compression metrics.booleanCheck whether the disk is in a healthy state for writes.booleanCheck if the WAL is in an error state.voidsetDiskSpaceHardLimitMb(long limitMb) Set the disk space hard limit.voidsetDiskSpaceWarningThresholdMb(long thresholdMb) Set the disk space warning threshold.voidSet the durability guarantee level for WAL operations.voidsetMaxFileSizeBytes(long maxSizeBytes) Set the maximum WAL file size for rotation.voidsetSyncOnCommit(boolean enabled) Set whether to sync on commit.voidsync()Force all pending WAL writes to disk via fsync (durable persistence).voidTruncate the WAL file by atomically rewriting with only retained entries.voidtruncateAfterIndex(long afterIndex) Truncate WAL entries after a given index.voidwriteBatchAtomic(List<byte[]> entries) Write multiple entries atomically, all or nothing.protected intwriteBufferFully(FileChannel targetChannel, ByteBuffer buffer)
-
Constructor Details
-
WalWriter
- Throws:
IOException
-
-
Method Details
-
enableFsyncBatching
public void enableFsyncBatching(long batchWindowMs, int batchThreshold) Enable fsync batching with configurable batch window and threshold. When enabled, fsync calls are batched and deferred until either: 1. The batch window expires (default 10ms), or 2. The pending entries count exceeds the threshold (default 100)Note: This should be called before appending entries for best performance.
- Parameters:
batchWindowMs- Time window for batching (milliseconds)batchThreshold- Maximum entries to batch before forcing fsync- Throws:
IllegalStateException- if writer is closed
-
disableFsyncBatching
public void disableFsyncBatching()Disable fsync batching. All subsequent sync() calls will execute immediately. -
flushBatchedFsync
Flush all pending entries in the fsync batch immediately. This should be called during shutdown or when forcing immediate durability.- Throws:
IOException- if fsync fails
-
getWalSizeBytes
public long getWalSizeBytes()Get the current size of the WAL file in bytes.- Returns:
- Size of the WAL file, or 0 if file does not exist
-
isDiskHealthy
public boolean isDiskHealthy()Check whether the disk is in a healthy state for writes.Returns false if disk space has fallen below the hard limit, true otherwise.
- Returns:
- true if disk is healthy, false if diskFull flag is set
-
setDiskSpaceWarningThresholdMb
public void setDiskSpaceWarningThresholdMb(long thresholdMb) Set the disk space warning threshold. When available disk space drops below this value, a CRITICAL warning is logged.Default: 100 MB
- Parameters:
thresholdMb- Warning threshold in megabytes
-
setDiskSpaceHardLimitMb
public void setDiskSpaceHardLimitMb(long limitMb) Set the disk space hard limit. When available disk space drops below this value, writes are refused.Default: 10 MB
- Parameters:
limitMb- Hard limit in megabytes
-
append
Append a log entry to the Write-Ahead Log.Format per entry: [length(4)][term(8)][index(8)][type(1)][payload(variable)][checksum(4)]
Entries are written sequentially via FileChannel. Before writing, disk health is checked to prevent writes when disk space is critically low.
Entries are NOT automatically fsynced unless sync() is called afterward. Thread-safe via ReentrantLock.
- Parameters:
entry- The log entry to append- Throws:
IOException- If write fails or disk health check failsIllegalStateException- If writer is closed, disk is full, or in error stateNullPointerException- If entry is null
-
sync
Force all pending WAL writes to disk via fsync (durable persistence).Behavior depends on durability level: - NONE: Returns immediately without any fsync (no durability) - WRITE: Returns immediately; data is already in the OS page cache from the preceding FileChannel.write(). NO fsync is performed — per the contract in
DurabilityGuarantee.WRITE, data may be lost on kernel/power loss but survives an application-only crash. This is the whole point of the WRITE tier; historically this path incorrectly calledchannel.force(false), making WRITE behave identically to FSYNC and silently erasing its latency advantage. (round-8 fix) - FSYNC: Behavior depends on batching mode (full durability) - If batching ENABLED: Defers fsync until batch window expires (default 10ms) or pending count exceeds threshold (default 100), to amortize fsync cost. - If batching DISABLED: Executes fsync immediately (safe but slower).Called by Raft leader after committing entries to ensure write durability before acknowledging to clients.
Thread-safe via ReentrantLock.
- Throws:
IOException- If fsync failsIllegalStateException- If writer is closed or in error state
-
getStats
Get current statistics tracked by this WAL writer.- Returns:
- WalStats record with current metrics
-
getEntryCount
public long getEntryCount()Get the number of entries written to this WAL.- Returns:
- Total count of entries appended since writer creation
-
estimateFileSize
public long estimateFileSize()Estimate the current WAL file size in bytes.Returns the actual file size if file exists, or accumulated totalBytesWritten if not yet flushed.
- Returns:
- Estimated file size in bytes
-
appendBatch
Append multiple log entries in a single batch operation.All entries are written in a single locked section, improving efficiency for bulk operations. Disk health check is performed once before the batch.
Statistics (totalBytesWritten, totalEntriesWritten) are updated atomically. Entries are NOT automatically fsynced; call sync() afterward if durability is needed.
- Parameters:
entries- The list of entries to append- Throws:
IOException- If write fails or disk health check failsIllegalStateException- If writer is closedNullPointerException- If entries list is null
-
truncateAfterIndex
Truncate WAL entries after a given index.Atomically rewrites the WAL file to contain only entries with index invalid input: '<'= afterIndex. This operation is useful during log compaction to discard old entries.
Process: 1. Close current WAL file 2. Create temporary file with retained entries 3. Atomically move temp file over original 4. Reopen file for continued writes
- Parameters:
afterIndex- Truncate entries with index > this value- Throws:
IOException- If truncation failsIllegalStateException- If writer is closed
-
setMaxFileSizeBytes
public void setMaxFileSizeBytes(long maxSizeBytes) Set the maximum WAL file size for rotation.When file size exceeds this threshold, a new rotated file is created. Set to 0 to disable rotation (unlimited file size).
- Parameters:
maxSizeBytes- Maximum size in bytes, or 0 for unlimited
-
enableCompression
public void enableCompression()Enable GZip compression for WAL entries.When enabled, entry data is compressed before writing to disk. This reduces disk I/O and storage requirements at a CPU cost.
-
disableCompression
public void disableCompression()Disable GZip compression for WAL entries.When disabled, entries are written uncompressed. Subsequent writes will not use compression.
-
checkpoint
Force an fsync and record the current checkpoint position.Checkpoints are used to track recovery points in the WAL.
- Throws:
IOException- if fsync failsIllegalStateException- if writer is closed
-
getCheckpoints
-
writeBatchAtomic
Write multiple entries atomically, all or nothing.If any entry fails to write, the entire batch is rolled back via truncation to the starting position. This ensures atomic semantics.
- Parameters:
entries- List of byte arrays to write- Throws:
IOException- if write fails or rollback failsIllegalStateException- if writer is closed
-
getWriterStats
Get current writer statistics including compression metrics.- Returns:
- WalWriterStats snapshot
-
setDurabilityLevel
Set the durability guarantee level for WAL operations.This controls how strictly the system enforces persistence: - NONE: No persistence enforcement (fastest, least safe) - WRITE: Entries must be written to disk (buffered, moderate safety) - FSYNC: Entries must be fsynced to disk (slowest, most safe)
- Parameters:
level- The desired durability level (must not be null)- Throws:
NullPointerException- if level is null
-
isInErrorState
public boolean isInErrorState()Check if the WAL is in an error state.Returns true if a write or sync operation has failed, indicating that the node should not acknowledge further operations until the error is resolved.
- Returns:
- true if WAL error has occurred, false otherwise
-
getLastWalError
Get the last WAL error message.- Returns:
- The error message, or null if no error has occurred
-
getLastWalErrorTime
public long getLastWalErrorTime()Get the timestamp when the last WAL error occurred (System.nanoTime()).- Returns:
- The nanotime of last error, or 0 if no error has occurred
-
clearErrorState
public void clearErrorState()Clear the WAL error state (typically after manual intervention/recovery).Thread-safe via ReentrantLock.
-
close
Close the WAL writer and release resources.Performs the following cleanup: 1. Flushes any pending fsync batch to disk 2. Cancels any scheduled batch timer tasks 3. Shuts down the batch executor (with 1-second timeout) 4. Closes the underlying FileChannel
The writer becomes unusable after closing. Thread-safe via ReentrantLock.
- Specified by:
closein interfaceAutoCloseable- Throws:
IOException- If fsync or channel close fails
-
setSyncOnCommit
public void setSyncOnCommit(boolean enabled) Set whether to sync on commit. Default is true for durability. Set to false only if caller manages fsync explicitly.- Parameters:
enabled- true to fsync on commit, false for performance (risky!)
-
truncate
Truncate the WAL file by atomically rewriting with only retained entries.This operation bounds WAL disk usage by discarding old entries that have been compacted or snapshotted. The operation is atomic to prevent data loss.
Process: 1. Close the current WAL file 2. Create a temporary WAL file with entries to keep (with checksums) 3. Atomically move the temp file over the old file 4. Re-open the file in append mode for subsequent writes
Thread-safe via ReentrantLock (held throughout the entire operation).
- Parameters:
entriesToKeep- List of entries to retain in the truncated WAL- Throws:
IOException- If any I/O operation fails (temp file creation, write, move, reopen)IllegalStateException- If writer is closed
-
writeBufferFully
- Throws:
IOException
-