Raft is a consensus algorithm designed to be understandable. It decomposes consensus into three subproblems: leader election, log replication, and safety. It provides the same guarantees as Paxos but with a clearer structure. One leader per term. The leader appends entries to its log and replicates them. An entry is committed when a majority of nodes have it.
Leader election
Nodes start as followers. If a follower receives no heartbeat from a leader within a timeout, it becomes a candidate and starts an election. It increments the term, votes for itself, and requests votes from others. A node wins if it gets a majority. If two candidates split the vote, a new election starts with a new term. Randomized timeouts prevent repeated ties.
The leader receives client requests and appends them to its log. It sends AppendEntries RPCs to all followers. Once a majority has the entry, the leader commits it and responds to the client. Followers apply committed entries to their state machines in log order. If a follower is behind, the leader sends missing entries.
Scheme
; Raft log replication.; Leader appends, replicates, commits on majority.
(define leader-log (list))
(define f1-log (list))
(define f2-log (list))
(define commit-index 0)
(define (append-entry entry)
(set! leader-log (append leader-log (list entry)))
(display "Leader appends: ") (display entry) (newline))
(define (replicate-to-follower follower-name)
; Send all entries after follower's last index
(cond
((equal? follower-name "F1")
(set! f1-log leader-log))
((equal? follower-name "F2")
(set! f2-log leader-log)))
(display " Replicated to ") (display follower-name) (newline))
(define (try-commit index)
; Count how many have entry at this index
(let ((count (+ 1; leader always has it
(if (>= (length f1-log) index) 10)
(if (>= (length f2-log) index) 10))))
(if (>= count 2)
(begin (set! commit-index index)
(display "Committed index ") (display index) (newline))
(display "Not yet committed"))))
(append-entry "x=1")
(replicate-to-follower "F1")
(replicate-to-follower "F2")
(try-commit 1)
(append-entry "y=2")
(replicate-to-follower "F1")
; F2 is slow, has not received y=2 yet
(try-commit 2) ; 2 of 3 have it -> committed
Python
# Raft log replicationclass RaftNode:
def __init__(self, name):
self.name = name
self.log = []
leader = RaftNode("Leader")
followers = [RaftNode("F1"), RaftNode("F2")]
commit_index = 0def append_entry(entry):
leader.log.append(entry)
print(f"Leader appends: {entry}")
def replicate(follower):
follower.log = leader.log[:]
print(f" Replicated to {follower.name}")
def try_commit(index):
global commit_index
count = 1 + sum(1for f in followers iflen(f.log) >= index)
if count >= 2:
commit_index = index
print(f"Committed index {index}")
else:
print(f"Not yet committed (only {count}/3)")
append_entry("x=1")
for f in followers:
replicate(f)
try_commit(1)
append_entry("y=2")
replicate(followers[0]) # F2 is slow
try_commit(2)
Safety and liveness
Safety: if an entry is committed, no future leader will have a different entry at that index. Raft guarantees this via the election restriction: a candidate must have a log at least as up-to-date as a majority. Liveness: the system makes progress as long as a majority of nodes and the network are working. Randomized election timeouts prevent livelock.
Neighbors
Cross-references
🌐 Ch.4 Consensus — the general consensus problem and FLP impossibility