← back to distributed systems

Consensus

Fischer, Lynch, Paterson 1985 · Lamport 1998 · wpWikipedia

The consensus problem: get all non-faulty nodes to agree on a single value. The FLP impossibility result says no deterministic algorithm can guarantee consensus in an asynchronous system if even one node can crash. Paxos solves it anyway by being non-deterministic (it may not terminate, but if it does, it is correct).

The consensus problem

Each node proposes a value. All non-faulty nodes must decide the same value (agreement). The decided value must have been proposed by some node (validity). Each node decides at most once (termination, in practice). These three properties are deceptively simple. Getting all three with crash failures is the core challenge.

Scheme

FLP impossibility

wpFischer, Lynch, and Paterson proved in 1985 that no deterministic consensus protocol can guarantee both safety (agreement + validity) and liveness (termination) in an asynchronous system where even one process can crash. The proof constructs an execution where the algorithm is always one step away from deciding but never does. This is not a practical obstacle. It means you need randomization, timeouts, or partial synchrony assumptions.

Proposer Acceptor 1 Acceptor 2 Acceptor 3 Learner prepare/ accept accepted Paxos roles: propose, accept (majority), learn.

Paxos overview

wpPaxos uses two phases. Phase 1 (prepare): the proposer picks a number n, sends prepare(n) to acceptors. Acceptors promise not to accept anything below n and return any value they already accepted. Phase 2 (accept): the proposer sends accept(n, v) where v is the highest-numbered previously accepted value, or its own if none. Acceptors accept if they have not promised a higher number. A value is chosen when a majority of acceptors accept the same proposal.

Scheme
Neighbors

Cross-references

  • 🔑 Logic Ch.1 — arguments: consensus is distributed agreement, requiring valid reasoning under uncertainty
  • 🌐 Ch.5 Raft — a more understandable consensus algorithm