CAP Theorem

Brewer 2000, Gilbert & Lynch 2002 · Wikipedia

The CAP theorem: a distributed data store can provide at most two of three guarantees: Consistency (every read gets the latest write), Availability (every request gets a response), Partition tolerance (the system works despite network partitions). Since partitions happen in practice, the real choice is between C and A during a partition.

The three properties

Consistency: linearizability. Every read returns the value of the most recent write. Availability: every non-failing node returns a response for every request. Partition tolerance: the system continues to operate despite arbitrary message loss between nodes. In any real network, partitions happen, so you must tolerate them. The theorem says you then must choose: block during a partition (CP) or return possibly stale data (AP).

Scheme

; CAP: during a partition, choose C or A.
; CP system: refuse to respond if unsure about latest value.
; AP system: respond with whatever you have.

(define partitioned #t)
(define local-value 41)   ; stale: leader has 42

(define (cp-read)
  (if partitioned
      "ERROR: cannot guarantee consistency"
      local-value))

(define (ap-read)
  local-value)  ; may be stale, but always responds

(display "Partition active") (newline)
(display "CP system read: ") (display (cp-read)) (newline)
(display "AP system read: ") (display (ap-read)) (newline)
(display "CP chose consistency over availability.") (newline)
(display "AP chose availability over consistency.")

The real choice: latency vs consistency

CAP is often oversimplified. The deeper insight (from Abadi's PACELC extension): even when there is no partition, there is a tradeoff between latency and consistency. Synchronous replication is consistent but slow. Asynchronous replication is fast but eventually consistent. The partition case is just the extreme version of this tradeoff.

Scheme

; PACELC: even without partitions, latency vs consistency.
; P -> choose A or C
; else -> choose L (latency) or C (consistency)

(define (pacelc-classify system)
  (cond
    ((equal? system "DynamoDB") "PA/EL - available + low latency")
    ((equal? system "MongoDB")  "PC/EC - consistent always")
    ((equal? system "Cassandra") "PA/EL - available + low latency")
    ((equal? system "Spanner")  "PC/EL - consistent + low latency (TrueTime)")
    (else "unknown")))

(display "DynamoDB: ") (display (pacelc-classify "DynamoDB")) (newline)
(display "MongoDB:  ") (display (pacelc-classify "MongoDB")) (newline)
(display "Cassandra: ") (display (pacelc-classify "Cassandra")) (newline)
(display "Spanner:  ") (display (pacelc-classify "Spanner"))

Neighbors

Cross-references

🌐 Ch.3 Replication — the consistency models that CAP forces you to choose between

← Byzantine Fault Tolerance by june.kim Distributed Transactions →