The CAP theorem: a distributed data store can provide at most two of three guarantees: Consistency (every read gets the latest write), Availability (every request gets a response), Partition tolerance (the system works despite network partitions). Since partitions happen in practice, the real choice is between C and A during a partition.
The three properties
Consistency: linearizability. Every read returns the value of the most recent write. Availability: every non-failing node returns a response for every request. Partition tolerance: the system continues to operate despite arbitrary message loss between nodes. In any real network, partitions happen, so you must tolerate them. The theorem says you then must choose: block during a partition (CP) or return possibly stale data (AP).
Scheme
; CAP: during a partition, choose C or A.; CP system: refuse to respond if unsure about latest value.; AP system: respond with whatever you have.
(define partitioned #t)
(define local-value 41) ; stale: leader has 42
(define (cp-read)
(if partitioned
"ERROR: cannot guarantee consistency"
local-value))
(define (ap-read)
local-value) ; may be stale, but always responds
(display "Partition active") (newline)
(display "CP system read: ") (display (cp-read)) (newline)
(display "AP system read: ") (display (ap-read)) (newline)
(display "CP chose consistency over availability.") (newline)
(display "AP chose availability over consistency.")
Python
# CAP tradeoff during partition
partitioned = True
local_value = 41# stale: leader updated to 42def cp_read():
if partitioned:
raiseException("Cannot guarantee consistency")
return local_value
def ap_read():
return local_value # stale but responsiveprint("Partition active")
try:
print(f"CP read: {cp_read()}")
exceptExceptionas e:
print(f"CP read: {e}")
print(f"AP read: {ap_read()}")
The real choice: latency vs consistency
CAP is often oversimplified. The deeper insight (from Abadi's PACELC extension): even when there is no partition, there is a tradeoff between latency and consistency. Synchronous replication is consistent but slow. Asynchronous replication is fast but eventually consistent. The partition case is just the extreme version of this tradeoff.
Scheme
; PACELC: even without partitions, latency vs consistency.; P -> choose A or C; else -> choose L (latency) or C (consistency)
(define (pacelc-classify system)
(cond
((equal? system "DynamoDB") "PA/EL - available + low latency")
((equal? system "MongoDB") "PC/EC - consistent always")
((equal? system "Cassandra") "PA/EL - available + low latency")
((equal? system "Spanner") "PC/EL - consistent + low latency (TrueTime)")
(else"unknown")))
(display "DynamoDB: ") (display (pacelc-classify "DynamoDB")) (newline)
(display "MongoDB: ") (display (pacelc-classify "MongoDB")) (newline)
(display "Cassandra: ") (display (pacelc-classify "Cassandra")) (newline)
(display "Spanner: ") (display (pacelc-classify "Spanner"))
Neighbors
Cross-references
🌐 Ch.3 Replication — the consistency models that CAP forces you to choose between