🌐 Distributed Systems
Based on Wikipedia (CC BY-SA 4.0) and Cambridge CS course notes (CC BY-SA).
What happens when you split a program across machines that can fail independently, have no shared clock, and communicate by sending messages that might get lost.
| # | Chapter | What it covers | |
|---|---|---|---|
| 1. | What Makes It Hard | Network partitions, partial failure, no global clock, Byzantine faults | 🌐 |
| 2. | Time and Clocks | Physical clocks drift; logical clocks capture causality without synchronization | 🌐 |
| 3. | Replication | Copy data across nodes for fault tolerance, then fight consistency | 🌐 |
| 4. | Consensus | Getting nodes to agree on a value, and why it is impossible in an asynchronous system with even one faulty node (FLP) | 🌐 |
| 5. | Raft | Leader election and log replication made understandable | 🌐 |
| 6. | Byzantine Fault Tolerance | When nodes can lie: 3f+1 to tolerate f traitors | 🌐 |
| 7. | CAP Theorem | Consistency, availability, partition tolerance: during a partition, pick one of the first two | 🌐 |
| 8. | Distributed Transactions | Two-phase commit, three-phase commit, and the saga pattern | 🌐 |
| 9. | Gossip Protocols | Epidemic dissemination: tell two friends, who tell two friends | 🌐 |
| 10. | CRDTs | Eventual consistency without consensus, via join-semilattices | 🌐 |
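Chapter 2's central idea, Lamport's logical clock, fits in a few lines. A minimal sketch (class and method names are illustrative, not from any particular library): each node keeps a counter, increments it on every local event and send, and on receive takes the max of its own counter and the message's timestamp before incrementing. If event a happens-before event b, then clock(a) < clock(b); the converse does not hold.

```python
# Minimal Lamport logical clock sketch. Names are illustrative.

class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: just increment.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; attach the timestamp to the message.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # On receive: jump past both our clock and the sender's, then tick.
        self.time = max(self.time, msg_time) + 1
        return self.time

a = LamportClock()
b = LamportClock()
t_send = a.send()           # a's clock advances to 1
t_recv = b.receive(t_send)  # b's clock becomes max(0, 1) + 1 = 2
assert t_send < t_recv      # happens-before implies smaller timestamp
```

Note that no physical time is involved: the ordering is derived purely from message flow, which is why it needs no clock synchronization.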
📺 Video lectures: Martin Kleppmann: Distributed Systems
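To make chapter 10's "join-semilattice" claim concrete, here is a sketch of the simplest CRDT, a grow-only counter (G-Counter); the class is illustrative, not from a specific library. Each replica increments only its own slot, and merge takes the element-wise max, which is the lattice join: commutative, associative, and idempotent, so replicas converge regardless of merge order.

```python
# Sketch of a grow-only counter (G-Counter) CRDT. Illustrative names.

class GCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count for that replica

    def increment(self, n=1):
        # A replica may only ever increment its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        # The counter's value is the sum over all replicas' slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is the join of the semilattice:
        # merges commute, associate, and are idempotent.
        for rid, c in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), c)

a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5  # replicas converge with no consensus round
```

Because merge is a join, any gossip schedule that eventually delivers every state to every replica yields the same final value, which is exactly eventual consistency without consensus.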
Neighbors
- ⚙ Algorithms — consensus algorithms are graph algorithms on distributed state
- 🎰 Probability — probabilistic guarantees and gossip protocols
- 🗄 Databases — distributed transactions belong to both fields
- 🔐 Cryptography — Byzantine fault tolerance and authenticated channels