Markov Chains
Grinstead & Snell · GFDL · PDF
A Markov chain is a random process where the next state depends only on the current state, not the history. The transition matrix P encodes all the dynamics. Multiply P by itself to see the future. Ergodic chains converge to a unique stationary distribution.
Transition matrix
A Markov chain on states 1, …, n is defined by its transition matrix P, where Pᵢⱼ = P(next state = j | current state = i). Each row sums to 1. The distribution after k steps: multiply the initial distribution (as a row vector) by Pᵏ.
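A minimal sketch of the k-step computation, assuming a 2-state weather chain with made-up probabilities. The matrix is a list of rows, so `P[i][j]` is the transition probability i → j, mirroring the row-of-rows representation in the notation table below.

```python
def step(dist, P):
    """One step: multiply a distribution row vector by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def k_step(dist, P, k):
    """Distribution after k steps: apply P k times."""
    for _ in range(k):
        dist = step(dist, P)
    return dist

# Hypothetical 2-state weather chain (state 0 = rain, state 1 = sun).
P = [[0.7, 0.3],   # rain -> rain, rain -> sun
     [0.4, 0.6]]   # sun  -> rain, sun  -> sun

start = [1.0, 0.0]          # certainly raining today
print(k_step(start, P, 3))  # distribution three days out
```

Repeated `step` calls on a distribution are equivalent to forming Pᵏ once and multiplying: for a single start distribution the vector version is cheaper.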
Stationary distribution
A distribution π is stationary if πP = π. It is a left eigenvector of P with eigenvalue 1. For an ergodic chain (irreducible and aperiodic), the stationary distribution is unique, and every initial distribution converges to it. This is the ergodic theorem.
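The convergence claim can be sketched by power iteration, again on an assumed 2-state chain with made-up probabilities: for an ergodic chain, repeatedly applying P drives any starting distribution toward π, and the result can be checked against πP = π.

```python
def step(dist, P):
    """Multiply a distribution row vector by the transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical 2-state ergodic chain.
P = [[0.7, 0.3],
     [0.4, 0.6]]

pi = [0.5, 0.5]
for _ in range(100):   # plenty of iterations for this small chain
    pi = step(pi, P)

print(pi)  # approaches [4/7, 3/7]
assert all(abs(p - q) < 1e-9 for p, q in zip(pi, step(pi, P)))  # pi P = pi
```

The iteration count needed depends on the chain's second-largest eigenvalue (here 0.3, so convergence is fast); a real implementation would iterate until successive distributions agree to tolerance.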
Absorbing chains
A state is absorbing if, once entered, the chain never leaves. An absorbing chain has at least one absorbing state, and from every state some absorbing state is reachable. The key question: starting from a transient state, how many steps until absorption? The fundamental matrix N = (I − Q)⁻¹ answers this, where Q is the submatrix of transitions among transient states: Nᵢⱼ is the expected number of visits to transient state j starting from i, and the row sums of N give the expected number of steps to absorption.
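A sketch of the fundamental matrix computed via the geometric series N = I + Q + Q² + …, truncated once the remaining terms are negligible. The example chain is an assumption: a random walk on {0, 1, 2, 3, 4} with absorbing endpoints, so the transient states are 1, 2, 3 and Q is 3×3.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def fundamental(Q, terms=200):
    """Approximate N = (I - Q)^-1 by summing the geometric series."""
    n = len(Q)
    N = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # I
    Qk = [row[:] for row in N]  # running power of Q, starts at Q^0 = I
    for _ in range(terms):
        Qk = mat_mul(Qk, Q)
        N = [[N[i][j] + Qk[i][j] for j in range(n)] for i in range(n)]
    return N

# Transitions among the transient states 1, 2, 3 of the assumed walk:
# from each, step left or right with probability 1/2.
Q = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.0]]

N = fundamental(Q)
steps = [sum(row) for row in N]  # row sums = expected steps to absorption
print(steps)                     # ~ [3.0, 4.0, 3.0]
```

The truncated series converges because the spectral radius of Q is below 1 for any absorbing chain; a production implementation would invert I − Q directly instead of summing.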
Notation reference
| Textbook | Scheme | Meaning |
|---|---|---|
| Pᵢⱼ | (list-ref (list-ref P i) j) | Transition probability i → j |
| πP = π | (dot pi-vec col) | Stationary distribution |
| N = (I − Q)⁻¹ | fundamental matrix | Expected visits to transient states |
| Pᵏ | (loop ... (step d P) ...) | k-step transition matrix |
Neighbors
Probability chapters
- 🎰 Ch 12 — Random Walks (Markov chains on the integers)
- 🎰 Ch 4 — Conditional Probability (the Markov property is a conditional independence statement)
- 🤖 ML Ch.10 — sequence models extend Markov chains with learned transitions
- 🏛️ Soar — cognitive cycles have a Markov structure
- ⚙ Algorithms Ch.7 — shortest paths on graphs generalize Markov chain state transitions
Connections
- 🍞 Fritz 2020 — Markov categories: the abstract structure where Markov chains live as morphisms
- 🍞 Staton 2025 — probabilistic programming composes Markov kernels the same way
- Panangaden 2009 — labelled Markov processes: continuous-state generalization with bisimulation
Translation notes
The weather example is a 2-state ergodic chain. Real Markov chains can have hundreds of states, but the algebra is the same: matrix powers for multi-step transitions, eigenvector for stationary distribution. The absorbing chain computation uses the fundamental matrix, which the textbook derives via geometric series: N = I + Q + Q² + … = (I − Q)⁻¹. Fritz 2020 shows that the Markov property (future independent of past given present) is exactly the composition law in a Markov category.
The Handshake