Central Limit Theorem
Grinstead & Snell · GFDL · PDF
Standardize the sum of n independent, identically distributed random variables: subtract its mean nμ, divide by its standard deviation σ√n. As n grows, the distribution of the standardized sum converges to N(0, 1). This is why the normal distribution appears everywhere.
The standardized sum
Let X₁, X₂, …, Xₙ be independent, identically distributed random variables, each with mean μ and variance σ². Their sum Sₙ = X₁ + … + Xₙ has mean nμ and variance nσ². The standardized sum is Sₙ* = (Sₙ − nμ) / (σ√n). The CLT says the distribution of Sₙ* converges to the standard normal N(0, 1).
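A minimal Scheme sketch of the standardized sum for die rolls, using the `roll-die` and `standardized-sum` names from the notation table below (the `random` procedure is assumed, as in MIT Scheme; it is not standard R7RS):

```scheme
;; A fair die has mean 7/2 and variance 35/12.
(define (roll-die)                       ; one uniform draw from {1, ..., 6}
  (+ 1 (random 6)))                      ; `random` assumed, as in MIT Scheme

(define (sum-of-rolls n)                 ; Sn = X1 + ... + Xn
  (let loop ((k n) (total 0))
    (if (= k 0)
        total
        (loop (- k 1) (+ total (roll-die))))))

(define (standardized-sum n)             ; Sn* = (Sn - n*mu) / (sigma * sqrt n)
  (let ((mu 7/2)
        (sigma (sqrt 35/12)))
    (/ (- (sum-of-rolls n) (* n mu))
       (* sigma (sqrt n)))))
```

For large n, repeated calls to `(standardized-sum n)` produce values whose histogram approaches the standard normal curve.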
Continuity correction
When approximating an integer-valued discrete distribution with the continuous normal, widen each point to a unit interval: to estimate P(Sₙ = k), compute the area P(k − ½ < Sₙ < k + ½) under the normal curve. This continuity correction dramatically improves accuracy for small n.
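The correction is a two-liner once a `phi` procedure for the standard normal CDF is available (as in the notation table below); a sketch:

```scheme
;; P(Sn = k) ~ Phi((k + 1/2 - n*mu)/(sigma*sqrt n))
;;           - Phi((k - 1/2 - n*mu)/(sigma*sqrt n))
;; Assumes a `phi` procedure for the standard normal CDF.
(define (prob-sum-equals n mu sigma k)
  (let ((scale (* sigma (sqrt n))))
    (- (phi (/ (- (+ k 1/2) (* n mu)) scale))
       (phi (/ (- (- k 1/2) (* n mu)) scale)))))
```

For example, for 100 fair coin flips (μ = ½, σ = ½), `(prob-sum-equals 100 1/2 1/2 50)` gives about 0.0797, against the exact value C(100, 50)/2¹⁰⁰ ≈ 0.0796.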
Why the normal distribution?
The normal distribution is not arbitrary. It is the maximum-entropy distribution for a given mean and variance. If all you know about a quantity is its mean and variance, the least-presumptuous distribution is normal. The CLT makes this concrete: sums of independent variables lose all structure except mean and variance, so what remains is the max-entropy distribution.
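The constrained optimization behind this claim can be sketched: maximize differential entropy subject to normalization, fixed mean, and fixed variance, using Lagrange multipliers.

```latex
% Maximize h(f) = -\int f \ln f \, dx subject to the three constraints:
\mathcal{L}[f] = -\int f \ln f \, dx
  + \lambda_0 \Big( \int f \, dx - 1 \Big)
  + \lambda_1 \Big( \int x f \, dx - \mu \Big)
  + \lambda_2 \Big( \int (x-\mu)^2 f \, dx - \sigma^2 \Big)

% Stationarity in f gives
-\ln f(x) - 1 + \lambda_0 + \lambda_1 x + \lambda_2 (x-\mu)^2 = 0
\quad\Longrightarrow\quad
f(x) \propto e^{\lambda_1 x + \lambda_2 (x-\mu)^2}

% Enforcing the constraints forces \lambda_1 = 0 and
% \lambda_2 = -1/(2\sigma^2), i.e. f is the density of N(\mu, \sigma^2).
```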
Notation reference
| Textbook | Scheme | Meaning |
|---|---|---|
| Sₙ = X₁ + … + Xₙ | (loop ... (+ total (roll-die))) | Sum of n trials |
| Sₙ* = (Sₙ − nμ) / (σ√n) | (standardized-sum n) | Standardized sum |
| Φ(z) | (phi z) | Standard normal CDF |
| N(0, 1) | standard normal | Mean 0, variance 1 |
Neighbors
Probability chapters
- 🎰 Ch 8 — Law of Large Numbers (convergence of averages)
- 🎰 Ch 10 — Generating Functions (another route to the CLT proof)
- 🎰 Ch 7 — Sums of Random Variables (convolution)
Connections
- Baez & Fritz 2011 — entropy characterization: the normal maximizes entropy for fixed mean and variance
- ∞ Lebl Ch.2 Sequences — the convergence theory underlying the CLT
Translation notes
The standardized sum here uses a fixed pseudo-random generator, so the "randomness" is deterministic. A real demonstration would run thousands of trials and plot the histogram. The continuity correction uses an approximation to Φ(z) from Abramowitz & Stegun, accurate to about 5 decimal places. The textbook proves the CLT via moment generating functions (Chapter 10). The entropy characterization follows from constrained optimization with Lagrange multipliers.
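The chapter's own Φ approximation isn't reproduced here; one standard Abramowitz & Stegun choice is the erf polynomial 7.1.26 (maximum error about 1.5 × 10⁻⁷), which would look like:

```scheme
;; Sketch of `phi` via the A&S 7.1.26 erf polynomial; the approximation
;; actually used in the translation may differ.
(define (erf-approx x)
  (let* ((t (/ 1.0 (+ 1.0 (* 0.3275911 (abs x)))))
         (poly (* t (+ 0.254829592
                       (* t (+ -0.284496736
                               (* t (+ 1.421413741
                                       (* t (+ -1.453152027
                                               (* t 1.061405429))))))))))
         (y (- 1.0 (* poly (exp (- (* x x)))))))
    (if (< x 0) (- y) y)))

(define (phi z)                          ; standard normal CDF
  (* 0.5 (+ 1.0 (erf-approx (/ z (sqrt 2.0))))))
```

Sanity checks: `(phi 0)` is 0.5 exactly, and `(phi 1.96)` is close to 0.975.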