
Central Limit Theorem

Grinstead & Snell · GFDL · PDF

Standardize the sum of n independent random variables: subtract its mean nμ, divide by σ√n. As n grows, the distribution of the result converges to N(0, 1). This is why the normal distribution appears everywhere.

The standardized sum

Let X₁, X₂, …, Xₙ be independent random variables with mean μ and variance σ². Their sum Sₙ = X₁ + … + Xₙ has mean nμ and variance nσ². The standardized sum is Sₙ* = (Sₙ − nμ) / (σ√n). The CLT says the distribution of Sₙ* converges to the standard normal.
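A minimal Python sketch of the definition above, using dice as the Xᵢ. The helpers `roll_die` and `standardized_sum` mirror the Scheme forms in the notation table below; the die parameters μ = 3.5 and σ² = 35/12 are standard facts about a fair die.

```python
import random
import statistics

def roll_die():
    """One fair die roll: mean mu = 3.5, variance sigma^2 = 35/12."""
    return random.randint(1, 6)

def standardized_sum(n, mu=3.5, sigma=(35 / 12) ** 0.5):
    """S_n* = (S_n - n*mu) / (sigma * sqrt(n))."""
    s_n = sum(roll_die() for _ in range(n))
    return (s_n - n * mu) / (sigma * n ** 0.5)

random.seed(42)  # fixed seed, as in the site's deterministic demo
samples = [standardized_sum(100) for _ in range(5000)]
print(round(statistics.mean(samples), 2))      # near 0
print(round(statistics.variance(samples), 2))  # near 1
```

The printed mean and variance land near 0 and 1, as the CLT predicts; a histogram of `samples` would trace the standard normal bell.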

[Figure: histogram of standardized sums Sₙ* of sample means against the N(0, 1) density, x-axis from −3 to 3]

Continuity correction

When approximating a discrete distribution with the continuous normal, shift by ½. To find P(Sₙ = k), compute P(k − ½ < Sₙ < k + ½) under the normal curve. This continuity correction dramatically improves accuracy for small n.
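The correction is easy to check against an exact discrete answer. A Python sketch for 100 fair coin flips (Φ computed from the error function; the coin example is an illustration, not the site's demo):

```python
from math import erf, sqrt, comb

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# 100 fair coin flips: mu = n*p = 50, sigma = sqrt(n*p*(1-p)) = 5
n, p, k = 100, 0.5, 50
mu, sigma = n * p, sqrt(n * p * (1 - p))

exact = comb(n, k) * p ** k * (1 - p) ** (n - k)
# Continuity correction: P(S = k) ≈ P(k − 1/2 < S < k + 1/2)
approx = phi((k + 0.5 - mu) / sigma) - phi((k - 0.5 - mu) / sigma)

print(round(exact, 4), round(approx, 4))  # agree to about 3 decimals
```

Without the half-unit shift, the naive continuous answer for P(S = 50) would be zero; with it, the normal curve recovers the binomial probability almost exactly.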


Why the normal distribution?

The normal distribution is not arbitrary. It is the maximum-entropy distribution for a given mean and variance. If all you know about a quantity is its mean and variance, the least-presumptuous distribution is normal. The CLT makes this concrete: sums of independent variables lose all structure except mean and variance, so what remains is the max-entropy distribution.
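The max-entropy claim can be checked numerically. The closed-form differential entropies below are standard results; each distribution is scaled to mean 0 and variance 1, and the normal comes out largest.

```python
from math import log, pi, e, sqrt

# Differential entropies (in nats) of three distributions,
# each with mean 0 and variance 1.
entropy_normal  = 0.5 * log(2 * pi * e)  # N(0, 1)
entropy_laplace = 1 + log(sqrt(2))       # Laplace(0, 1/sqrt(2))
entropy_uniform = log(sqrt(12))          # Uniform(-sqrt(3), sqrt(3))

print(round(entropy_normal, 3))   # largest
print(round(entropy_laplace, 3))
print(round(entropy_uniform, 3))  # smallest
```

Any other unit-variance distribution would likewise fall below the normal's ≈1.419 nats; that gap is what "least presumptuous" means quantitatively.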


Notation reference

Textbook                     Scheme                            Meaning
Sₙ = X₁ + … + Xₙ             (loop ... (+ total (roll-die)))   Sum of n trials
Sₙ* = (Sₙ − nμ) / (σ√n)      (standardized-sum n)              Standardized sum
Φ(z)                         (phi z)                           Standard normal CDF
N(0, 1)                      standard normal                   Mean 0, variance 1
Neighbors

Probability chapters

  • 🎰 Ch 8 — Law of Large Numbers (convergence of averages)
  • 🎰 Ch 10 — Generating Functions (another route to the CLT proof)
  • 🎰 Ch 7 — Sums of Random Variables (convolution)


Translation notes

The standardized sum here uses a fixed pseudo-random generator, so the "randomness" is deterministic. A real demonstration would run thousands of trials and plot the histogram. The continuity correction uses an approximation to Φ(z) from Abramowitz & Stegun, accurate to about 5 decimal places. The textbook proves the CLT via moment generating functions (Chapter 10). The entropy characterization follows from constrained optimization with Lagrange multipliers.
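For reference, one widely used Abramowitz & Stegun approximation to Φ(z) is formula 26.2.17, sketched below in Python. The handbook also has shorter, less accurate formulas, so the site's Scheme version may be a different one; the constants here are the published 26.2.17 values (|error| < 7.5 × 10⁻⁸).

```python
from math import erf, exp, pi, sqrt

def phi_as(z):
    """Φ(z) via Abramowitz & Stegun 26.2.17."""
    p = 0.2316419
    b = (0.319381530, -0.356563782, 1.781477937,
         -1.821255978, 1.330274429)
    t = 1.0 / (1.0 + p * abs(z))
    poly = sum(bi * t ** (i + 1) for i, bi in enumerate(b))
    tail = exp(-z * z / 2) / sqrt(2 * pi) * poly  # upper tail for |z|
    return 1 - tail if z >= 0 else tail

def phi_exact(z):
    """Reference value via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

for z in (-2.0, 0.0, 1.0, 1.96):
    print(z, round(phi_as(z), 6), round(phi_exact(z), 6))
```

The two columns agree to six decimal places, comfortably better than the "about 5 decimal places" the notes mention.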

Ready for the real thing? Read Chapter 9 of Grinstead & Snell.
