When you add independent random variables, means add, variances add, and the distribution of the sum is the convolution of the individual distributions. The sum gets wider and more bell-shaped.
Convolution
If X and Y are independent discrete random variables, the PMF of Z = X + Y is the convolution: P(Z = z) = sum over k of P(X = k) P(Y = z - k). Each possible way the parts can add to z contributes to the total. This is why adding dice produces the familiar triangle-shaped distribution.
Scheme
; Convolution: P(Z = z) = sum_k P(X = k) * P(Y = z - k)
; Adding two dice: distribution of the sum
(define (convolve pmf-x pmf-y max-x max-y)
  ; Returns a list of (value . probability) for Z = X + Y,
  ; assuming both variables take values 1 .. max
  (let loop ((z 2) (result '()))
    (if (> z (+ max-x max-y))
        (reverse result)
        (let inner ((k 1) (prob 0))
          (if (> k max-x)
              (loop (+ z 1) (cons (cons z prob) result))
              (let ((y-val (- z k)))
                (if (or (< y-val 1) (> y-val max-y))
                    (inner (+ k 1) prob)
                    (inner (+ k 1)
                           (+ prob (* (pmf-x k) (pmf-y y-val)))))))))))
; Fair die PMF
(define (fair-die k)
  (if (and (>= k 1) (<= k 6)) (/ 1.0 6) 0))
(define two-dice (convolve fair-die fair-die 6 6))
(display "Sum of two dice:") (newline)
(for-each (lambda (pair)
            (display "P(Z=") (display (car pair)) (display ") = ")
            (display (cdr pair)) (newline))
          two-dice)
Python
# Convolution: sum of two dice
def convolve(pmf_x, pmf_y, max_x, max_y):
    result = {}
    for x in range(1, max_x + 1):
        for y in range(1, max_y + 1):
            z = x + y
            result[z] = result.get(z, 0) + pmf_x(x) * pmf_y(y)
    return result

fair = lambda k: 1/6 if 1 <= k <= 6 else 0
two_dice = convolve(fair, fair, 6, 6)
for z in sorted(two_dice):
    bar = "#" * int(two_dice[z] * 100)
    print(f"P(Z={z:2d}) = {two_dice[z]:.4f} {bar}")
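The opening claim, that the sum gets wider and more bell-shaped as you add more variables, can be checked by convolving the die PMF with itself repeatedly. Here is a self-contained sketch using dict-based PMFs (a hypothetical helper `convolve_pmfs`, not the function-based `convolve` above); the support widens and the tallest bar shrinks as the mass spreads out:

```python
def convolve_pmfs(pa, pb):
    """Convolution of two dict PMFs: P(Z=z) = sum_k pa[k] * pb[z-k]."""
    out = {}
    for a, p in pa.items():
        for b, q in pb.items():
            out[a + b] = out.get(a + b, 0.0) + p * q
    return out

die = {k: 1 / 6 for k in range(1, 7)}
dist = die
for n in range(2, 6):              # sums of 2, 3, 4, 5 dice
    dist = convolve_pmfs(dist, die)
    peak = max(dist.values())      # tallest bar shrinks as mass spreads
    print(f"{n} dice: support {min(dist)}..{max(dist)}, "
          f"peak prob {peak:.4f}")
```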
Means add
E[X + Y] = E[X] + E[Y]. This is linearity of expectation from Ch 6, and it holds whether or not X and Y are independent. Roll two dice: E[X + Y] = 3.5 + 3.5 = 7. Roll a hundred dice: E[sum] = 350.
Scheme
; E[X + Y] = E[X] + E[Y] -- always
(define (expected-die)
  (/ (+ 1 2 3 4 5 6) 6.0))
(display "E[one die] = ") (display (expected-die)) (newline)
(display "E[two dice] = ") (display (* 2 (expected-die))) (newline)
(display "E[100 dice] = ") (display (* 100 (expected-die))) (newline)
; Works for any random variables, not just dice:
; E[height + weight] = E[height] + E[weight]
; even though height and weight are correlated
(display "Linearity does not require independence.")
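To make the no-independence-needed point concrete, take the most extreme dependence possible, Y = X, and check that expectations still add. A minimal sketch (the dependence on height/weight data is replaced by this toy case):

```python
# Linearity needs no independence: let Y = X (total dependence).
# E[X + Y] computed directly still equals E[X] + E[Y].
outcomes = range(1, 7)                 # one fair die
ex = sum(outcomes) / 6                 # E[X] = 3.5
ez = sum(x + x for x in outcomes) / 6  # E[X + Y] with Y = X
print(ex + ex, ez)                     # 7.0 7.0
```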
Variances add when independent
Var(X + Y) = Var(X) + Var(Y), but only when X and Y are independent. If they are correlated, a covariance term appears: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X,Y). For independent variables, covariance is zero, so variances add cleanly.
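A quick numeric check of the covariance correction, again using Y = X so that Cov(X, Y) = Var(X). The naive independent formula would predict 2 Var(X); the full formula gives Var(X) + Var(X) + 2 Var(X) = 4 Var(X), and direct computation agrees:

```python
# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), with Y = X.
xs = range(1, 7)
mu = sum(xs) / 6
var_x = sum((x - mu) ** 2 for x in xs) / 6            # 35/12, about 2.917
# X + Y = 2X, so compute Var(2X) directly over the six outcomes
var_sum = sum((2 * x - 2 * mu) ** 2 for x in xs) / 6
print(var_sum, 4 * var_x)              # equal: 4 Var(X), not 2 Var(X)
```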
Theory says: means add, variances add. A direct check confirms it: enumerate all 36 equally likely outcomes of two dice, compute the mean and variance of the sum, and verify they match the formulas.
Scheme
; Verify convolution properties by brute force
; Enumerate all 36 outcomes of two dice
(define (all-sums)
  (let loop-x ((x 1) (sums '()))
    (if (> x 6)
        sums
        (let loop-y ((y 1) (s sums))
          (if (> y 6)
              (loop-x (+ x 1) s)
              (loop-y (+ y 1) (cons (+ x y) s)))))))
(define sums (all-sums))
(define n (length sums))
(define (list-sum lst)
  (if (null? lst) 0 (+ (car lst) (list-sum (cdr lst)))))
(define mean (/ (list-sum sums) n))
(define var-val
  (/ (list-sum (map (lambda (s) (* (- s mean) (- s mean))) sums)) n))
(display "Enumerated mean of X+Y: ") (display (* 1.0 mean)) (newline)
(display "Expected (3.5 + 3.5): 7.0") (newline)
(display "Enumerated variance: ") (display (* 1.0 var-val)) (newline)
(display "Expected (2.917+2.917): 5.833")
Python
# Verify by enumeration
sums = [x + y for x in range(1, 7) for y in range(1, 7)]
mean = sum(sums) / len(sums)
var = sum((s - mean)**2 for s in sums) / len(sums)
print(f"Mean of X+Y: {mean:.4f} (expected 7.0)")
print(f"Var of X+Y: {var:.4f} (expected 5.833)")
print(f"SD of X+Y: {var**0.5:.4f}")
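The enumeration above is exact; a Monte Carlo cross-check should land close to the same numbers. A sketch using Python's random module (seed chosen arbitrarily for reproducibility):

```python
import random

random.seed(1)
n = 100_000
samples = [random.randint(1, 6) + random.randint(1, 6) for _ in range(n)]
m = sum(samples) / n
v = sum((s - m) ** 2 for s in samples) / n
print(f"simulated mean {m:.3f} (theory 7.000)")
print(f"simulated var  {v:.3f} (theory 5.833)")
```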
Notation reference
Notation                          Scheme                        Meaning
P(Z = z) = ∑_k P(X=k) P(Y=z-k)    (convolve pmf-x pmf-y ...)    Convolution of PMFs
E[X+Y] = E[X] + E[Y]              (+ mu-x mu-y)                 Means always add
Var(X+Y) = Var(X) + Var(Y)        (+ var-x var-y)               Variances add (if independent)
Cov(X,Y) = E[(X-μx)(Y-μy)]        —                             Covariance: dependence correction
Neighbors
Probability chapters
🎰 Ch 6 — expected value and variance (prerequisites for this chapter)
🎰 Ch 5 — the named distributions that get convolved here
🎰 Ch 8 — law of large numbers (why sums of many variables become predictable)