Discrete Probability

Grinstead & Snell · GFDL · PDF

A probability function assigns a number between 0 and 1 to every outcome, and the numbers must add up to 1. That is the entire foundation. Everything else in probability is built on these two constraints.

Sample spaces and events

An experiment has a set of possible outcomes called the sample space (written Ω). An event is any subset of the sample space. Rolling a die: Ω = {1, 2, 3, 4, 5, 6}. "Rolling even" is the event {2, 4, 6}.

Scheme

; Sample space = list of outcomes
; Event = subset (filtered list)

(define die '(1 2 3 4 5 6))

(define (event? outcome)
  (= 0 (modulo outcome 2)))

(define even-event (filter event? die))

(display "sample space: ") (display die) (newline)
(display "even event:   ") (display even-event) (newline)
(display "size of event: ") (display (length even-event))

The probability function axioms

A probability function P assigns a weight to each outcome. Three axioms define it:

Non-negative: P(ω) ≥ 0 for every outcome ω
Sums to 1: the weights of all outcomes add to 1
Additive: for disjoint events A and B, P(A ∪ B) = P(A) + P(B)

The probability of an event is the sum of the probabilities of its outcomes.

Scheme

; Probability function: assign weights, check axioms

(define (make-prob-fn weights)
  ; weights is a list of (outcome . probability) pairs
  (lambda (outcome)
    (let ((pair (assoc outcome weights)))
      (if pair (cdr pair) 0))))

; Fair die: each outcome gets 1/6
(define fair-die
  (map (lambda (x) (cons x (/ 1 6))) '(1 2 3 4 5 6)))

(define P (make-prob-fn fair-die))

; Check axiom 1: non-negative
(display "P(1) = ") (display (P 1)) (newline)

; Check axiom 2: sums to 1
(define total (apply + (map cdr fair-die)))
(display "sum = ") (display total) (newline)

; Probability of an event = sum of outcome probabilities
(define (prob-event P outcomes)
  (apply + (map P outcomes)))

(display "P(even) = ")
(display (prob-event P '(2 4 6))) (newline)
(display "P(odd)  = ")
(display (prob-event P '(1 3 5)))

Uniform distribution on finite sets

When every outcome is equally likely, P(ω) = 1/n where n = |Ω|. The probability of an event E is just |E|/|Ω|. Counting favorable outcomes over total outcomes is the oldest probability calculation there is.

Scheme

; Uniform distribution: P(E) = |E| / |omega|
; This is the "classical" definition of probability

(define (uniform-prob event sample-space)
  (/ (length event) (length sample-space)))

; Two dice: sample space is all pairs
(define two-dice
  (let loop ((i 1) (pairs '()))
    (if (> i 6) pairs
        (loop (+ i 1)
              (append pairs
                (map (lambda (j) (list i j))
                     '(1 2 3 4 5 6)))))))

(display "total outcomes: ") (display (length two-dice)) (newline)

; Event: sum equals 7
(define sum-7 (filter (lambda (p) (= 7 (+ (car p) (cadr p)))) two-dice))
(display "ways to get 7: ") (display (length sum-7)) (newline)
(display "P(sum=7) = ") (display (uniform-prob sum-7 two-dice)) (newline)

; Event: doubles
(define doubles (filter (lambda (p) (= (car p) (cadr p))) two-dice))
(display "P(doubles) = ") (display (uniform-prob doubles two-dice))

Coin flips and repeated trials

Flip a fair coin n times. The sample space has 2^n outcomes, each equally likely. The number of sequences with exactly k heads is "n choose k" (a preview of Chapter 3). For 3 flips, the probability of exactly 2 heads is 3/8.

Scheme

; Coin flips: enumerate all sequences of length n
(define (coin-sequences n)
  (if (= n 0) '(())
      (let ((rest (coin-sequences (- n 1))))
        (append
          (map (lambda (s) (cons 'H s)) rest)
          (map (lambda (s) (cons 'T s)) rest)))))

(define flips3 (coin-sequences 3))
(display "all 3-flip sequences: ") (display (length flips3)) (newline)

; Count heads in a sequence
(define (count-heads seq)
  (length (filter (lambda (x) (eq? x 'H)) seq)))

; P(exactly 2 heads in 3 flips)
(define two-heads (filter (lambda (s) (= 2 (count-heads s))) flips3))
(display "sequences with 2 heads: ") (display two-heads) (newline)
(display "P(2 heads) = ") (display (/ (length two-heads) (length flips3)))

Notation reference

Textbook	Scheme	Meaning
Ω	die, two-dice	Sample space
E ⊆ Ω	(filter pred omega)	Event (subset)
P(ω)	(P outcome)	Probability of outcome
P(E) = \|E\|/\|Ω\|	(uniform-prob event omega)	Uniform probability
∑ P(ω) = 1	(apply + (map cdr weights))	Normalization axiom

Neighbors

Next chapters

🎰 Ch 2: Continuous Probability — what happens when the sample space is uncountable
🎰 Ch 3: Combinatorics — counting the outcomes that make uniform probabilities work
🎰 Ch 4: Conditional Probability — updating probabilities with new information

Paper pages

🍞 Fritz 2020 — Markov categories axiomatize probability with morphisms instead of measure theory
🍞 Staton 2025 — probabilistic programming composes distributions like programs

Foundations (Wikipedia)

Ready for the real thing? Read Ch 1 of Grinstead & Snell.

by june.kim Continuous Probability →