When data are counts or categories, proportions replace means. The Z-test for proportions uses the normal approximation to the binomial. The chi-square test extends this to tables with many categories.
One-proportion Z-test
Test whether a sample proportion p-hat differs from a hypothesized value p0. The test statistic is Z = (p-hat - p0) / sqrt(p0 * (1 - p0) / n). Under the null, Z follows a standard normal distribution when n is large enough (np0 and n(1 - p0) both at least 10).
Scheme
; One-proportion Z-test
; H0: p = 0.5, Ha: p != 0.5
; Sample: 560 successes out of 1000
(define p-hat (/ 560 1000))
(define p0 0.5)
(define n 1000)
(define se (sqrt (/ (* p0 (- 1 p0)) n)))
(define z (/ (- p-hat p0) se))
(display "p-hat = ") (display (exact->inexact p-hat)) (newline)
(display "SE = ") (display se) (newline)
(display "Z = ") (display z) (newline)
(display "Reject H0 at alpha=0.05? ")
(display (if (> (abs z) 1.96) "Yes" "No"))
Python
# One-proportion Z-test
import math
p_hat = 560 / 1000
p0 = 0.5
n = 1000
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
print(f"p-hat = {p_hat}")
print(f"SE = {se:.4f}")
print(f"Z = {z:.2f}")
print(f"Reject H0 at alpha=0.05? {'Yes' if abs(z) > 1.96 else 'No'}")
Two-proportion Z-test
Compare proportions from two independent groups. Under the null hypothesis p1 = p2, pool the data to estimate the common proportion, then compute the standard error from the pooled value.
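The pooled calculation described above can be sketched in Python. The counts here (120 of 200 successes in group 1, 90 of 200 in group 2) are hypothetical, chosen only to illustrate the arithmetic:

```python
# Two-proportion Z-test (hypothetical sample counts)
import math

x1, n1 = 120, 200  # group 1: successes, sample size (assumed)
x2, n2 = 90, 200   # group 2: successes, sample size (assumed)

p1 = x1 / n1
p2 = x2 / n2

# Under H0: p1 = p2, pool all successes to estimate the common proportion
p_pool = (x1 + x2) / (n1 + n2)

# Standard error uses the pooled estimate for both groups
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

print(f"p1 = {p1}, p2 = {p2}, pooled = {p_pool}")
print(f"Z = {z:.2f}")
print(f"Reject H0 at alpha=0.05? {'Yes' if abs(z) > 1.96 else 'No'}")
```

Note that the pooled proportion is used only for the test's standard error; a confidence interval for p1 - p2 would use the unpooled proportions instead.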
Chi-square goodness-of-fit test
Test whether observed counts across k categories match expected proportions. The test statistic sums (observed - expected)^2 / expected across all cells. It follows a chi-square distribution with k - 1 degrees of freedom.
Scheme
; Chi-square goodness of fit
; Are dice rolls uniform? 60 rolls.
; Observed: 8, 12, 10, 14, 7, 9
(define observed (list 8 12 10 14 7 9))
(define expected (list 10 10 10 10 10 10))
; Chi-square = sum of (O - E)^2 / E
(define (chi-sq-term o e)
(/ (* (- o e) (- o e)) e))
(define chi-sq
(apply + (map chi-sq-term observed expected)))
(display "Observed: ") (display observed) (newline)
(display "Expected: ") (display expected) (newline)
(display "Chi-sq = ") (display (exact->inexact chi-sq)) (newline)
(display "df = 5") (newline)
; Critical value at alpha=0.05, df=5 is 11.07
(display "Reject H0? ")
(display (if (> chi-sq 11.07) "Yes" "No"))
Python
# Chi-square goodness of fit
observed = [8, 12, 10, 14, 7, 9]
expected = [10, 10, 10, 10, 10, 10]
chi_sq = sum((o - e)**2 / e for o, e in zip(observed, expected))
print(f"Chi-sq = {chi_sq:.2f}")
print("df = 5")
print("Critical value (alpha=0.05) = 11.07")
print(f"Reject H0? {'Yes' if chi_sq > 11.07 else 'No'}")
Chi-square test for independence
Test whether two categorical variables are associated. Arrange data in a contingency table, compute expected counts from row and column totals, then apply the chi-square formula. Degrees of freedom = (rows - 1)(cols - 1).
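The steps above can be sketched in Python for a small contingency table. The 2x3 table here is hypothetical, chosen so the expected counts come out to round numbers:

```python
# Chi-square test for independence on a 2x3 contingency table (hypothetical data)
table = [
    [20, 30, 25],  # row 1: observed counts
    [30, 20, 25],  # row 2: observed counts
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Expected count for cell (i, j) = (row i total) * (col j total) / grand total
chi_sq = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand
        chi_sq += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)
print(f"Chi-sq = {chi_sq:.2f}, df = {df}")
# Critical value at alpha=0.05, df=2 is 5.99
print(f"Reject H0? {'Yes' if chi_sq > 5.99 else 'No'}")
```

For this table every expected count is 25, the statistic works out to 4.0 on 2 degrees of freedom, and the null of independence is not rejected at alpha = 0.05.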