Inference for Proportions

OpenIntro Statistics · CC BY-SA 3.0 · Chapter 6

When data are counts or categories, proportions replace means. The Z-test for proportions uses the normal approximation to the binomial. The chi-square test extends this to tables with many categories.

One-proportion Z-test

Test whether a sample proportion p-hat differs from a hypothesized value p0. The test statistic is Z = (p-hat - p0) / sqrt(p0 * (1 - p0) / n), where n is the sample size. Under the null hypothesis, Z approximately follows a standard normal distribution when n is large enough (n * p0 and n * (1 - p0) both at least 10).
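
The calculation above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function name `one_proportion_z_test` and the sample numbers (60 successes in 100 trials, p0 = 0.5) are assumptions made up for the example, and the p-value is two-sided.

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion Z-test (hypothetical helper for illustration)."""
    # Success-failure condition from the text: n*p0 and n*(1 - p0) both >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("sample too small for the normal approximation")
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up data: 60 successes out of 100, tested against p0 = 0.5.
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
print(f"Z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

With these made-up numbers the statistic is about 2, so the result would be significant at the 5% level but not at the 1% level.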

Two-proportion Z-test

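Compare proportions from two independent groups. Under the null hypothesis p1 = p2, pool the data to estimate the common proportion, p-pool = (x1 + x2) / (n1 + n2), where x1 and x2 are the success counts and n1 and n2 the sample sizes. Then compute the standard error from the pooled value, SE = sqrt(p-pool * (1 - p-pool) * (1/n1 + 1/n2)), and the test statistic Z = (p-hat1 - p-hat2) / SE.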
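
A short Python sketch of the pooled test may make the steps concrete. The helper name `two_proportion_z_test` and the counts are hypothetical: the group sizes (200 each) are invented so that the sample proportions come out to 0.45 and 0.58, and the p-value is two-sided.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled two-proportion Z-test (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0: p1 = p2, both groups estimate one common proportion.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided tail area of the standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value, p_pool


# Invented counts: 90/200 = 0.45 in group 1, 116/200 = 0.58 in group 2.
z2, p_value2, p_pool = two_proportion_z_test(90, 200, 116, 200)
print(f"pooled = {p_pool:.3f}, Z = {z2:.3f}, p-value = {p_value2:.4f}")
```

Pooling is appropriate only for the hypothesis test of p1 = p2; a confidence interval for the difference would normally use the unpooled standard error instead.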

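[Figure: sample proportions for Group 1 (0.45) and Group 2 (0.58) on an axis running from 0.3 to 0.7, with the overlap between the two groups marked.]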

Chi-square goodness of fit

Test whether observed counts across k categories match expected proportions. The test statistic sums (observed - expected)^2 / expected across all cells. Under the null hypothesis, it approximately follows a chi-square distribution with k - 1 degrees of freedom, provided each expected count is at least 5.
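
As a rough illustration, the statistic and its degrees of freedom can be computed as below. The function name `chi_square_gof` and the die-roll counts are assumptions invented for the example. The sketch stops at the statistic because the Python standard library has no chi-square distribution function.

```python
def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit statistic and df (hypothetical helper)."""
    if len(observed) != len(expected_props):
        raise ValueError("need one expected proportion per category")
    n = sum(observed)
    # Expected count in each category under the null proportions.
    expected = [n * p for p in expected_props]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1 degrees of freedom
    return statistic, df


# Invented data: 60 rolls of a die, tested against equal proportions of 1/6.
rolls = [8, 12, 9, 11, 10, 10]
gof_stat, gof_df = chi_square_gof(rolls, [1 / 6] * 6)
print(f"chi-square = {gof_stat:.3f} on {gof_df} degrees of freedom")
```

To turn the statistic into a p-value, one option (if SciPy is installed) is `scipy.stats.chi2.sf(gof_stat, gof_df)`. Otherwise the statistic can be compared against a printed chi-square table.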

Chi-square test for independence

Test whether two categorical variables are associated. Arrange the data in a contingency table, compute each cell's expected count as (row total * column total) / grand total, then apply the chi-square formula. Degrees of freedom = (rows - 1)(cols - 1).
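
The expected-count step is the part most easily gotten wrong by hand, so a small Python sketch may help. The function name `chi_square_independence` and the 2x2 table are hypothetical, chosen so the arithmetic is easy to follow.

```python
def chi_square_independence(table):
    """Chi-square statistic, df, and expected counts for a contingency table.

    `table` is a list of rows of observed counts (hypothetical helper).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Invented 2x2 table: rows are groups, columns are outcomes.
counts = [[20, 30], [30, 20]]
ind_stat, ind_df, ind_expected = chi_square_independence(counts)
print(f"chi-square = {ind_stat:.3f} on {ind_df} degree(s) of freedom")
```

Here every expected count is 25, so each cell contributes (5)^2 / 25 = 1 and the statistic is 4 on 1 degree of freedom, which exceeds the 5% critical value of about 3.84.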

Notation reference

| Symbol | Formula | Meaning |
| --- | --- | --- |
| Z | (p̂ - p0) / SE | Test statistic for proportions |
| SE | sqrt(p0(1-p0)/n) | Standard error under H0 |
| p̂ | x / n | Sample proportion |
| χ² | Σ(O-E)²/E | Chi-square test statistic |
| df | (r-1)(c-1) | Degrees of freedom (independence) |
