Multiple regression uses several predictors to model a continuous outcome. Logistic regression models a binary outcome using the sigmoid function. Both extend simple regression: one by adding predictors, the other by changing the link function.
Multiple predictors
The model y = b0 + b1*x1 + b2*x2 + ... + bp*xp uses p predictors. Each coefficient bi represents the expected change in y for a one-unit increase in xi, holding all other predictors constant. This "holding constant" interpretation is what distinguishes it from running p separate simple regressions, whose coefficients absorb the effects of any omitted, correlated predictors.
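A small numeric illustration of the "holding constant" interpretation (the coefficients b0 = 2, b1 = 3, b2 = -1.5 are assumed for illustration, not fitted from data):

```python
# Hypothetical fitted model: y = 2 + 3*x1 - 1.5*x2 (coefficients assumed)
b0, b1, b2 = 2.0, 3.0, -1.5
predict = lambda x1, x2: b0 + b1 * x1 + b2 * x2

# Holding x2 fixed at 4, a one-unit increase in x1
# changes the prediction by exactly b1
print(predict(6, 4) - predict(5, 4))  # 3.0
```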
R-squared always increases when you add a predictor, even a useless one. Adjusted R-squared penalizes for the number of predictors: R-adj = 1 - (SSE/(n-p-1)) / (SST/(n-1)). It only increases when the new predictor improves the model more than chance would predict.
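A sketch of the penalty at work, using made-up sums of squares (n = 50 observations, SST = 200, SSE = 80 are assumed for illustration):

```python
# Adjusted R-squared penalizes for the number of predictors p
def adj_r_squared(sse, sst, n, p):
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))

n, sst, sse = 50, 200.0, 80.0     # assumed numbers for illustration
r_squared = 1 - sse / sst         # 0.60 no matter how many predictors
print(f"R^2            = {r_squared:.3f}")
print(f"adj R^2, p=3:    {adj_r_squared(sse, sst, n, 3):.3f}")
print(f"adj R^2, p=10:   {adj_r_squared(sse, sst, n, 10):.3f}")  # same fit, more predictors -> lower
```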
When predictors are correlated with each other, individual coefficients become unstable: large standard errors, sign flips, sensitivity to which observations are included. The variance inflation factor (VIF) quantifies this: VIF = 1 / (1 - R^2_j), where R^2_j is from regressing predictor j on all other predictors. VIF above 5-10 signals trouble.
Scheme
; Variance Inflation Factor
; VIF = 1 / (1 - R^2_j)
; R^2_j = how well predictor j is predicted by other predictors
; If x1 and x2 have correlation r = 0.9:
; R^2 of x1 ~ x2 is roughly r^2 = 0.81
(define (vif r-squared) (/ 1 (- 1 r-squared)))
(display "Correlation between predictors -> VIF:") (newline)
(display " r = 0.0: VIF = ") (display (vif 0.00)) (newline)
(display " r = 0.5: VIF = ") (display (vif 0.25)) (newline)
(display " r = 0.7: VIF = ") (display (vif 0.49)) (newline)
(display " r = 0.9: VIF = ") (display (vif 0.81)) (newline)
(display " r = 0.95: VIF = ") (display (vif 0.9025)) (newline)
(display "Rule of thumb: VIF > 5 is concerning")
Python
# VIF for different correlations
vif = lambda r_sq: 1 / (1 - r_sq)
for r in [0.0, 0.5, 0.7, 0.9, 0.95]:
    print(f"r = {r}: VIF = {vif(r**2):.2f}")
Logistic regression
For a binary outcome (0 or 1), linear regression can predict probabilities outside 0-1. Logistic regression fixes this by modeling the log-odds: log(p/(1-p)) = b0 + b1*x. Solving for p gives the sigmoid: p = 1 / (1 + e^-(b0+b1*x)). The coefficients represent changes in log-odds, not in probability.
Scheme
; Logistic regression: the sigmoid function
; log(p / (1-p)) = b0 + b1*x
; p = 1 / (1 + e^(-(b0 + b1*x)))
(define b0 -4)
(define b1 0.8)
(define (sigmoid z) (/ 1 (+ 1 (exp (- z)))))
(define (predict-prob x) (sigmoid (+ b0 (* b1 x))))
(display "P(y=1) at different x values:") (newline)
(define test-xs (list 0 2 4 5 6 8 10))
(for-each (lambda (x)
            (display " x = ") (display x)
            (display " -> p = ") (display (predict-prob x))
            (newline))
          test-xs)
; At x=5, the log-odds = b0 + b1*5 = -4 + 4 = 0
; So p = 0.5: this is the decision boundary
(display "Decision boundary at x = ")
(display (/ (- b0) b1))
Python
# Logistic regression
import math
b0, b1 = -4, 0.8
sigmoid = lambda z: 1 / (1 + math.exp(-z))
predict = lambda x: sigmoid(b0 + b1 * x)
print("P(y=1) at different x values:")
for x in [0, 2, 4, 5, 6, 8, 10]:
    print(f" x = {x} -> p = {predict(x):.4f}")
print(f"Decision boundary at x = {-b0/b1}")
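The coefficients above were chosen by hand; in practice they are estimated by maximizing the likelihood. A minimal sketch of that fit via gradient ascent, on a tiny made-up dataset (real software typically uses iteratively reweighted least squares):

```python
import math

# Toy data (assumed): outcome tends to flip from 0 to 1 around x = 4.5
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

sigmoid = lambda z: 1 / (1 + math.exp(-z))
b0, b1 = 0.0, 0.0
lr = 0.01  # small step size keeps the ascent stable
for _ in range(20000):
    # gradient of the log-likelihood: sum of (y - p), weighted by each predictor
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
    b0 += lr * g0
    b1 += lr * g1

# the toy data are symmetric around x = 4.5, so the fitted boundary lands there
print(f"fitted boundary: x = {-b0 / b1:.2f}")
```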
Odds ratios
The odds of an event are p/(1-p). In logistic regression, e^b1 is the odds ratio: the factor by which the odds multiply for each one-unit increase in x. With b1 = 0.8, for example, the odds ratio is e^0.8 ≈ 2.23, so the odds more than double per unit increase. This is the natural scale for interpreting logistic coefficients.
Scheme
; Odds ratios
; If b1 = 0.8, then odds ratio = e^0.8
(define b1 0.8)
(define odds-ratio (exp b1))
(display "b1 = ") (display b1) (newline)
(display "Odds ratio = e^b1 = ") (display odds-ratio) (newline)
(display "Interpretation: each unit increase in x") (newline)
(display " multiplies the odds by ") (display odds-ratio) (newline)
; Example: if baseline odds = 1:3 (p=0.25)
(define baseline-odds (/ 1 3))
(display "Baseline odds: ") (display (exact->inexact baseline-odds)) (newline)
(define new-odds (* baseline-odds odds-ratio))
(display "After x+1: ") (display new-odds) (newline)
; Convert odds to probability
(define (odds->prob odds) (/ odds (+ 1 odds)))
(display "Baseline prob: ") (display (odds->prob baseline-odds)) (newline)
(display "New prob: ") (display (odds->prob new-odds))
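Python

The same odds-ratio arithmetic as the Scheme block above:

```python
import math

# Odds ratio for b1 = 0.8, mirroring the Scheme example
b1 = 0.8
odds_ratio = math.exp(b1)
odds_to_prob = lambda odds: odds / (1 + odds)

baseline_odds = 1 / 3                 # 1:3 odds, i.e. p = 0.25
new_odds = baseline_odds * odds_ratio # odds after a one-unit increase in x
print(f"odds ratio = {odds_ratio:.4f}")
print(f"baseline:  odds = {baseline_odds:.4f}, p = {odds_to_prob(baseline_odds):.4f}")
print(f"after x+1: odds = {new_odds:.4f}, p = {odds_to_prob(new_odds):.4f}")
```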