A factor model decomposes an asset's return into exposure to a few common risk factors plus idiosyncratic noise. If you can identify the factors, you can separate the risk you're paid for from the risk you can diversify away.
Fama-French three-factor model
CAPM says one factor (the market) explains all expected returns. Fama and French showed two more matter: SMB (small minus big, the size premium) and HML (high minus low book-to-market, the value premium). A stock's expected excess return is β₁MKT + β₂SMB + β₃HML.
Scheme
; Fama-French three-factor model
; R_i - Rf = alpha + beta1*MKT + beta2*SMB + beta3*HML + epsilon
(define (ff3-expected-return rf beta-mkt beta-smb beta-hml
                             mkt-premium smb-premium hml-premium)
  (+ rf
     (* beta-mkt mkt-premium)
     (* beta-smb smb-premium)
     (* beta-hml hml-premium)))
; Historical average premia (annualized, approximate)
(define mkt-prem 0.06) ; 6% market risk premium
(define smb-prem 0.02) ; 2% size premium
(define hml-prem 0.035) ; 3.5% value premium
(define rf 0.04) ; 4% risk-free rate
; Small value stock: high exposure to all three factors
(define small-value (ff3-expected-return rf 1.2 0.8 0.7
                                         mkt-prem smb-prem hml-prem))
(display "Small value expected return: ")
(display (exact->inexact small-value)) (newline)
; Large growth stock: market exposure, negative SMB and HML
(define large-growth (ff3-expected-return rf 1.0 -0.3 -0.4
                                          mkt-prem smb-prem hml-prem))
(display "Large growth expected return: ")
(display (exact->inexact large-growth)) (newline)
; The difference is the factor premium, not alpha
(display "Spread: ")
(display (exact->inexact (- small-value large-growth)))
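The same calculation can be sketched in Python. The function name `ff3_expected_return` and its argument names simply mirror the Scheme code above, and the premia are the same approximate figures rather than estimates from data. Alpha is assumed to be zero.

```python
# Sketch: Fama-French three-factor expected return, mirroring the Scheme version.
def ff3_expected_return(rf, beta_mkt, beta_smb, beta_hml,
                        mkt_premium, smb_premium, hml_premium):
    """Risk-free rate plus each factor loading times its premium (alpha assumed zero)."""
    return (rf
            + beta_mkt * mkt_premium
            + beta_smb * smb_premium
            + beta_hml * hml_premium)

# Approximate annualized premia, same values as the Scheme example
MKT_PREM, SMB_PREM, HML_PREM, RF = 0.06, 0.02, 0.035, 0.04

# Small value stock: positive loadings on all three factors
small_value = ff3_expected_return(RF, 1.2, 0.8, 0.7, MKT_PREM, SMB_PREM, HML_PREM)
# Large growth stock: market exposure, negative SMB and HML loadings
large_growth = ff3_expected_return(RF, 1.0, -0.3, -0.4, MKT_PREM, SMB_PREM, HML_PREM)

print(f"Small value expected return: {small_value:.4f}")
print(f"Large growth expected return: {large_growth:.4f}")
print(f"Spread: {small_value - large_growth:.4f}")
```

With these inputs the small value stock comes out at 15.25% and the large growth stock at 8.00%, a spread of 7.25 percentage points that is entirely compensation for factor exposure.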
Principal component analysis extracts factors from the data itself. Compute the covariance matrix of returns, then eigendecompose it. The eigenvector with the largest eigenvalue usually looks like the market, with same-sign loadings on most stocks; subsequent ones capture sector, size, and other patterns. PCA factors are statistical, not economic: they maximize variance explained, not interpretability.
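As a rough illustration of the first step, the sketch below builds a sample covariance matrix and extracts only the leading eigenvector. It uses power iteration rather than a full eigendecomposition so that it runs on the standard library alone; a linear algebra library would normally return all components at once. The data are synthetic (four hypothetical stocks driven by one common shock), and all function names are illustrative.

```python
import random

def covariance_matrix(rows):
    """Sample covariance of columns; rows[t][i] is asset i's return in period t."""
    n_obs, n_assets = len(rows), len(rows[0])
    means = [sum(row[i] for row in rows) / n_obs for i in range(n_assets)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n_obs - 1)
             for j in range(n_assets)]
            for i in range(n_assets)]

def leading_eigenpair(matrix, iterations=200):
    """Power iteration: returns (largest eigenvalue, unit-length eigenvector)."""
    n = len(matrix)
    vec = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vec = [x / norm for x in product]
    # Rayleigh quotient v'Mv gives the eigenvalue for a unit vector v
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, vec

# Synthetic data (an assumption for illustration): one common shock plus
# independent stock-specific noise.
rng = random.Random(0)
true_betas = [0.8, 1.0, 1.2, 1.5]
returns = []
for _ in range(120):
    common = rng.gauss(0.0, 0.04)
    returns.append([b * common + rng.gauss(0.0, 0.01) for b in true_betas])

cov = covariance_matrix(returns)
eigenvalue, loadings = leading_eigenpair(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace = sum of eigenvalues
explained = eigenvalue / total_variance

print("First PC loadings:", [round(x, 3) for x in loadings])
print(f"Share of variance explained: {explained:.3f}")
```

Because every stock here loads positively on the common shock, the first component has same-sign loadings and absorbs most of the variance, which is the pattern usually read as "the market" in real return data.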
Total variance = systematic variance + idiosyncratic variance, provided the residual is uncorrelated with the factors. Systematic risk comes from factor exposures and can't be diversified away. Idiosyncratic risk is stock-specific noise that shrinks toward zero in a large, well-diversified portfolio. R-squared from the factor regression is the fraction of total variance that is systematic.
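A small worked example may make the decomposition concrete. It assumes a single factor, identical stocks, and idiosyncratic shocks that are uncorrelated across stocks and with the factor; the loading and volatilities are illustrative numbers, not estimates.

```python
# Sketch: variance decomposition for a single-factor model (illustrative numbers).
beta = 1.2            # factor loading (assumed)
factor_vol = 0.18     # annualized factor volatility (assumed)
idio_vol = 0.25       # annualized idiosyncratic volatility (assumed)

systematic_var = beta ** 2 * factor_vol ** 2
idio_var = idio_vol ** 2
total_var = systematic_var + idio_var
r_squared = systematic_var / total_var   # fraction of variance that is systematic

print(f"Single stock R-squared: {r_squared:.3f}")

def equal_weight_portfolio_variance(n_stocks):
    """Variance of an equal-weighted portfolio of n identical stocks.

    The portfolio beta stays at `beta`, so systematic variance is unchanged,
    while uncorrelated idiosyncratic variance averages down as 1/n.
    """
    return systematic_var + idio_var / n_stocks

for n in (1, 10, 100, 1000):
    port_var = equal_weight_portfolio_variance(n)
    share = systematic_var / port_var
    print(f"{n:>5} stocks: volatility {port_var ** 0.5:.3f}, systematic share {share:.3f}")
```

For a single stock with these inputs, only about 43% of the variance is systematic. As the number of holdings grows, the idiosyncratic term fades and portfolio volatility approaches the floor set by factor exposure alone.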
A factor mimicking portfolio is a long-short portfolio whose return tracks a given factor. For SMB: go long small caps and short large caps in equal dollar amounts. By construction the portfolio has unit exposure to the size factor and zero net dollar investment; its market exposure nets to roughly zero only if the two legs have similar market betas. This lets you trade abstract risk factors as concrete portfolios.
Scheme
; Factor mimicking portfolio construction
; SMB: long small-cap basket, short large-cap basket
(define (portfolio-return weights returns)
  (if (null? weights)
      0
      (+ (* (car weights) (car returns))
         (portfolio-return (cdr weights) (cdr returns)))))
; 3 small stocks, 3 large stocks
; Weights: +1/3 each small, -1/3 each large (dollar neutral)
(define smb-weights (list 1/3 1/3 1/3 -1/3 -1/3 -1/3))
; Monthly returns: small stocks outperformed
(define month-returns (list 0.05 0.03 0.04      ; small caps
                            0.01 0.02 0.015))   ; large caps
(define smb-return (portfolio-return smb-weights month-returns))
(display "SMB factor return: ")
(display (exact->inexact smb-return)) (newline)
; Net investment is zero (long-short)
(define net-investment (portfolio-return
                        (list 1 1 1 1 1 1)
                        smb-weights))
(display "Net dollar exposure: ")
(display (exact->inexact net-investment)) (newline)
; HML: long high B/M, short low B/M
(define hml-weights (list 1/3 1/3 1/3 -1/3 -1/3 -1/3))
(define hml-returns (list 0.04 0.035 0.03     ; value stocks
                          0.01 0.015 0.02))   ; growth stocks
(define hml-return (portfolio-return hml-weights hml-returns))
(display "HML factor return: ")
(display (exact->inexact hml-return))
Python
# Factor mimicking portfolio
def portfolio_return(weights, returns):
    return sum(w * r for w, r in zip(weights, returns))
# SMB: long small, short large (dollar neutral)
smb_weights = [1/3, 1/3, 1/3, -1/3, -1/3, -1/3]
month_returns = [0.05, 0.03, 0.04, 0.01, 0.02, 0.015]
smb_ret = portfolio_return(smb_weights, month_returns)
print(f"SMB factor return: {smb_ret:.4f}")
print(f"Net dollar exposure: {sum(smb_weights):.1f}")
# HML: long value, short growth
hml_returns = [0.04, 0.035, 0.03, 0.01, 0.015, 0.02]
hml_weights = [1/3, 1/3, 1/3, -1/3, -1/3, -1/3]
hml_ret = portfolio_return(hml_weights, hml_returns)
print(f"HML factor return: {hml_ret:.4f}")
Running the regression
To estimate factor loadings, regress the asset's excess returns on the factor returns. The betas measure sensitivity to each factor; alpha, the intercept, measures the average return the factors leave unexplained. A positive, statistically significant alpha means the asset outperforms its factor-predicted return, which points to either genuine skill or a missing factor.
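The regression can be sketched with ordinary least squares solved through the normal equations. Everything below is an illustrative assumption: the data are simulated with known alpha and betas so the estimates can be checked, and the helper names `solve_linear_system` and `ols` are hypothetical. The sketch returns point estimates only; judging whether alpha is statistically significant would also need standard errors, which a statistics library would normally report.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def ols(design, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Simulated monthly data with known parameters (assumed values)
rng = random.Random(1)
true_alpha, true_betas = 0.001, (1.1, 0.5, -0.2)
design, excess = [], []
for _ in range(240):
    mkt = rng.gauss(0.005, 0.045)
    smb = rng.gauss(0.002, 0.03)
    hml = rng.gauss(0.003, 0.03)
    noise = rng.gauss(0.0, 0.005)
    design.append([1.0, mkt, smb, hml])   # leading 1.0 is the intercept (alpha)
    excess.append(true_alpha + true_betas[0] * mkt + true_betas[1] * smb
                  + true_betas[2] * hml + noise)

alpha, b_mkt, b_smb, b_hml = ols(design, excess)
print(f"alpha={alpha:.4f}  MKT={b_mkt:.3f}  SMB={b_smb:.3f}  HML={b_hml:.3f}")
```

The estimated loadings should land close to the values used to simulate the data, with the gap shrinking as the sample grows or the noise falls.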