← back to programming

Data Abstraction

Abelson & Sussman · 1996 · SICP §2.1

Separate how data is used from how it's represented. Constructors and selectors form an abstraction barrier. You can even implement pairs with nothing but functions.

Programs that use rational numbers abstraction barrier: make-rat, numer, denom Rational number operations (+rat, *rat, ...) abstraction barrier: cons, car, cdr Pairs (the underlying representation)

Rational numbers — make-rat, numer, denom

A rational number is a pair: numerator and denominator. The constructor make-rat and selectors numer, denom define the interface. Everything above the barrier uses only these three operations.

Scheme

Abstraction barriers

The rational number layer never touches cons, car, cdr directly. The programs that use rationals never touch numer or denom directly — they use add-rat, mul-rat. Each layer talks only to the layer below it. Change the representation, and only one layer needs updating.

Scheme

What is data? — procedural representation of pairs

The deepest insight in this chapter: you can implement cons, car, and cdr with nothing but lambda. Data isn't a thing stored in memory — it's any set of constructors and selectors that satisfies the contract: (car (cons x y)) = x and (cdr (cons x y)) = y.

Scheme

Interval arithmetic

Another data abstraction: intervals with upper and lower bounds. Same pattern — constructors, selectors, and operations that respect the barrier. The operations propagate uncertainty: the result interval contains all possible values.

Scheme

Notation reference

Scheme Python Meaning
(cons a b)(a, b)Construct a pair
(car p)p[0]First element
(cdr p)p[1]Second element
(gcd a b)math.gcd(a, b)Greatest common divisor
(error "msg")raise ValueError("msg")Signal an error
Neighbors

Other reading pages

Foundations (Wikipedia)

Translation notes

Scheme's cons/car/cdr are primitive; Python uses tuples or lists. The procedural pairs example (section 3) shows that the distinction is superficial — closures suffice. Python's gcd lives in math; Scheme's is typically built in. The interval arithmetic section foreshadows the "dependency problem" that SICP poses as an exercise: algebraically equivalent expressions give different interval widths because repeated variables are treated as independent. That subtlety is omitted here but worth exploring.

Ready for the real thing? Read SICP §2.1.