Symbolic Data
Abelson & Sussman ยท 1996 ยท SICP ยง2.3
Quotation lets programs manipulate symbols as data. Code that writes code. Symbolic differentiation shows the power: mathematical rules become pattern-matching functions. The derivative of a sum is the sum of derivatives — and that sentence is the program.
Quotation
The quote mark ' stops evaluation. 'a is the symbol a, not the value bound to a. symbol? tests whether something is a symbol. eq? tests whether two symbols are the same. This is the foundation: data that looks like code.
Symbolic differentiation
The derivative rules from calculus translate directly into code. Each rule is one clause in a conditional:
- Constant: derivative is 0.
- Variable (with respect to itself): derivative is 1.
- Sum: derivative of a sum is the sum of derivatives.
- Product: apply the product rule.
Sets as unordered lists
A set is a collection with no duplicates. The simplest representation: an unordered list. element-of-set? scans the whole list — O(n). adjoin-set adds only if not present. Simple but slow.
Sets as binary trees
Ordered trees give O(log n) lookup. Each node holds an entry, a left subtree (smaller entries), and a right subtree (larger entries). The tree representation is another example of the abstraction barrier: the set interface stays the same, only the performance changes.
Notation reference
| Scheme | Python | Meaning |
|---|---|---|
| 'x | "x" | Quoted symbol (data, not code) |
| (symbol? x) | isinstance(x, str) | Test if symbol |
| (eq? 'a 'b) | "a" == "b" | Symbol equality |
| '(+ x 3) | ("+", "x", 3) | Quoted expression as data |
| (deriv exp var) | deriv(exp, var) | Symbolic derivative |
| (make-tree e l r) | Tree(e, l, r) | Binary tree node |
Translation notes
Scheme's quotation is fundamental to the language: '(+ 1 2) is a list of three elements. Python has no equivalent — you build symbolic expressions from tuples or strings, which means the boundary between code and data is always explicit. This is the tradeoff: Lisp's homoiconicity makes metaprogramming natural at the cost of a less familiar syntax. Python's syntax is readable at the cost of making code-as-data awkward.
The simplification rules in make-sum and make-product are a preview of algebraic simplification. SICP keeps it minimal (0+x=x, 0*x=0, 1*x=x). A real computer algebra system would need hundreds of such rules.
Neighbors
SICP chapters
- ๐ช Chapter 5 โ Hierarchical Data (the list/tree structures quotation builds on)
- ๐ช Chapter 7 โ Multiple Representations (extending the idea of dispatch on type)
Foundations (Wikipedia)
Homoiconicity โ code and data share the same representation
Symbolic computation โ manipulating mathematical expressions as data
Binary search tree โ the tree-based set representation