Compilation
Abelson & Sussman · 1996 · SICP §5.5
A compiler translates Scheme to register machine code ahead of time. The compiled code is faster because decisions made at compile time don’t repeat at runtime. The compiler itself is a Scheme program.
Structure of the compiler
The compiler mirrors the evaluator’s structure: dispatch on expression type, generate instructions for each case. But instead of executing, it returns a sequence of instructions to be executed later. The instruction sequences carry metadata: which registers they need and which they modify.
Compiling expressions
compile dispatches on expression type, just like eval. Each clause returns an instruction sequence. compile-self-evaluating loads a constant into val. compile-variable emits a lookup instruction. The target parameter says which register to put the result in — not always val.
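The dispatch-and-target idea can be sketched in a few lines. This is a toy model, not SICP's code: numbers stand in for self-evaluating expressions, strings for variable names, and instructions are plain tuples.

```python
# Toy compile dispatch: each clause returns an instruction sequence
# that leaves the expression's value in the target register.
def compile_expr(exp, target):
    if isinstance(exp, (int, float)):            # self-evaluating: load a constant
        return [("assign", target, ("const", exp))]
    if isinstance(exp, str):                     # variable: emit a lookup instruction
        return [("assign", target, ("lookup", exp, "env"))]
    raise ValueError(f"unknown expression type: {exp!r}")

print(compile_expr(42, "val"))    # [('assign', 'val', ('const', 42))]
print(compile_expr("x", "arg1"))  # target need not be val
```

Note that no evaluation happens here; the result is data describing work to be done later.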
Compiling combinations
Compiling (f a b) means: compile the operator, compile each operand, then emit an apply. The compiler can see at compile time how many arguments there are and whether saves are needed — the evaluator figures this out fresh every time. This is where compilation wins.
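A sketch of that recipe, extending the toy dispatch above (same invented tuple instructions; a combination is written as a Python list). The loop over operands runs at compile time, which is exactly the win the text describes: the argument count is fixed in the emitted code, not rediscovered at runtime.

```python
# Toy compilation of a combination: operator into proc, operands
# accumulated into argl, then a single apply.
def compile_expr(exp, target):
    if isinstance(exp, (int, float)):
        return [("assign", target, ("const", exp))]
    if isinstance(exp, str):
        return [("assign", target, ("lookup", exp, "env"))]
    op, *args = exp                              # combination, e.g. ["f", 1, "x"]
    code = compile_expr(op, "proc")              # operator result goes in proc
    code += [("assign", "argl", ("const", []))]
    for a in args:                               # unrolled at compile time
        code += compile_expr(a, "val")
        code += [("perform", ("append-to-argl", "val"))]
    code += [("apply", "proc", "argl", target)]
    return code

print(compile_expr(["f", 1, "x"], "val")[-1])   # ('apply', 'proc', 'argl', 'val')
```

SICP's real compiler is more careful here (it evaluates operands with `preserving` and builds `argl` back to front); this keeps only the shape.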
Lexical addressing
The compiler knows the structure of the environment at compile time. Instead of searching by name at runtime, it can emit a lexical address: (frame-number, offset). Frame 0 is the current frame, frame 1 is the enclosing one. This replaces a linear search with a direct index.
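Both halves of the trick fit in a few lines (names here are mine, not SICP's): the compiler scans a static environment of *names* once, and the emitted code then indexes the runtime environment of *values* directly.

```python
# Compile time: turn a name into a lexical address by scanning the
# static environment of names, innermost frame first.
def lexical_address(cenv, name):
    for frame_num, frame in enumerate(cenv):
        if name in frame:
            return (frame_num, frame.index(name))
    return None  # unknown at compile time: fall back to lookup by name

# Runtime: no search, just two indexing operations.
def lexical_lookup(env, frame_num, offset):
    return env[frame_num][offset]

cenv = [["a", "b"], ["x", "y"]]   # names, known when compiling
env  = [[1, 2], [10, 20]]         # values, known only when running
addr = lexical_address(cenv, "y") # (1, 1): frame 1, offset 1
print(lexical_lookup(env, *addr)) # 20
```

The `None` fallback mirrors SICP's treatment of free variables the compiler cannot place.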
Interfacing compiled code with the evaluator
Compiled code and interpreted code must interoperate. The compiler produces register machine instructions that use the same register conventions as the explicit-control evaluator. A compiled procedure can call an interpreted one and vice versa. The entry point protocol — arguments in argl, result in val — is the interface contract.
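The contract can be made concrete with a toy machine (a dict of registers; the tagged-procedure representation is my own framing). Both kinds of procedure read arguments from argl and leave the result in val, so the caller never needs to know which kind it got.

```python
# Toy interface: one apply that works for both procedure kinds,
# because both honor the argl-in, val-out convention.
def apply_procedure(machine, proc):
    if proc[0] == "compiled":
        proc[1](machine)                         # jump to compiled entry point
    else:                                        # ("interp", fn): interpreter stand-in
        machine["val"] = proc[1](*machine["argl"])

def compiled_add(machine):                       # compiled code: reads argl, writes val
    machine["val"] = sum(machine["argl"])

machine = {"argl": [3, 4], "val": None}
apply_procedure(machine, ("compiled", compiled_add))
print(machine["val"])                            # 7
apply_procedure(machine, ("interp", max))
print(machine["val"])                            # 4
```

The real evaluator also threads a continue register for the return linkage; that is elided here.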
Notation reference
| Scheme / SICP | Python | Meaning |
|---|---|---|
| (compile exp target linkage) | compile(ast) → bytecode | Translate expression to instructions |
| target register | destination variable | Where to put the result |
| linkage (next / return / goto) | fall-through / return / jump | What to do after this code |
| lexical address (d . o) | LOAD_FAST / LOAD_DEREF | Compile-time variable position |
| preserving | register allocation | Save only if needed (optimization) |
| compiled-apply / interp-apply | CALL_FUNCTION | Unified calling convention |
Translation notes
The SICP compiler targets the register machine from Ch.19; CPython’s compiler targets CPython VM bytecode. The structural parallel is exact: compile_self_evaluating emits LOAD_CONST, compile_variable emits LOAD_NAME or LOAD_FAST, and compile_application emits CALL_FUNCTION (renamed CALL in Python 3.11). Lexical addressing in SICP is what CPython does with LOAD_FAST: local variables live at offsets known at compile time. The preserving combinator, which saves a register only if the following sequence actually needs it, is the simplest form of register allocation.
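The parallel is easy to inspect with the standard-library dis module. In the function below, x is a local, so it compiles to LOAD_FAST (a lexical address), 1 compiles to LOAD_CONST, and g, free at compile time, falls back to LOAD_GLOBAL, a lookup by name; the call opcode's exact name varies by Python version, so the check below sticks to the stable ones.

```python
import dis

def f(x):
    return g(x, 1)   # g is deliberately unresolved: a free variable

# Disassembly only reads the code object; g is never called.
names = [ins.opname for ins in dis.get_instructions(f)]
print(names)
assert "LOAD_FAST" in names    # lexical address for the local x
assert "LOAD_CONST" in names   # self-evaluating 1
```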
Neighbors
Other reading pages
- SICP Ch.14 — the metacircular evaluator (source language)
- SICP Ch.21 — the explicit-control evaluator (target machine)
Foundations (Wikipedia)