Diff
Chapter 4 ยท Part II begins ยท Calcagno 2009, O'Hearn 2019, Arjovsky 2019, Rubin 1915
Chapter 3 showed the same abstract schema under eight names. This chapter defines the primitive: snapshot before, snapshot after, XOR. What flipped is figure; what held is ground. Diff is the contrast that abduction operates over.
The mechanic
A mechanic taps the alternator and the engine stalls. Two hypotheses fire: test the battery, test the voltage regulator. Nobody taught her a hypothesis-generation algorithm. The shape of the failure named the next experiment.
Where did the hypotheses come from? She compared two states: engine running, engine stalled. She noted what changed (alternator tapped) and what stayed the same (fuel, spark, coolant). The change pointed at the electrical subsystem. The hypotheses followed from the diff.
Diff is computable. The mechanic performed it in her head. OBD-II performs it in silicon. Facebook Infer performs it on code. The representation changes; the operation stays the same.
The primitive
Diff is the substrate of abduction, not abduction entire. It produces the contrast that abduction operates over:
Given expected state E, observed state O, and background frame B, partition what differs from what remains invariant.
Abduction then uses that partition to generate candidate causes H: minimal revisions to B that would make O unsurprising. The partition itself is the primitive. All eight names in Chapter 3 instantiate it. The simplest encoding: take two snapshots of a system's state, one before an event and one after. The comparison partitions state into two sets:
Figure — what changed. Variables whose values differ between snapshots. In Gestalt terms, what pops out against a stable background.
Ground — what held. Variables whose values stayed the same. The context that remained invariant while the figure shifted. In separation logic, the frame that bi-abduction infers.
Formally:
diff(state_before, state_after) → (figure, ground) The partition is symmetric: it doesn't matter which snapshot is “first.” But use is asymmetric. The before-state is the baseline; the after-state is the perturbation. The diff tells you what the perturbation touched.
Caution: diff gives candidates, not causes. A changed variable may be cause, effect, symptom, intervention, or coincident noise. The mechanic's diff puts both alternator tapped (intervention) and engine stalled (effect) in the figure. The diff alone does not know which is which. Separating cause from effect requires a dependency graph or an experiment. The diff just names what changed.
The XOR
The simplest instantiation: state is a set of key-value pairs. The figure is every key whose value changed. The ground is every key whose value didn't.
Apply it to the mechanic's engine:
Two figure entries: the alternator (input) and the engine state (output). Four ground entries: battery, fuel pump, spark, coolant, all unchanged.
The hypotheses follow from the figure. The alternator was tapped, the engine stalled. What connects them? The electrical subsystem. Battery supplies power; voltage regulator manages alternator output. Both sit upstream of the engine and both fit a tap-induced failure. The ground tells the mechanic what to deprioritize: fuel-pump and spark are unchanged (though that doesn't eliminate them unless the measurement set is complete).
This is the contrast primitive. Hypothesis generation, ranking, and testing build on top of it.
Degrees of freedom
The diff above is the minimal case: one before, one after, one partition. The variants in the literature add degrees of freedom to this primitive:
| Variant | Inputs | Outputs | What it adds |
|---|---|---|---|
| Unary diff | One before, one after | Figure + ground | The primitive. This chapter. |
| Bi-abduction | Partial before, partial after | Inferred frame + inferred anti-frame | Infers the ground autonomously. Ch 5. |
| Incorrectness | One before, one after | Under-approximation of bugs | Flip polarity: attend to failure, not success. |
| Tri-abduction | Fork: shared start, two branches | Causal edge (what the branch changed) | Diff across branches, not just time. Ch 6. |
Each step adds an operand. One snapshot pair gives one frame. Two pairs (actual and counterfactual) give one causal edge. N pairs across N branches give a typed subgraph. The pattern stays diff; the arity grows.
Three witnesses
Three systems, three decades, three fields. Each encodes the diff. They are rarely presented as instances of one operation.
OBD-II (1996) — hardcoded diff
OBD-II reads sensor states, diffs against expected values, and generates fault codes. (Real OBD-II adds thresholds, monitors, and enable conditions; "hardcoded diff" is a simplification of the core logic.) Vehicles have run this since 1996.
But OBD-II is hardcoded. The fault tree is hand-authored, the hypotheses enumerated in advance. An engineer decided which deviations map to which codes. If a failure mode wasn't anticipated, nothing fires. The primitive works; it just runs on a fixed table.
The limitation is the table. Every hypothesis must be written down before the system ships. No table entry, no hypothesis. The diff is there; the inference is manual.
Facebook Infer (2009) — automated diff
Infer (Calcagno et al. 2009) runs bi-abduction on millions of lines of production code. Given a function's precondition and postcondition, it infers the frame (heap the function didn't touch) and the anti-frame (heap that must exist for the function to be safe).
The figure is what the function modifies; the ground is what it leaves alone. Infer doesn't require the programmer to specify the ground. That's the “bi”: abduction in both directions, computing what must hold before and what the function preserves.
Infer moved the diff from a hand-authored table (OBD-II) to an automated inference engine. Same primitive, higher automation.
IRM (2019) — learned diff
Invariant Risk Minimization (Arjovsky et al. 2019) uses environment variation to force figure/ground separation. Train a model across multiple environments. Features that predict the outcome in all environments are invariant (ground). Features that predict in some but not others are spurious (figure).
IRM diffs across environments rather than across time. Instead of comparing two snapshots of one system, it compares the same learning task under different conditions. Invariant features are ground; environment-specific features are figure. (Note the inversion: in IRM, the invariant features are the causal signal you want. Calling them "ground" follows the changed/unchanged definition but reverses the ordinary sense of "figure = thing of interest." The role assignment holds; the valence flips.)
| System | Year | Diff over | Figure | Ground |
|---|---|---|---|---|
| OBD-II | 1996 | Expected vs. observed sensor values | Fault codes (hand-authored) | Normal operating range (hand-authored) |
| Infer | 2009 | Precondition vs. postcondition | Modified heap (automated) | Frame: untouched heap (inferred) |
| IRM | 2019 | Environment A vs. environment B | Spurious features (learned) | Invariant features (learned) |
Three encodings. Handcoded table (OBD-II), automated inference (Infer), learned separation (IRM). IRM is the loosest analogue; it seeks invariant representations rather than literally diffing states. But the structural role holds: partition observations into signal and noise.
Code: the full loop
Combine the primitive with hypothesis generation. Given a diff, produce candidate explanations by examining what the figure touches in a dependency graph.
Five hypotheses from a two-variable diff. The dependency graph constrains which components are candidates. The ground (fuel pump, spark, coolant) eliminates their upstream components. Battery, voltage regulator, ECU, alternator belt, alternator bearing: the five things to check.
Notice what the code does not do. It does not test the hypotheses, rank them, or estimate their probability. It generates them. The diff is a hypothesis-generation primitive. Testing is induction. Ranking is economy of research (ch 7). The diff names the candidates.
What breaks
The diff requires you to know what to observe. Every variable was chosen by someone: the mechanic who checked six gauges, the engineer who wired six sensors, the programmer who logged six fields. If the relevant state lives in a variable nobody snapshotted, the diff misses it.
Return to the mechanic. She checked six variables but not the serpentine belt tension. If the real cause is a worn belt that slipped when she tapped the alternator, the diff cannot find it. The belt isn't in either snapshot. Not in the figure, not in the ground. Absent.
This is the structural limitation. The unary diff partitions observed state into figure and ground. It cannot reason about unobserved state. The hypothesis space is bounded by what you chose to measure.
OBD-II shows this concretely: if a failure involves a sensor the ECU doesn't monitor, no code fires. Engineers add more sensors, but you can never instrument everything. Some state will always be unmeasured.
Bi-abduction addresses this by inferring the frame: the state the operation must not have touched for the result to be valid. It reasons backward from the postcondition, including state never explicitly observed. The diff goes from "compare what you measured" to "infer what you must have missed."
That is Chapter 5.
Sources
| Rubin 1915 | Synsoplevede Figurer. Figure-ground segregation in visual perception. The perceptual ancestor of diff. |
| SAE J1962 (1996) | OBD-II standard. Diagnostic connector, protocol, and trouble code conventions. The diff, hardcoded since 1996. |
| Calcagno et al. 2009 | "Compositional Shape Analysis by Means of Bi-Abduction." POPL. The frame inference engine behind Facebook Infer. Automated figure/ground from separation logic. |
| Arjovsky et al. 2019 | "Invariant Risk Minimization." Environment variation as the lever for figure/ground separation. Learned diff across training conditions. |
| O'Hearn 2019 | "Incorrectness Logic." POPL. Flip the polarity of the diff: under-approximate bugs instead of over-approximating correctness. |
| Ernst et al. 2001 | "Dynamically Discovering Likely Program Invariants to Support Program Evolution." Daikon: infer invariants (ground) from observed execution traces. More samples, sharper ground. |
Neighbors
- Methodeutics
- Ch 2: Security and Uberty — the tradeoff that makes abduction fertile
- Ch 7: Economy of Research — selecting among the hypotheses this chapter generates
- Abduction — the blog post this chapter formalizes
External
- Facebook Infer — bi-abduction in production
OBD-II (Wikipedia)
- Arjovsky et al. 2019 — Invariant Risk Minimization (arXiv)