
Validators

Chapter 13 · Four validator types, one interface

Chapter 12's prereg checklist catches self-deception: double-dipping, post hoc framing, data leakage. But claims come in different kinds, and each kind needs a different validator.

  • Empirical validators run the experiment, measure the outcome. They need perturbation access and fresh data.
  • Structural validators check that the claim's logical structure holds: consistency, completeness, independence. No new data needed.
  • Proof validators derive consequences and check if they hold. Schoenfeld's kill conditions: when induction fails, the failure mode names the next technique.
  • Counterexample validators search for cases that violate the claim. Lakatos's guilty lemma: the counterexample reveals which step was wrong.

This chapter builds out the proof and counterexample validators. Structural validators use the kill condition and symmetry mismatch tables below. Empirical validators belong to chapters 8–9 (evidence trajectories).

The wild goose chase

Schoenfeld (1985) filmed undergraduates and professors solving the same problems. The footage revealed a pattern so consistent it deserved a name: the wild goose chase. A student picks a technique (usually induction) and applies it. It fails. The student applies it again, harder. Different notation, same strategy. The entire session burns on a single approach. The clock runs out with nothing proved.

The professors knew the same techniques. The difference was control: recognizing when a technique is failing, diagnosing why, and switching to whatever the failure mode points to. Teaching control strategies improved problem-solving more than teaching new techniques.

But Schoenfeld never specified what to switch to. He said "monitor and assess," leaving the assessment to intuition. A professor has seen enough failures to pattern-match the mode. An autonomous system needs something explicit: a lookup table mapping failure modes to next techniques.

The lookup table Schoenfeld left blank

The proof manual operationalizes Schoenfeld's control into a five-step procedure:

  1. Classify the claim type. Existence, bound, construction, uniqueness, impossibility, equivalence. Each has a different proof shape. Existence needs a witness (Σ type). Impossibility needs a function into False (Π type). The classification picks the row.
  2. Classify the domain. Discrete, continuous, algebraic, geometric, probabilistic, number-theoretic, combinatorial. The domain picks the column.
  3. Grid lookup. Row-column intersection gives candidate techniques. Scan the whole row, not just your first instinct. The grid has over 100 entries.
  4. Check kill conditions. Cross off the dead ones before you start. A kill condition is a structural property that makes a technique provably unable to succeed. Three minutes up front saves three hours of failed attempts.
  5. Check symmetry mismatches. A technique that assumes a symmetry your problem lacks produces a valid-looking argument with a hidden gap. These fail silently. Check explicitly.

Try the survivors. When one dies, the failure mode names the next.
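Steps 1–3 can be sketched as a dictionary lookup. This is a minimal sketch with a hand-coded fragment of the grid; the claim types, domains, and technique names follow the chapter's taxonomy, but the cell contents here are illustrative, not copied from the real grid:

```python
# A fragment of the claim-type x domain grid. Rows are claim types,
# columns are domains; each cell lists candidate techniques to scan.
GRID = {
    ("existence", "combinatorial"): ["pigeonhole", "probabilistic method"],
    ("existence", "continuous"):    ["IVT / fixed point"],
    ("bound", "discrete"):          ["induction", "potential method"],
    ("impossibility", "discrete"):  ["invariant", "diagonalization"],
}

def lookup(claim_type: str, domain: str) -> list[str]:
    """Steps 1-3: classify the claim (row), classify the domain (column),
    read off the whole cell -- not just the first entry."""
    return GRID.get((claim_type, domain), [])

candidates = lookup("bound", "discrete")
# candidates == ["induction", "potential method"]
```

Steps 4 and 5 then filter this candidate list before any proof attempt begins.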


Kill conditions: pre-flight and post-mortem in one table

Kill conditions serve two validators. The structural validator uses them as a pre-flight check: does the claim's structure satisfy the technique's preconditions? The proof validator uses them as a post-mortem: the shape of the failure identifies which technique to try next.

Technique | Kill condition | Escalate to
Induction | Residual loses structure — the inductive step can't close because the quantity you're tracking has no monotone relationship between n and n+1 | Potential method
Contradiction | ¬P doesn't interact with the structure — negating the conclusion yields nothing usable | Direct construction
Greedy | Local progress destroys substructure — each greedy step improves the objective but corrupts an invariant you need later | Potential method
Pigeonhole | Same-size sets, or you need a witness, not just existence | Probabilistic method
Probabilistic method | E[X] < 1 (expectation too low), or dependencies between events, or you need an explicit witness | Second moment → LLL → derandomization
Spectral | Semiring without eigenvalues — the algebraic structure doesn't support spectral decomposition | Embed-solve-pullback
IVT / fixed point | Discrete or non-compact domain | Sperner, simplicial Brouwer
Invariant | No separating invariant visible — the before and after states can't be distinguished by any conserved quantity you can find | Reduction
Potential method | No monotone potential — you can't find a quantity that decreases on every step | Game equilibrium
Diagonalization | Uncountable candidates — the diagonal construction requires enumeration | Reduction

Read the table vertically. The kill at step N names the technique at step N+1. Induction fails because the residual loses structure; you escalate to a potential method that drops per-step monotonicity. The potential method fails because no monotone potential exists; you escalate to a game equilibrium where the potential is the equilibrium itself. Each failure is a signpost.

Same mechanism as Chapter 10, different edge of the triangle. There, kill conditions generated edges in the hypothesis graph: a monotone trend test fires but curvature is indeterminate, so the failure mode names the next hypothesis. Here, kill conditions generate edges in the proof graph: induction fails because the residual is non-monotone, so the failure mode names the next technique. Chapter 10 navigated the abductive edge (Observation → Theory). This chapter navigates the deductive edge (Theory → Experiment).


Symmetry mismatches fail silently

Kill conditions are loud: the technique fails and you know it. Symmetry mismatches are the structural validator's other job, catching silent failures. The technique appears to succeed, producing a valid-looking argument, but the argument has a hidden gap because it assumed a symmetry the problem lacks.

You assume | It's actually | What dies
Undirected | Directed | Union-find, spanning trees — they assume you can traverse edges in either direction
Transitive | Non-transitive | Reachability composition — A→B and B→C does not imply A→C
Time-independent | Time-dependent | Static data structures — the answer depends on when you ask
Commutative | Non-commutative | Abelian group tools — AB ≠ BA invalidates half your rewrites
Local | Global | Heuristics, distributed algorithms — local optima are not global optima
Linear | Nonlinear | Superposition, spectral decomposition — f(a+b) ≠ f(a) + f(b) destroys the entire toolkit

Union-find on a directed graph produces an output. The output is wrong, but it looks right: the algorithm didn't crash, it silently treated a directed edge as undirected. Kill conditions help you fail fast. Symmetry mismatches help you avoid failing silently.

In practice: six questions before you start. Is the relation directed? Transitive? Time-varying? Commutative? Local? Linear? Any mismatch with your technique's assumptions is a flag.
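The union-find failure above can be made concrete. A minimal sketch (path-halving union-find; the two-node graph is invented for illustration):

```python
# Union-find answers "connected?" -- and connectivity is symmetric.
# Feed it a directed edge and it silently assumes the reverse edge exists.
parent = {}

def find(x):
    """Find the root of x's component, with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # shortcut toward the root
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

# Directed graph with one edge, A -> B. There is no path from B to A.
union("A", "B")  # union-find has no way to record the direction

same_component = find("B") == find("A")
# same_component is True: the algorithm didn't crash, it answered a
# different (symmetrized) question. "Reachable from B to A" is False.
```

The output looks right, which is exactly why the six-question check has to happen before the technique runs, not after.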


Counterexamples indict lemmas, not theorems

Lakatos (1976) described what happens when the counterexample validator fires: the proof breaks in the middle. In Proofs and Refutations, he traces Euler's formula (V - E + F = 2) through a series of proofs, counterexamples, and repairs. A counterexample doesn't refute the theorem outright; it reveals which lemma was wrong. Lakatos called that lemma the guilty lemma.

The method is forensic. You have a proof that seemed valid. A counterexample violates the claim. Instead of discarding the entire proof, you trace the counterexample through the argument step by step until it hits a lemma that can't accommodate it. That lemma is guilty. The repair is to restrict it (exclude the counterexample from the domain) or replace it (find a stronger lemma that covers the counterexample).
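The forensic trace can be sketched as a scan over the proof's lemmas. A minimal sketch, where each lemma is modeled as a predicate the counterexample must satisfy; the two lemmas are a simplification of the Euler's-formula case, not Lakatos's actual analysis:

```python
def guilty_lemma(lemmas, counterexample):
    """Trace a counterexample through the proof, step by step.
    The first lemma that can't accommodate it is guilty."""
    for name, holds in lemmas:
        if not holds(counterexample):
            return name
    return None  # passes every lemma: the claim itself is in trouble

# Toy proof of V - E + F = 2, with two (heavily simplified) lemmas.
lemmas = [
    ("connected",        lambda p: p["connected"]),
    ("simply_connected", lambda p: p["genus"] == 0),
]

# Lakatos's picture-frame polyhedron: connected, but genus 1,
# and V - E + F = 0, not 2.
picture_frame = {"connected": True, "genus": 1}
print(guilty_lemma(lemmas, picture_frame))  # -> simply_connected
```

The repair then targets the named lemma: restrict it (exclude genus > 0 from the domain) or replace it with a stronger one.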

The counterexample validator and the proof validator are complementary. The kill condition table says: "Induction fails when the residual loses structure." The counterexample validator says: "Here is a specific case where the residual lost structure." Same mechanism, different direction. Structural validators check preconditions up front; counterexample validators find the guilty step after the proof breaks.

An autonomous system needs all four. Structural validators are pre-flight. Proof validators drive the escalation chain. Counterexample validators are post-crash analysis: the failure trace identifies the guilty step, which names the next technique. Empirical validators close the loop by running the experiment.


Every technique is a tombstone for its parent

The kill condition table is synchronic: what to try next right now. The lineage is diachronic: how techniques evolved through historical failures.

Exhaustion (Archimedes)
  kill: can't handle infinite processes
  → Limits (Cauchy, Weierstrass)
     kill: need compactness for existence
     → Compactness arguments (Bolzano-Weierstrass)
        kill: need topology beyond R^n
        → General topology

Counting (Euler)
  kill: exact counts intractable
  → Generating functions
     kill: coefficients hard to extract
     → Analytic combinatorics (Flajolet)
        kill: need asymptotics not exact
        → Probabilistic method (Erdős, Alon & Spencer)

Diagonalization (Cantor)
  kill: need self-reference formalized
  → Incompleteness (Gödel)
     kill: need computation model
     → Undecidability (Turing)
        kill: need quantitative hardness
        → Complexity lower bounds (Cook, Karp)

Each arrow is a kill condition. Archimedes' method of exhaustion died on infinite processes; Cauchy and Weierstrass invented limits. Compactness arguments died on spaces beyond R^n; general topology was born. The history of proof techniques is a tree of escalations, each node created because its parent died.

A student who knows only induction will never reach for a potential method. A student who knows that potential methods exist because induction fails when the residual loses structure will reach for the right tool. The lineage is the kill condition table extended through time.


The proof manual is the deductive edge

Chapter 10 built the hypothesis graph. Observations produce hypotheses (abduction). Hypotheses predict observables (deduction). Experiments produce observations (induction). Kill conditions generate the edges: when hypothesis A fails, the failure mode names hypothesis B.

The proof manual occupies the deductive edge of that triangle. You have a hypothesis; you derive consequences and test them. The derivation step, Theory → Experiment, requires choosing a proof technique. The kill condition table tells you which technique to choose and what to do when it fails.

Edge | Mode | Chapter | Kill conditions generate
Observation → Theory | Abduction | Ch 10 | Next hypothesis from experimental failure
Theory → Experiment | Deduction | Ch 13 | Next technique from proof failure
Experiment → Observation | Induction | Ch 8–9 | Evidence trajectory classification

The parallel is exact. In chapter 10, a monotone trend test fires but curvature is indeterminate; "monotone but curvature unknown" names the next hypothesis: test for convexity. In chapter 13, induction fails because the residual loses structure; "non-monotone residual" names the next technique: potential method. Hypothesis graph edges come from experimental failures. Proof graph edges come from technique failures. Both obey the same principle: the shape of the failure names the next step.


Kill conditions are type errors

Every proof decomposes into compositions of six type constructors from the Calculus of Inductive Constructions:

Constructor | What it proves
Π (dependent function) | ∀, implication
Σ (dependent pair) | ∃, witness
Inductive type | Recursion, cases
Match | Case analysis, induction
Quotient | Equivalence
Truncation | Non-constructive existence

The grid's rows map to these constructors: existence = Σ, impossibility = Π→False, construction = Σ with computability. Kill conditions are type errors: your proof term doesn't inhabit the target type. Lean's type checker rejects proofs for the same reasons the kill conditions predict. The kill condition table is a human-readable compilation of type errors.
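In Lean 4 notation, the row-to-constructor mapping looks like this (a sketch of the two simplest rows, with toy claims, not examples from the grid):

```lean
-- Existence = Σ: the proof term is a witness paired with evidence.
example : ∃ n : Nat, n + n = 4 := ⟨2, rfl⟩

-- Impossibility = Π into False: the proof consumes any alleged witness.
example : ¬ ∃ n : Nat, n < n := fun ⟨n, h⟩ => Nat.lt_irrefl n h
```

A kill condition fires as a type error here: offer ⟨3, rfl⟩ for the first claim and the checker rejects the proof term, because it doesn't inhabit the target type.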


When nothing works, change the domain

The kill condition table escalates within a domain. Embed-solve-pullback escalates across domains.

Source → Target | What you gain | What you risk
Combinatorics → 3-SAT | Exponential search | Clause structure artificial
Discrete → Geometry | Convexity, separation | Rounding loses feasibility
Nonlinear → Linear (LP/SDP) | Poly-time solvers | Integrality gap
Time domain → Frequency | Convolution → multiplication | Localization lost
Graph → Algebra (spectral) | Eigenvalue bounds | Semiring has no spectral theory

The risk is always the same: the pullback doesn't preserve the constraints. Embed a discrete problem into continuous geometry, solve with convexity, pull back. Rounding a continuous optimum to a discrete one can violate feasibility. Embedding buys power. Pullback costs fidelity. This tradeoff is itself a kill condition check: before embedding, ask whether the pullback preserves what matters.
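The pullback check can be sketched for the discrete → continuous row. A minimal example (the knapsack-style constraint and the fractional point are invented for illustration):

```python
# Discrete problem: pick x in {0,1}^3 with total weight <= 4.
# (Objective omitted; only feasibility matters for the pullback.)
weights, capacity = [3, 2, 2], 4

# Embed: relax x to [0,1]^3. A fractional point can sit exactly on
# the boundary of the relaxed feasible region.
x_frac = [1.0, 0.5, 0.0]   # weight = 3 + 1 = 4: feasible when relaxed

# Pull back: round each coordinate to {0,1}.
x_round = [1 if xi >= 0.5 else 0 for xi in x_frac]   # -> [1, 1, 0]

def feasible(x):
    """The pullback check: does the rounded point still satisfy
    the original discrete constraint?"""
    return sum(w * xi for w, xi in zip(weights, x)) <= capacity

print(feasible(x_round))  # weight = 3 + 2 = 5 > 4: rounding broke it
```

The embedding was sound and the continuous solve was sound; only the pullback failed. That is why the check belongs before the embedding, not after the rounding.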


One interface, four validators

All four validator types share one interface: accept a claim, return a verdict (alive, dead, or inconclusive) with a trace that names the next step. The code below implements the proof validator's core (grid lookup with kill conditions and escalation targets) as a concrete instance.

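A minimal sketch of that core in Python. The claim-property names and the three-entry table are illustrative stand-ins; the full grid lives in the YAML:

```python
# Each technique carries a kill condition (a predicate on the claim's
# structural properties) and the technique the table escalates to.
KILL = {
    "induction":        (lambda c: not c.get("monotone_residual", True),
                         "potential method"),
    "potential method": (lambda c: not c.get("monotone_potential", True),
                         "game equilibrium"),
    "pigeonhole":       (lambda c: c.get("needs_witness", False),
                         "probabilistic method"),
}

def validate(technique, claim):
    """One interface: accept a claim, return a verdict with a trace
    that names the next step."""
    if technique not in KILL:
        return ("alive", "no known kill condition: frontier of the field")
    killed, next_technique = KILL[technique]
    if killed(claim):
        return ("dead", f"kill condition fired: escalate to {next_technique}")
    return ("inconclusive", "preconditions hold: attempt the proof")

def escalation_chain(technique, claim):
    """Follow kill conditions until a technique survives or the chain
    leaves the table."""
    chain = [technique]
    while technique in KILL:
        killed, next_technique = KILL[technique]
        if not killed(claim):
            break
        technique = next_technique
        chain.append(technique)
    return chain

print(escalation_chain("induction", {"monotone_residual": False,
                                     "monotone_potential": False}))
# -> ['induction', 'potential method', 'game equilibrium']
```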

escalation_chain makes the linked-list structure explicit. Start with any technique. When it fails, the kill condition names the next. Follow the chain until you reach a technique with no known kill condition: that's the frontier of the field. The full YAML grid has over a hundred techniques. The code above is a minimal implementation.


What the validators don't close

The four validators tell you what to try. They do not tell you how to wire them into a loop that runs without a human. The procedure is:

  1. Observe a surprising fact (abduction — ch 4–6).
  2. Generate a hypothesis (abduction — ch 10).
  3. Select an experiment (economy — ch 7).
  4. Validate the hypothesis — pick the right validator type, run it (this chapter).
  5. Run the experiment, classify the evidence trajectory (induction — ch 8–9).
  6. If the evidence rejects, the failure mode names the next hypothesis. Go to 2.

Each step has been formalized in a previous chapter. The steps are disconnected. Observe → classify → generate → validate → test → repeat: a pipeline. Writing the pipeline down is a different problem from formalizing each step. Chapter 12's checklist protects it from self-deception. Chapter 13's validators populate the deductive edge with four mechanisms sharing one interface. Wiring the validators into a cycle that closes without human intervention is the subject of Chapter 14.


Sources

Schoenfeld 1985 Mathematical Problem Solving. Academic Press. Documents the wild goose chase and the role of metacognitive control in expert problem solving.
Lakatos 1976 Proofs and Refutations. Cambridge University Press. The guilty lemma: counterexamples reveal which step in a proof is wrong.
Peirce 1903 Harvard Lectures on Pragmatism. MSS [R] 475–478, Houghton Library, Harvard. EP 2:133–241. The triangle: abduction (Observation → Theory), deduction (Theory → Experiment), induction (Experiment → Observation).
Alon & Spencer 2016 The Probabilistic Method. 4th ed. Wiley. The escalation chain from first moment through LLL to derandomization.
Lean 4 Theorem Proving in Lean 4: Dependent Type Theory. The six type constructors that generate all proof techniques.
The Proof Manual Blog post and YAML grid. The structured taxonomy of claim types, domains, techniques, and kill conditions.