The Hypothesis Graph

Chapter 10 · Kill conditions generate edges

A monotone trend test fires but curvature is indeterminate. Is the system decelerating or drifting? You can’t tell from the current data. But you know exactly what to run next: a longer experiment that resolves curvature. The failure mode named the next hypothesis.

Chapter 9 classified trajectories into four bins: convergent, divergent, oscillatory, aperiodic. Given a trajectory, you can assign a label. But a label is a noun, not a verb. It tells you what the system did, not what to do next.

After Chapter 9, you look at a trajectory and say "oscillatory." After this chapter, you say "oscillatory — test the interface between the two subsystems that are fighting." The label becomes an action. The mechanism is the kill condition.


Failure modes are constructive

The failure mode of a test names the next experiment.

Every test in the classification tree either fires (assigns a label) or misfires (the data is ambiguous). When a test fires, you get a classification. When it misfires, you get something more valuable: a specific statement about what the data cannot resolve. That statement is a hypothesis pointing to the experiment that would resolve it.

The kill condition is a decision point where the test lacks the information to proceed. What the test needed and didn't have specifies the next measurement. The failure mode builds the graph.
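
One way to make that concrete (the types and field names are illustrative, not taken from the original): a test result carries either a label or a record of what was missing and the experiment that would supply it.

Python

# Illustrative types: a test either fires (label) or misfires (edge).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Edge:
    """A hypothesis: what the data could not resolve, and the experiment that would resolve it."""
    missing: str       # what the test needed and did not have
    experiment: str    # the next measurement

@dataclass
class TestResult:
    label: Optional[str] = None   # set when the test fires
    edge: Optional[Edge] = None   # set when the test misfires

# Example: the opening misfire of this chapter.
result = TestResult(edge=Edge(
    missing="samples to distinguish deceleration from drift",
    experiment="run a longer experiment",
))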

The kill-condition decision tree

The classifier from Chapter 9, recast as a decision tree. Each branch has a characteristic failure mode, catalogued in the misfires table below.

  1. Test for monotone trend. Is the trajectory consistently increasing or decreasing? If no trend, skip to step 3 (periodicity).
  2. If monotone, test curvature. Is the rate of change decelerating (convergent) or constant/accelerating (divergent)?
  3. Test for spectral peaks. Does the frequency spectrum have a narrow peak? If yes: oscillatory.
  4. Test for aperiodic structure. Is there structure that isn't periodic? If yes: aperiodic.
  5. Nothing triggered. Null.

Five tests. Each test that fires produces a label. Each test that misfires produces a hypothesis: an edge pointing to the experiment that would resolve the ambiguity.

Misfires are edges

Three examples of misfires and the edges they generate.

Misfire: Monotone trend detected, curvature indeterminate
  What the test needed: more samples to distinguish deceleration from drift
  Edge (next experiment): run a longer experiment

Misfire: Spectral peak detected but broad
  What the test needed: resolution to distinguish a noisy cycle from colored noise
  Edge (next experiment): test at a different frequency / longer window

Misfire: Nothing triggered
  What the test needed: any detectable structure at all
  Edge (next experiment): test a different perturbation site

The first misfire is the opening example. The trend test fires: the trajectory is monotone. But the curvature test can't resolve whether the rate is decelerating (heading toward a fixed point) or constant (heading toward infinity). Both look the same in a short window. The failure mode "insufficient samples to distinguish deceleration from drift" names the remedy: collect more data. That edge points to a specific experiment with a specific expected outcome.

The second misfire: the spectral test detects a peak, but the peak is broad. A narrow peak means a clean oscillation with a definite period. A broad peak means either a noisy cycle (signal contaminated) or colored noise (no true period, just autocorrelation). The failure mode names two candidate hypotheses and the experiment that discriminates: sample at a different frequency, or extend the window until the peak sharpens or dissolves.

The third misfire: nothing triggered. No monotone trend, no spectral peak, no aperiodic structure. Either the system has no response to this perturbation (dead end) or you perturbed the wrong node. The edge points to a different perturbation site. It says "try elsewhere" without specifying where, but that is still more than "fail to reject."


Successes generate edges too

A successful classification also generates hypotheses. Each label implies a structural claim about the system, and that claim has consequences.

Classification: Convergent
  What it tells you: something absorbed the perturbation
  What to try next: test a different node; this one is stabilized

Classification: Divergent
  What it tells you: single point of failure
  What to try next: test what depends on it; find the blast radius

Classification: Oscillatory
  What it tells you: two subsystems are fighting
  What to try next: test the interface between them

Classification: Aperiodic
  What it tells you: input exceeds the architecture’s capacity for structured response
  What to try next: decompose differently; the current partitioning is wrong

Convergent: something compensated. The perturbation was absorbed. The next question is not "test this node harder" but "test a different node." The absorbing mechanism works; the interesting structure is elsewhere. The label redirects.

Divergent: this node is critical and nothing compensated. What depends on it? If it fails, what else fails? The edge points downstream.

Oscillatory: two constraints are in conflict. The system alternates between satisfying one and violating the other. The edge points to the coupling point between them, not to either subsystem individually.

Aperiodic: the response has structure but no repeating pattern. The current decomposition can't resolve it into periodic components. The edge points to the analyst's modeling choices, not to the system itself.
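
As data, the whole table fits in a small lookup that a graph-building loop can consult. A minimal sketch; the dictionary and its key names are illustrative, with the wording taken from the table above.

Python

# Successes generate edges too: each label implies a next experiment.
# Illustrative mapping; wording follows the table above.
SUCCESS_EDGES = {
    "convergent":  "test a different node; this one is stabilized",
    "divergent":   "test what depends on this node; find the blast radius",
    "oscillatory": "test the interface between the two fighting subsystems",
    "aperiodic":   "decompose the system differently; the current partitioning is wrong",
}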


The hypothesis graph

Apply the kill-condition tree iteratively and it builds a graph. Each node is an observation (experiment + classification). Each edge is a hypothesis (generated by the classification or misfire, pointing to the next experiment).

The algorithm:

  1. Pick a frontier node. (Choose what to test next.)
  2. Perturb the system. (Run the experiment.)
  3. Classify the response. (Run the kill-condition tree.)
  4. Generate edges. (The classification or misfire names the open questions.)
  5. Repeat until the frontier stops expanding.

This loop has been implicit since Chapter 1. Abduction, deduction, induction are phases of one cycle. Abduction generates the hypothesis (step 4). Deduction derives a prediction ("if this is oscillatory, testing the interface should show the coupling"). Induction tests it (step 2). The kill-condition tree closes the loop: observation produces hypothesis produces experiment produces observation.


The same mechanism in four fields

Lakatos: the guilty lemma (1976)

In Proofs and Refutations, Lakatos describes a recurring pattern: a proof is proposed, a counterexample appears, but the counterexample does not kill the theorem. It reveals which step of the proof was wrong. Lakatos calls this the "guilty lemma." The counterexample points to the specific assumption that fails, and that assumption is the site of the next revision.

Kill conditions work the same way. A classification test fires but the result is ambiguous. The ambiguity reveals which assumption was insufficient. That assumption is the guilty lemma. The next experiment targets it. Lakatos was describing mathematics, not empirical science, but the structure is identical: a test fails in a way that names the point of failure, and the failure generates the next move.

Reiter: model-based diagnosis (1987)

Reiter formalized diagnosis as hitting sets over conflicts. A conflict is a set of components that cannot all be working given the observations. A diagnosis is a minimal hitting set: a minimal set of components that intersects every conflict, so that declaring exactly those components faulty explains the observations.

In kill-condition terms: each misfire is a conflict, a set of hypotheses that cannot all hold given the data. The next experiment resolves the conflict by determining which member is faulty. Reiter's hitting-set algorithm is a systematic way to generate the edges.
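
A brute-force sketch of that definition, not Reiter's HS-tree algorithm: enumerate candidate component sets by size and keep those that hit every conflict.

Python

# Brute-force minimal hitting sets (illustrative; not Reiter's HS-tree algorithm).
from itertools import combinations

def minimal_diagnoses(conflicts):
    """Return every minimal set of components that hits all conflicts.

    conflicts: iterable of sets; each is a set of components that cannot
    all be working given the observations.
    """
    conflicts = [set(c) for c in conflicts]
    components = sorted(set().union(*conflicts))
    diagnoses = []
    for size in range(1, len(components) + 1):
        for candidate in map(set, combinations(components, size)):
            if any(d <= candidate for d in diagnoses):
                continue  # a smaller diagnosis is already inside this candidate
            if all(candidate & conflict for conflict in conflicts):
                diagnoses.append(candidate)
    return diagnoses

# Two conflicts over three components: the minimal diagnoses are {'B'} and {'A', 'C'}.
print(minimal_diagnoses([{"A", "B"}, {"B", "C"}]))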

De Kleer & Williams: GDE (1987)

The General Diagnostic Engine (Chapter 7) computes which measurement maximally discriminates among remaining diagnoses. Kill conditions produce the edges; GDE ranks them by information gain per cost. Together: observe → classify → generate edges → rank edges (GDE) → run the best one → observe again.
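
A sketch of the ranking step alone, assuming the expected information gain of each candidate edge has already been estimated by a GDE-style computation; the experiments, bit counts, and costs below are placeholders.

Python

# Ranking sketch: pick the edge with the most expected information per unit cost.
# The expected_bits estimates are placeholders standing in for a GDE-style
# entropy computation over the remaining diagnoses.
def rank_edges(edges):
    return sorted(edges, key=lambda e: e["expected_bits"] / e["cost"], reverse=True)

candidates = [
    {"experiment": "run a longer experiment",   "expected_bits": 1.0, "cost": 4.0},
    {"experiment": "test the interface",        "expected_bits": 0.9, "cost": 1.0},
    {"experiment": "perturb a different node",  "expected_bits": 0.5, "cost": 1.5},
]
print(rank_edges(candidates)[0]["experiment"])   # "test the interface"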

Schoenfeld: the proof manual (1985)

Schoenfeld filmed undergraduates solving math problems. The stuck pattern was diagnostic. Trying induction repeatedly on the same base case means the base case is wrong. Failing on a specific step of the contrapositive means that step's assumption is the weak point. The failure mode names the escalation technique.

When induction fails because the residual loses structure, the shape of the lost structure tells you which technique to try next. The shape of failure is the edge.


Code: the kill-condition decision tree

The classifier takes a trajectory (e-values over time) and returns both a label and an edge.
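
The listing below is a minimal reconstruction of that classifier, not the original: the statistics (sign-consistency for the trend test, second-difference SNR for curvature, spectral peak sharpness), the thresholds, and the demo trajectories are stand-ins chosen to exercise the same branches. The figures quoted in the next paragraph (confidence 0.89, SNR 0.38) come from the original run.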

Python
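
# Minimal reconstruction of the classifier's shape. Statistics, thresholds,
# and the demo trajectories below are illustrative assumptions.
import numpy as np

NEXT_STEP = {  # successes generate edges too
    "convergent": "test a different node; this one is stabilized",
    "divergent": "test what depends on this node; find the blast radius",
    "oscillatory": "test the interface between the coupled subsystems",
    "aperiodic": "decompose the system differently",
}

def classify(trajectory, trend_thresh=0.8, curv_snr_thresh=0.5, peak_sharpness_thresh=10.0):
    """Run the kill-condition tree on a trajectory. Returns (label, edge):
    a misfire returns label None and an edge naming the next experiment."""
    x = np.asarray(trajectory, dtype=float)
    dx = np.diff(x)

    # 1. Monotone trend: are the increments consistently one sign?
    trend_conf = max(np.mean(dx > 0), np.mean(dx < 0))
    if trend_conf >= trend_thresh:
        # 2. Curvature: decelerating (convergent) vs constant/accelerating (divergent).
        d2x = np.diff(dx)
        snr = abs(d2x.mean()) / (d2x.std() + 1e-12)
        if snr < curv_snr_thresh:
            return None, f"curvature indeterminate (SNR={snr:.2f}): run a longer experiment"
        label = "convergent" if np.sign(d2x.mean()) != np.sign(dx.mean()) else "divergent"
        return label, NEXT_STEP[label]

    # 3. Spectral peak: a narrow peak means oscillatory; a broad one is a misfire.
    power = np.abs(np.fft.rfft(x - x.mean()))[1:] ** 2
    if power.sum() > 0:
        sharpness = power.max() / power.mean()
        if sharpness >= peak_sharpness_thresh:
            return "oscillatory", NEXT_STEP["oscillatory"]
        if sharpness >= peak_sharpness_thresh / 2:
            return None, "broad spectral peak: test at a different frequency / longer window"

    # 4. Aperiodic structure: variance well above a crude white-noise floor (stand-in test).
    if x.std() > 3 * dx.std() / np.sqrt(2):
        return "aperiodic", NEXT_STEP["aperiodic"]

    # 5. Nothing triggered.
    return None, "no detectable structure: test a different perturbation site"

# Stand-in trajectories (not the original data).
t = np.arange(200)
rng = np.random.default_rng(0)
first = 1.0 - np.exp(-t / 30)                                    # long run, clearly decelerating
second = 0.05 * np.arange(40) + 0.02 * rng.standard_normal(40)   # short, noisy drift
print(classify(first))    # fires: ('convergent', 'test a different node; ...')
print(classify(second))   # misfires: (None, 'curvature indeterminate ...: run a longer experiment')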

The second trajectory is the key case. The trend test fires (monotone, confidence 0.89) but the curvature test misfires (SNR = 0.38, below threshold). The classifier does not guess. It reports the misfire and generates the edge: "run longer." A p-value system would report "significant trend, p < 0.01" and stop. The kill condition says: significant trend, yes, but the type of trend is unresolved. More data is the edge.


Building the graph

One classification produces one or two edges. Iterate the loop (classify, follow, classify again) and it builds a graph.
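
The sketch below reuses the classify() function from the previous listing. The run_experiment stub returns stand-in trajectories keyed on the suggested experiment; a real loop would perturb the system, and the node names are illustrative.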

Python
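
# Sketch of the loop, reusing classify() from the previous listing.
# run_experiment is a stub; a real loop would perturb the system.
import numpy as np

def run_experiment(suggestion, rng):
    t = np.arange(200)
    if "longer" in suggestion:                      # follow-up to the curvature misfire
        return 1.0 - np.exp(-t / 30)                # longer window: clearly decelerating
    if "different node" in suggestion:              # redirected after the convergent label
        return np.sin(2 * np.pi * 5 * t / 200)      # the new node: two subsystems fighting
    return 0.05 * np.arange(40) + 0.02 * rng.standard_normal(40)   # initial short window

rng = np.random.default_rng(1)
frontier = ["perturb node A (short window)"]        # seed experiment
observations, hypotheses = [], []                   # graph nodes and edges

for _ in range(3):
    experiment = frontier.pop(0)                    # 1. pick a frontier node
    trajectory = run_experiment(experiment, rng)    # 2. perturb the system
    label, edge = classify(trajectory)              # 3. classify the response
    observations.append((experiment, label))
    if edge is not None:                            # 4. the result names the next experiment
        hypotheses.append((experiment, edge))
        frontier.append(edge)                       # 5. repeat from the expanded frontier
    print(f"{experiment!r} -> {label} | next: {edge}")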

Three rounds, three edges. The first classification misfired (curvature indeterminate) and generated "run longer." The second succeeded (convergent) and redirected: "test a different node." The third succeeded (oscillatory) and targeted the coupling point. Each step followed from the previous result.

The frontier has three open nodes: three experiments suggested but not yet run. Each is a specific action generated by a specific kill condition.


Convergence is not guaranteed

Each kill condition generates one edge. Three rounds produced three edges and three frontier nodes. Ten rounds produce ten edges and up to ten new frontier nodes. The graph grows. Does it close?

Convergence means every frontier edge points to a node already tested and stably classified. With finite structure and stable classifications, the graph should close. But "should" is not a proof. Feedback can create cycles: convergent at node A points to node B, which oscillates, which points back to A with new data that changes A's classification. Classifications are data-dependent.

The pieces for a convergence proof exist independently: Chernoff (1959) proved adaptive experiment selection converges; Grünwald (2024) proved e-values compose across adaptive experiments. Connecting them to the hypothesis graph is open. That is Chapter 11.
