The Parts Bin

Part of the cognition series. Builds on The Handshake.

Near-misses

The Natural Framework derives six steps from temporal flow and bounded storage. The Handshake gives each step a contract: precondition and postcondition. The CS textbook is full of operations that almost satisfy them. Almost is the diagnosis.

PageRank is the first specimen. Its postcondition says “ranked by authority” but the Attend contract requires diversity and a bound. It made a lot of money regardless. A broken step doesn’t kill instantly; it compounds.

Google bolted on re-ranking, topic diversity, and freshness signals over two decades: incremental upgrades toward a contract-preserving morphism, one patch at a time.

Quicksort is the second specimen. It satisfies order, the most visible guarantee. But the Attend contract also requires diversity (survivors are dissimilar) and boundedness (output is finite top-k, not a total order). Quicksort is the default because order is the only guarantee most systems measure.
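The diagnosis can be made concrete in a few lines. A minimal sketch with made-up scores: `sorted` delivers quicksort's guarantee, a total order; `heapq.nlargest` adds the bound; neither notices that the top two survivors are near-duplicates.

```python
import heapq

# Hypothetical scored candidates; "a" and "a2" are near-duplicates.
scores = {"a": 0.9, "a2": 0.89, "b": 0.5, "c": 0.4, "d": 0.1}

# Quicksort's guarantee: a total order. Ordered, but unbounded --
# every item survives.
total_order = sorted(scores, key=scores.get, reverse=True)
assert total_order == ["a", "a2", "b", "c", "d"]

# Adding the bound: ordered and finite, but still no diversity --
# both near-duplicates make the slate.
top3 = heapq.nlargest(3, scores, key=scores.get)
assert top3 == ["a", "a2", "b"]
```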

Most familiar algorithms are near-misses. They satisfy some guarantees but not all. The diagnostic power is in noticing which guarantee is missing.

Degenerate cases are the other edge. A chatbot has no policy store: nil Filter, nil Attend, nil Consolidate, nil Remember. Token in, token out, same rate. That is passthrough, predicted by the existence proofs when policy is zero.

Union-find has no selection: every element is kept, partitions only merge, the structure grows monotonically. No competitive core, no lossy step, no compression. Both are useful. Neither learns.

Diagnostic resolution

The monad is the container — it doesn’t break. What breaks is a morphism’s postcondition. What cascades is the composition. The derivation forces contracts of this shape, and the induction proof shows they compose forever if they compose once.

Step N+1’s precondition is step N’s postcondition. A correct algorithm with a broken precondition is a correct algorithm that produces garbage.
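A handshake check is small enough to sketch. The step names and type labels below come from the catalog's column headings; the `Step` and `compose` shapes are hypothetical illustrations, not a prescribed API:

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    precondition: str   # type of input this step accepts
    postcondition: str  # type of output it guarantees

def compose(pipeline):
    """Check the handshake: step N+1's precondition must equal
    step N's postcondition. Fail at the first broken interface."""
    for a, b in zip(pipeline, pipeline[1:]):
        if a.postcondition != b.precondition:
            raise TypeError(
                f"{a.name} -> {b.name}: handshake broken "
                f"({a.postcondition!r} != {b.precondition!r})")
    return pipeline

pipeline = compose([
    Step("Perceive",    "raw",        "encoded"),
    Step("Cache",       "encoded",    "indexed"),
    Step("Filter",      "indexed",    "selected"),
    Step("Attend",      "selected",   "ranked"),
    Step("Consolidate", "ranked",     "compressed"),
    Step("Remember",    "compressed", "persisted"),
])
```

Swapping in an operation whose postcondition does not match the next step's precondition fails at composition time, not after the garbage is produced.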

Governance operates at zero resolution. It can’t see the interface, so it replaces the whole composition: fire everyone, rewrite from scratch, new system. Expensive. The framework increases resolution: instead of “the system is broken,” the diagnosis is “Attend’s diversity guarantee is missing.” The parts bin increases it further: instead of “Attend is broken,” the prescription is “this is a top-k sort where you need MMR re-ranking.” Swap one operation, same slot, contract restored.

The most efficient fix requires the most precise name. The most feared enemy is one who cannot be named. The handshake is a naming system.

Diagnosing an LLM took a couple of hours. Three layers, six steps each, SOAP notes with six-component plans. That precision came from the framework, not from domain expertise in ML. Before germ theory: “the patient has bad air.” After: “streptococcus, here’s penicillin.” The framework is the microscope. The handshake is why the microscope works.

Agent

The framework is the diagnostic manual. The parts bin, once ordered, is the pharmacy. The handshake is why the prescriptions compose. What would it look like to use them systematically?

Describe. A product manager says: “users sign up but never come back.” An agent maps this to the six steps. Cache works: users arrive and data is stored. Filter is missing: users get everything, keep nothing. Consolidate is nil: nothing changes between sessions.

Diagnose. The agent isolates Filter. Then drills deeper: is it the precondition (wrong input from upstream), the operation (wrong mechanism), the postcondition (contract not satisfied), the fidelity (contract satisfied but too lossy), or the scale (right operation, wrong timescale)?

Prescribe. The agent queries the taxonomy. Filters by: matching precondition, matching postcondition, sufficient fidelity, compatible scale. Returns a ranked list of candidate operations from across domains. “Your Filter slot needs: output strictly smaller, criterion applied uniformly. Candidates from the parts bin, ordered by fidelity and cost.”

Validate. The agent checks that the prescribed operation’s postcondition matches the next step’s precondition. If not, it flags the interface mismatch before you build it.
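The Prescribe and Validate steps reduce to a contract-matching query. A toy sketch: the `PARTS_BIN` triples are a hypothetical slice of the catalog, and the label strings are illustrative, not canonical:

```python
# Hypothetical slice of the parts bin: (operation, precondition, postcondition).
PARTS_BIN = [
    ("Predicate selection (WHERE)", "indexed",  "selected"),
    ("Threshold filtering",         "indexed",  "selected"),
    ("Heap top-k",                  "selected", "ranked"),
    ("MMR re-ranking",              "selected", "ranked, diverse, bounded"),
]

def prescribe(slot_pre, slot_post):
    """Return every operation whose contract matches the slot exactly."""
    return [op for op, pre, post in PARTS_BIN
            if pre == slot_pre and post == slot_post]

def validate(candidate_post, next_pre):
    """Flag an interface mismatch before anything is built."""
    return candidate_post == next_pre
```

A real taxonomy query would also rank by fidelity and cost; the exact-match core is the part the handshake makes mechanical.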

The doctor doesn’t need to understand category theory. They need to read the contracts: precondition, postcondition, fidelity. The handshake is the pharmacology. The taxonomy is the PDR. The agent is the resident who can look things up fast.

Catalog

The full catalog follows. Each entry is an operation: input in, output out. If the precondition and postcondition match the contract, the operation fits the slot. Filter decides per-item admissibility: does this item pass the criterion? Attend decides how admitted items relate to each other: order, diversity, and bound are slate-level properties.

Perceive (raw → encoded) — the column every system gets right, because nothing else works until it does.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Lexical analysis | Raw byte stream | Token sequence, parseable |
| Parsing (LL/LR) | Token stream, conforms to grammar | AST with explicit structure, traversable |
| JSON parsing | Raw string, well-formed | Structured object, addressable by key |
| A/D conversion | Continuous analog signal | Discrete samples, quantized |
| SIFT descriptor extraction | Raw pixel grid | Keypoint descriptors, matchable |

Cache (encoded → indexed) — the most studied column. Idreos (2018) built a periodic table from five design primitives.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Hash indexing | Records with stable keys | Keyed index, exact retrieval by key |
| B-tree index construction | Records with ordered keys | Balanced index, retrieval + range queries |
| Trie insertion | String keys over finite alphabet | Prefix-indexed, retrieval by string or prefix |
| Inverted index construction | Tokenized corpus with document IDs | Posting lists, retrieval by term |
| LSM-tree flush | Sorted runs in memory | Persistent key-value index, retrievable after compaction |
| Skip-list indexing | Ordered entries | Probabilistic index, O(log n) retrieval |

Filter (indexed → selected, strictly smaller) — gates the data store, where most systems use exact predicates. The derivation proves a gate must exist whenever outputs are a proper subset of inputs.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Predicate selection (WHERE) | Indexed relation + boolean predicate | Subset matching predicate, strictly smaller |
| Range query | Ordered index + interval bounds | Subset within interval, strictly smaller |
| Threshold filtering | Scored items + threshold t | Subset meeting threshold, strictly smaller |
| Regex extraction | String corpus + pattern | Matching spans retained, non-matches discarded |
| k-NN radius pruning | Metric index + query + radius r | Subset within radius, strictly smaller |
| Pareto filtering | Candidates with objective vectors | Non-dominated subset, strictly smaller |
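One entry from the table, as runnable code. A minimal sketch of Pareto filtering over 2-D objective vectors (the O(n²) scan is for clarity, not speed):

```python
def pareto_filter(candidates):
    """Filter contract: return the non-dominated subset, strictly
    smaller whenever any candidate is dominated."""
    def dominates(a, b):
        # a dominates b: at least as good everywhere, better somewhere.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

points = [(1, 5), (3, 4), (2, 2), (4, 1), (0, 0)]
assert pareto_filter(points) == [(1, 5), (3, 4), (4, 1)]
```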

Attend (selected → ranked, diverse, bounded) — reads the policy store: given the survivors, which are worth pursuing? Control separates from data (derived). Most ranking algorithms satisfy order but miss diversity and bound.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| MMR re-ranking | Candidates + relevance scores + similarity measure | Top-k ordered, diversity penalized, bounded |
| DPP top-k selection | Candidates + relevance weights + similarity kernel | Top-k ranked, mutually dissimilar, bounded |
| xQuAD re-ranking | Candidates + relevance + subtopic coverage | Top-k ordered, aspect coverage explicit, bounded |
| Submodular maximization | Candidates + submodular utility (relevance + coverage) | Top-k greedy-ranked, diminishing-return diversity, bounded |
| Diversified beam search | Stepwise expansions + diversity penalty | Top-b retained, non-redundant alternatives, bounded |
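MMR itself is short enough to sketch. A greedy version with a made-up binary topic-similarity function; `lam` trades relevance against redundancy, and every name here is illustrative:

```python
def mmr(candidates, relevance, sim, k, lam=0.7):
    """Greedy MMR: rank up to k items, trading relevance against
    similarity to what is already selected. Postcondition: ordered,
    diversity penalized, bounded."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        best = max(pool, key=lambda c: lam * relevance[c]
                   - (1 - lam) * max((sim(c, s) for s in selected),
                                     default=0.0))
        selected.append(best)
        pool.remove(best)
    return selected

# Hypothetical data: "a" and "a2" are same-topic near-duplicates.
rel = {"a": 0.9, "a2": 0.89, "b": 0.5, "c": 0.4}
topic = {"a": "x", "a2": "x", "b": "y", "c": "z"}
sim = lambda u, v: 1.0 if topic[u] == topic[v] else 0.0

# Plain top-3 would return ["a", "a2", "b"]; MMR demotes the duplicate.
assert mmr(["a", "a2", "b", "c"], rel, sim, k=3) == ["a", "b", "a2"]
```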

Near-misses (diagnostic counterexamples): PageRank satisfies order but not diversity or bound; quicksort produces a total order, unbounded and blind to redundancy.

Consolidate (ranked → compressed, changes future processing) — the write interface to the policy store. Compaction reorganizes a cache; consolidation changes how the system processes the next cycle.

I-Con (2025) built a periodic table for this column. A blank cell predicted a new algorithm that beat the state of the art.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| Gradient descent update | Loss contributions + current weights | Weights updated, future predictions altered |
| Bayesian posterior update | Prior parameters + weighted observations | Posterior compressed, future inference altered |
| K-means update | Weighted points + codebook size k | k prototypes replacing many points, lossy |
| Incremental PCA | Observations in high dimension | Low-rank basis, future projection altered |
| Decision tree induction | Ranked labeled examples | Compact rule set, future classification altered |
| Prototype condensation | Ranked candidates + compression budget | Small exemplar set, lossy approximation for future matching |
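The K-means row, reduced to one consolidation step. An unweighted 1-D sketch (the table's weighted form generalizes straightforwardly): many points in, k prototypes out, lossy by construction:

```python
def kmeans_update(points, prototypes):
    """One consolidation step: assign each point to its nearest
    prototype, then move each prototype to the mean of its cluster.
    Many points in, len(prototypes) values out -- lossy by design."""
    clusters = [[] for _ in prototypes]
    for p in points:
        nearest = min(range(len(prototypes)),
                      key=lambda j: (p - prototypes[j]) ** 2)
        clusters[nearest].append(p)
    return [sum(c) / len(c) if c else prototypes[i]
            for i, c in enumerate(clusters)]

# Six points consolidate into two prototypes.
assert kmeans_update([0, 1, 2, 10, 11, 12], [0, 10]) == [1.0, 11.0]
```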

Remember (compressed → persisted) — the strongest column. Lossless relative to its input: the contract is “no additional loss at this step.” A database row is Remember for the database pipe but Cache for the CRM pipe. A log entry is Remember for the logger but Cache for the monitoring pipe.

If the thing being persisted is a representation rather than the final entity, it’s Cache at this level, not Remember. The discipline: list write operations only.

| Operation | Precondition | Postcondition |
| --- | --- | --- |
| WAL append + fsync | Serialized state record | Durable on crash, recoverable next cycle |
| Transaction commit | Validated write set | Persisted, visible for future reads |
| Git object write + commit | Content-addressed objects + manifest | Durable commit graph, retrievable by hash |
| Checkpoint serialization | In-memory model/state | Persisted checkpoint, loadable on next run |
| Copy-on-write snapshot commit | Consistent compressed state image | Persistent snapshot, addressable by version |
| SSTable flush | Immutable key-value run in memory | Durable on-disk run, retrievable by key |
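The first row of the table in miniature. A sketch of append-plus-fsync durability, assuming newline-free records; real WALs add checksums, framing, and rotation:

```python
import os

def wal_append(path, record):
    """Remember contract: append a serialized record and fsync, so the
    record is durable on crash -- no additional loss at this step.
    Assumes `record` is bytes containing no newline."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()
        os.fsync(f.fileno())

def wal_replay(path):
    """Recover every persisted record on the next cycle."""
    with open(path, "rb") as f:
        return [line.rstrip(b"\n") for line in f]
```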

Grid

The catalog is a list. A list lets you browse. Browsing doesn’t scale. You need an index. The index needs axes.

Take Filter. Two axes, selection semantics vs. error guarantee:

|  | Exact | Bounded approximation | Probabilistic |
| --- | --- | --- | --- |
| Predicate | WHERE, range query | Threshold filtering (soft margin) | ?? |
| Similarity | Exact NN pruning | k-NN radius pruning | LSH filtering |
| Dominance | Pareto filtering | ?? | ?? |

The empty cells are predictions. Probabilistic predicate filtering: a randomized classifier used as a gate, with a known false-positive rate. Bounded dominance: approximate Pareto filtering that trades exactness for speed in high dimensions. Each empty cell is a typed interface with known neighbors.
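One well-known operation already sits near the probabilistic-predicate cell: a Bloom filter is a membership gate with a tunable false-positive rate and no false negatives. A minimal sketch; the sizes and the `BloomGate` name are arbitrary:

```python
import hashlib

class BloomGate:
    """Probabilistic predicate filter: a membership gate with a tunable
    false-positive rate and no false negatives."""
    def __init__(self, num_bits=1024, num_hashes=3):
        self.bits = bytearray(num_bits)
        self.num_hashes = num_hashes

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.bits)

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))

gate = BloomGate()
gate.add("alice@example.com")
assert gate.might_contain("alice@example.com")
```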

Take Attend. Two axes, output form vs. redundancy control:

|  | None | Implicit | Explicit |
| --- | --- | --- | --- |
| Top-k slate | Heap top-k | Beam search | MMR, DPP top-k, xQuAD |
| Single best | argmax | Tournament selection | ?? |
| Path/tree | Dijkstra, A* | MCTS | ?? |

The right column is sparse. CS built ranking algorithms for decades and almost never baked redundancy control into the postcondition. It was bolted on after.

The gap predicts: concurrent stochastic tree search. Spawn threads with different random seeds; stochasticity encourages divergence. A budget kills them at the deadline, and the final selection picks the best from a diverse pool. Portfolio SAT solvers do this. Biological evolution does this with mutation rate as the stochastic dial.
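A toy version of that prediction, run sequentially as a stand-in for threads. The 1-D objective, its peak at 3.7, and every parameter below are made up for illustration:

```python
import random

def stochastic_search(seed, steps=200):
    """One portfolio member: a seeded stochastic hill-climb. The step
    budget is the deadline; different seeds encourage divergence."""
    rng = random.Random(seed)
    objective = lambda x: -(x - 3.7) ** 2  # made-up 1-D objective
    best = rng.uniform(-10, 10)
    for _ in range(steps):
        candidate = best + rng.gauss(0, 0.5)
        if objective(candidate) > objective(best):
            best = candidate
    return best, objective(best)

# Sequential stand-in for concurrent threads: run divergent searches,
# then the final selection keeps the best of the diverse pool.
pool = [stochastic_search(seed) for seed in range(8)]
best_x, best_score = max(pool, key=lambda r: r[1])
assert abs(best_x - 3.7) < 1.0
```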

The grid narrows the search space enough that a dart throw produces a plausible candidate. Mendeleev didn’t synthesize germanium. He drew the grid, pointed at the gap, and said “something goes here with these properties.”

Future work

The parts bin has order we haven’t discovered yet. Within each column, operations form a spectrum ordered by guarantee strength, cost, determinism, scale. Like genes classified by observable function rather than nucleotide sequence, morphisms are classified by their contracts rather than their implementation. These gaps will predict operations that should exist but haven’t been built yet. The periodic table didn’t just organize chemistry. It created it.

The derivation establishes contracts. Two things are ready now:

Three need formalization work. The existence proofs and types are defined; the composition proofs are sketched:

One needs new theory. The derivation is for linear pipelines only:

Every gap in the bin is an almost that hasn’t been named yet.


Written via the double loop.