Caches All the Way Down
Part of the cognition series.
In software, we say “everything’s a wrapper.” An ORM wraps SQL, which wraps disk I/O, which wraps silicon. Each layer exposes the same four verbs (create, read, update, delete) and delegates to the layer below. Wrappers all the way down.
But CRUD is only the Remember interface. Store, retrieve, update, delete: that’s the API to the persistent store. When we say “wrapper” we’re seeing one role out of six and calling it the whole thing.
The rest of the pipeline
SQL has WHERE (Filter) and ORDER BY (Attend). The ORM above it has scopes (Filter) and eager loading (Attend). The API above that has authorization (Filter) and pagination (Attend). The frontend above that has conditional rendering (Filter) and sort/highlight (Attend).
Every layer re-implements the full pipeline over the data from the layer below. We say “just a wrapper” because Remember looks identical at every level. The other roles look different, so we don’t notice the repetition.
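The repetition is easy to see in code. A minimal sketch, with made-up row data: the database expresses Filter and Attend as WHERE and ORDER BY, and the application layer re-implements the same two roles over whatever rows came back.

```python
# Hypothetical rows; the point is the two layers, not the data.
rows = [
    {"id": 1, "status": "active", "score": 7},
    {"id": 2, "status": "archived", "score": 9},
    {"id": 3, "status": "active", "score": 3},
]

# Database layer: Filter is WHERE, Attend is ORDER BY / LIMIT.
sql = "SELECT * FROM items WHERE status = 'active' ORDER BY score DESC LIMIT 10"

# Application layer: the same roles, re-implemented over the result set.
visible = [r for r in rows if r["status"] == "active"]  # Filter
visible.sort(key=lambda r: r["score"], reverse=True)    # Attend
page = visible[:10]                                     # Attend (pagination)
```

Only the Remember calls look identical across the two layers; Filter and Attend change syntax at each level, which is why the repetition hides.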
The computing stack
Digital computing is the clearest case because we built it from the floor up. The entire hardware-software stack is the Cache tower made visible, each level adding capacity until the full pipeline emerges.
| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Transistor | 1 bit | Voltage | — | Threshold gate | — | — | — |
| Logic gate | few bits | Input lines | Transistors | Boolean function | — | — | Output line |
| ALU | word | Operands, opcode | Logic gates, registers | Overflow, flags | Opcode selects operation | — | Result register |
| CPU | KB (L1) | Fetch instruction | ALUs, pipeline stages | Branch prediction | Scheduling, out-of-order | Branch predictor learns | Register file, L1 cache |
| OS | GB (RAM) | Interrupts, I/O | CPUs, memory hierarchy | Cache eviction | Scheduler dispatch | Defrag, compaction | Filesystem, swap |
| Database | TB (disk) | Query arrives | OS filesystem, B-trees | WHERE clause | ORDER BY, LIMIT | VACUUM, reindex | The table on disk |
| Backend | app memory | Request arrives | Database, ORM | Auth, validations | Pagination, sorting | Schema migrations | Database write |
| Frontend | viewport | User event | Backend responses, DOM | Conditional rendering | Sort, highlight, focus | User preferences | localStorage, DOM state |
| Container | image layers | Build context | Applications, runtime | .dockerignore, multi-stage discard | Layer ordering for cache hits | Image optimization | Image in registry |
| Kubernetes | cluster | Desired state, metrics | Containers, etcd | Admission controllers, resource quotas | Scheduler: affinity, constraints | Operator reconciliation loops | Cluster state |
| Autoscaler | fleet | CPU, memory, request rate | Kubernetes clusters | Cooldown periods, min/max bounds | Scaling policy: which pool, how much | Policy tuning from history | The running fleet |
The transistor row: one bit, pure threshold gating, no Attend, no Consolidate. The bool store. The autoscaler row: full pipeline across a fleet. Each row between them added capacity, and each time it crossed a threshold, another role filled in.
Nobody designed it this way. Engineers at each level solved their local problem (“hold more items, select among them, rank the survivors”) and arrived at the same pipeline independently. Add storage, and Filter and Attend follow.
The biological stack
The same tower in a person’s energy storage. Each level caches the level below.
| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| ATP | 1 bond | Substrate arrives | — | Enzyme lock-and-key | — | — | — |
| Mitochondrion | many ATP molecules | Pyruvate, O₂ | ATP molecules | Membrane potential threshold | Uncoupling proteins | — | ATP output rate |
| Cell | glycogen granules | Glucose, insulin signal | Mitochondria | Metabolic gating (hexokinase) | Energy allocation across processes | Gene expression | Glycogen, protein |
| Liver | ~100g glycogen | Blood glucose, hormones | Cells, hepatocytes | Glucokinase threshold | Glycogenesis vs gluconeogenesis | Metabolic adaptation | Blood glucose level |
| Adipose / muscle | kg of fat, kg of protein | Insulin, excess energy | Liver, circulating glucose | Lipogenesis threshold | Which depots to mobilize | Set point adjustment | Fat mass |
| Mammal | total reserves | Hunger, satiety signals | Adipose, muscle | Ghrelin, leptin, appetite regulation | Meal choice, macronutrient balance | Metabolic adaptation, microbiome | Body composition |
ATP: one phosphate bond, pure enzyme gating. The bool store. The mammal row: full pipeline with hunger, choice, and metabolic adaptation. Same tower. Capacity grows, roles fill in. Evolution built each level because the one below couldn’t manage energy at the scale above.
Two substrates. Same staircase. Each level’s Cache is the level below, and each added enough capacity for another role to fill in. The shape repeats because the constraint forces it. The constraint also forces it to stop.
The tower has a floor
By induction on storage capacity.
Base case. A Cache with one bit of storage is a boolean. Pass or reject. Selection requires at least two items; one slot has nothing to compare. The only operation is threshold gating: a single if. No Attend (nothing to rank), no Consolidate (nothing to learn). The pipeline collapses to Filter alone.
Inductive step. A Cache at depth d with capacity S may contain a sub-pipeline whose sub-Cache at depth d+1 has capacity S’. Boundary 1 applies: the sub-Cache must fit inside the parent alongside the parent’s own pipeline machinery, so S’ < S. Strictly decreasing.
Termination. Capacity is a natural number. A strictly decreasing sequence of naturals terminates. It reaches 1 bit. The tower has finite depth.
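The termination argument fits in a few lines. A sketch, not a proof: the halving step is one illustrative way for a sub-Cache to be strictly smaller, and any strictly decreasing rule would bound the depth the same way.

```python
def tower_depth(capacity_bits: int) -> int:
    """Count Cache levels until capacity collapses to the 1-bit bool store."""
    assert capacity_bits >= 1
    depth = 1
    while capacity_bits > 1:
        # Each sub-Cache must fit strictly inside its parent;
        # halving stands in for any strictly decreasing rule.
        capacity_bits //= 2
        depth += 1
    return depth

tower_depth(1)  # a bare bool store: depth 1
tower_depth(8)  # 8 -> 4 -> 2 -> 1: depth 4
```

Because the sequence of capacities is a strictly decreasing sequence of naturals, the loop always terminates: the tower has finite depth.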
How deep is the universe’s cache? As deep as physics allows — down to whatever distinction is smallest. A qubit. A Planck bit. The bool store at the bottom of everything.
The Handshake proves the analogous result for Consolidate: induction on bit budget, with the data processing inequality as the decreasing measure, terminating at passthrough. Cache’s tower uses storage capacity instead, terminating at the bool store. Same structure, different measures. Consolidate is about compression. Cache is about capacity.
Bool stores in the wild
At the floor of every Cache tower, you should find a bool store doing threshold gating. And you do.
Ion channels: open or closed. One bit. Voltage threshold gates molecules through. No ranking, no learning. Pure Filter.
Transistors: on or off. Voltage above threshold passes the signal. Below, it blocks.
MHC binding: fits or doesn’t. Antigen presentation at the molecular level is a shape match — binding affinity is graded, but the groove either holds the peptide or releases it. Ranking among candidates happens one level up, where limited surface slots force selection among the fragments that passed.
Each is a Cache collapsed to a boolean. The prediction: below a bool store, no further self-similarity. You can’t have a sub-pipeline inside an if statement. If you found something smaller than a bool still doing selection, the argument would be falsified. But a bool is the minimum unit of distinction.
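What a bool store looks like as code is the whole pipeline collapsed to a single if. A sketch with an illustrative threshold value, not any real device's:

```python
THRESHOLD_V = 0.7  # illustrative gate voltage, not a real transistor's

def gate(voltage: float) -> bool:
    """Pure Filter: one threshold, one bit out. No ranking, no learning."""
    return voltage >= THRESHOLD_V

gate(1.0)  # True: signal passes
gate(0.2)  # False: signal blocked
```

There is nowhere to put a sub-pipeline: no collection to rank (no Attend), no state that outlives the call (no Consolidate). That is the floor.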
The AI stack
The same tower for AI. Read the dim cells: the roles that are sealed, minimal, or missing outright.
| Level | Cache capacity | Perceive | Cache | Filter | Attend | Consolidate | Remember |
|---|---|---|---|---|---|---|---|
| Weight | 1 float | Gradient | — | Learning rate threshold | — | — | — |
| Neuron | ~hundreds of weights | Input vector | Weights | ReLU, activation | — | Backprop (offline, sealed) | Activation output |
| Attention head | ~millions params | Query, key, value | Neurons | Softmax masking | Attention scores | Training (sealed) | Weighted value |
| Block | attention heads | Residual input | Attention heads | Layer norm | Multi-head selection | Training (sealed) | Block output |
| Model | billions of params | Token sequence | Blocks, KV cache | No input gating | No diversity enforcement | Training (sealed) | Next token |
| Context window | ~128K tokens | User prompt, tool results | Model | Minimal redundancy inhibition | Recency bias, no DPP | Ephemeral — dies with the session | Response |
| Agent | context + tools | Task, codebase | Context windows | File selection heuristics | Context window selection | Skill creation, memory files | Completed task |
| Swarm | fleet of agents | Workload | Agents | Task routing | Load balancing | No shared learning | No collective memory |
The forward pass is well-optimized at the bottom: softmax is genuine Filter, attention scores genuine Attend. But Consolidate is dim at every level. Training is sealed, so the model learns nothing from its conversation and the context window dies with the session. The agent’s memory files are a bandage, not a schema.
Above the model level, almost everything is dim. The context window has minimal redundancy inhibition; every token gets in until the window is full. The agent selects files by heuristic, not competition. The swarm has no shared memory, no collective consolidation.
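One dim cell made bright, as a sketch: a redundancy filter for a context window, so that near-duplicate chunks stop getting in. The function name, the word-overlap measure, and the 0.9 threshold are all illustrative assumptions, not how any real system works.

```python
def admit(words_seen: set[str], chunk: str) -> bool:
    """Filter for a context window: reject chunks that are mostly repetition."""
    words = set(chunk.split())
    if not words:
        return False
    overlap = len(words & words_seen) / len(words)
    if overlap > 0.9:  # illustrative redundancy threshold
        return False
    words_seen |= words  # remember what got in, so later chunks compete
    return True
```

Even this crude gate is more Filter than most context windows have today; the role exists in the table, it just hasn't been filled in.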
The computing stack filled in its dim cells over sixty years. The biology stack filled them in over four billion. The AI stack is a few years old and it shows. The dim cells are the roadmap.
The diagnostic
Active Consolidate within a Cache means there’s at least one more level below. Passthrough means you’ve hit the floor. The query optimizer learns from execution statistics, so it contains another pipeline. Ion channels don’t learn. The gate is the gate.
Every thin wrapper that’s genuinely CRUD passthrough either stays thin (it was at the floor, with nothing to learn) or grows filter and attend logic (it was above the floor, and usage pressure forced the missing roles in). Every ORM starts thin. The ones above the floor never stay that way.
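The two outcomes can be sketched side by side. Hypothetical store interface, assumed names: a passthrough wrapper at the floor, and the same wrapper after usage pressure has forced Filter and Attend in.

```python
class PassthroughStore:
    """At the floor: pure Remember delegation, nothing to learn."""
    def __init__(self, backend: dict):
        self.backend = backend

    def read(self, key):
        return self.backend.get(key)

class GrownStore(PassthroughStore):
    """Above the floor: usage pressure has added Filter and Attend."""
    def query(self, predicate, rank_key, limit):
        hits = [v for v in self.backend.values() if predicate(v)]  # Filter
        hits.sort(key=rank_key, reverse=True)                      # Attend
        return hits[:limit]                                        # Attend
```

The diagnostic in miniature: if a wrapper's interface ever sprouts a `query`, it was never really at the floor.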
It’s not wrappers all the way down. It’s pipelines — until you hit the bool.
Written via the double loop. More at pageleft.cc.