Embedding Pipe

Part of the cognition series. Builds on The Parts Bin.

The pipe

An embedding pipeline processes items through six stages. Each stage has a contract. The machine-readable catalog is in _data/parts-bin.yml under data_structure: embedding_space.

| Stage | What it does | Common implementations |
| --- | --- | --- |
| Perceive | Produce vectors from raw input | CLIP, sentence transformers, contrastive learning |
| Cache | Maintain a searchable index over vectors | HNSW, IVF-PQ, ball tree |
| Filter | Retrieve a candidate set, strictly smaller than the index | c-ANN search, ε-approximate range search |
| Attend | Rerank and diversify under a budget | MMR, k-center / farthest-first traversal |
| Consolidate | Update the embedding model or retrieval policy from outcomes | Triplet-loss fine-tuning, online k-means, Growing Neural Gas |
| Remember | Persist artifacts across runs (index, model, metadata) | FAISS index serialization, product quantization, checkpoint save |

Cache builds and searches the live index. Remember persists it to disk so the next run doesn’t start from scratch. Cache is the in-memory structure. Remember is the durable snapshot.
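The six stages can be sketched as one minimal in-memory object. This is an illustrative sketch, not the catalog's implementation: the brute-force numpy index stands in for HNSW, and the hash-seeded `embed` stub stands in for a real encoder; all names here are hypothetical.

```python
import json
import numpy as np

class EmbeddingPipe:
    """Minimal sketch of the six-stage pipe. Brute force throughout;
    a real pipe would use HNSW for Cache and a learned encoder for Perceive."""

    def __init__(self, dim=8):
        self.dim = dim
        self.vectors = np.empty((0, dim))  # Cache: the live in-memory index
        self.items = []

    def perceive(self, item):
        # Stub encoder: deterministic-within-a-run random unit vector.
        # Replace with CLIP / a sentence transformer in practice.
        rng = np.random.default_rng(abs(hash(item)) % (2**32))
        v = rng.standard_normal(self.dim)
        return v / np.linalg.norm(v)

    def cache(self, item, vec):
        # Add to the searchable index.
        self.vectors = np.vstack([self.vectors, vec])
        self.items.append(item)

    def filter(self, query_vec, radius):
        # Metric range search: candidates within `radius` of the query,
        # typically a strict subset of the index.
        d = np.linalg.norm(self.vectors - query_vec, axis=1)
        return [i for i in np.argsort(d) if d[i] <= radius]

    def attend(self, candidates, query_vec, k):
        # Budgeted selection: plain top-k by distance (no diversity term).
        d = np.linalg.norm(self.vectors[candidates] - query_vec, axis=1)
        return [candidates[i] for i in np.argsort(d)[:k]]

    def consolidate(self, outcome):
        # Outcome feedback stub: real versions fine-tune the encoder
        # or adjust the retrieval policy.
        pass

    def remember(self, path):
        # Durable snapshot of the live index and its metadata.
        np.save(path + ".npy", self.vectors)
        with open(path + ".json", "w") as f:
            json.dump(self.items, f)
```

Note the Cache/Remember split in code form: `cache` mutates the in-memory structure, `remember` writes the snapshot the next run loads.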

Filter grid

The Filter grid for embedding space: selection semantics × error guarantee. The similarity row is the strongest. Predicate and dominance are secondary. Causal is an open research direction.

| | Exact | Bounded | Probabilistic |
| --- | --- | --- | --- |
| Similarity | Exact k-NN | c-ANN (cover tree, HNSW) | LSH ANN |
| Predicate | Metric range search | ε-approximate range search | LSH range query |
| Causal | Open | Open | Open |

The causal row is open across all three columns: geometry-aware interference estimation (Leung 2022) and FDR-controlled causal selection (Duan et al. 2024) exist separately, but there is no known composition for embedding-distance-defined interference with bounded FDR.
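Two cells of the grid fit in a few lines of numpy: exact k-NN (similarity × exact) and metric range search (predicate × exact). Brute force makes the guarantee trivially exact; the bounded and probabilistic columns trade that exactness for sublinear search. A sketch, with hypothetical function names:

```python
import numpy as np

def exact_knn(index, query, k):
    """Similarity x Exact: the true k nearest neighbors, by full scan."""
    d = np.linalg.norm(index - query, axis=1)
    return np.argsort(d)[:k]

def range_search(index, query, r):
    """Predicate x Exact: every point within radius r of the query."""
    d = np.linalg.norm(index - query, axis=1)
    return np.flatnonzero(d <= r)
```

A c-ANN structure like HNSW answers the same similarity query but only promises neighbors within a factor c of optimal; LSH weakens that further to a probabilistic guarantee.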

Attend grid

The Attend grid for embedding space: selection scope × diversity mechanism. The top-k slate row is where agents spend most of their time.

| | No diversity | Implicit | Explicit |
| --- | --- | --- | --- |
| Top-k slate | k-NN retrieval | MMR | k-center / farthest-first |
| Single best | 1-NN | Medoid | Farthest-point sampling |
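The implicit-diversity cell of the top-k slate row, MMR, is short enough to write out. A sketch with cosine similarity; `lambda_` trades relevance against redundancy, and the greedy loop is the standard formulation:

```python
import numpy as np

def mmr(query, vectors, k, lambda_=0.7):
    """Maximal Marginal Relevance: greedily pick the item most relevant
    to the query, penalized by similarity to items already selected."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    rel = np.array([cos(query, v) for v in vectors])
    selected, remaining = [], list(range(len(vectors)))
    while remaining and len(selected) < k:
        def score(i):
            redundancy = max((cos(vectors[i], vectors[j]) for j in selected),
                             default=0.0)
            return lambda_ * rel[i] - (1 - lambda_) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

At `lambda_ = 1` this degenerates to the no-diversity cell (plain top-k by relevance); lowering it pushes the slate toward the explicit-diversity cell's behavior.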

Example: article feed

A concrete embedding pipe for surfacing fresh articles from an RSS-like feed:

  1. Perceive: embed each new article with a sentence transformer.
  2. Cache: add to an HNSW index.
  3. Filter: for each candidate, compute distance to nearest existing article in the corpus. Reject if below a novelty threshold (density-based filtering).
  4. Attend: from survivors, pick top-k by relevance × diversity using MMR. The similarity penalty is cosine distance in the embedding space.
  5. Consolidate: track which articles the user reads. Fine-tune the embedding or adjust the novelty threshold.
  6. Remember: serialize the HNSW index and read-history to disk.

The Filter step inverts the usual ANN query: instead of finding items close to a query, it rejects items close to existing coverage. The Attend step is standard MMR. The Consolidate step closes the loop.
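The inverted query can be sketched directly: instead of keeping the neighbors of a query, reject any candidate whose nearest corpus neighbor is too close. Brute force again for illustration; a real pipe would ask the HNSW index for the nearest neighbor. The function name and threshold are hypothetical:

```python
import numpy as np

def novelty_filter(corpus, candidates, threshold):
    """Keep candidates whose nearest corpus neighbor is at least
    `threshold` away -- density-based rejection of near-duplicates."""
    keep = []
    for i, c in enumerate(candidates):
        nearest = np.min(np.linalg.norm(corpus - c, axis=1))
        if nearest >= threshold:
            keep.append(i)
    return keep
```

The threshold is the Consolidate knob from step 5: if the user keeps reading articles that barely clear it, lower it; if the feed feels repetitive, raise it.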

How embedding space differs from flat

Flat pipelines process records by attribute. Embedding pipelines process records by position in a learned space. The dominant primitives shift from attribute predicates to geometric operations: nearest-neighbor search, diversity reranking, density-based filtering.

Embedding pipelines still use predicate filtering (metadata filters alongside ANN), but the geometric operations are what distinguish the pipe.
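That combination can be sketched in one function: a metadata predicate narrows the candidate set, then a geometric operation (exact k-NN here, standing in for ANN) ranks within the survivors. Names are hypothetical:

```python
import numpy as np

def filtered_knn(vectors, metadata, query, predicate, k):
    """Predicate filter first, then k-NN among the survivors."""
    idx = [i for i, m in enumerate(metadata) if predicate(m)]
    d = np.linalg.norm(vectors[idx] - query, axis=1)
    return [idx[j] for j in np.argsort(d)[:k]]
```

Pre-filtering like this is exact but can starve the geometric step when the predicate is very selective; production ANN libraries often filter during traversal instead, a design choice this sketch deliberately ignores.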


Written via the double loop.