Introduction to the Soar Cognitive Architecture
John E. Laird · 2022 · arXiv:2205.03854
The most complete symbolic cognitive architecture. Four memories, four learning modules, one decision cycle. Forty years of building the whole mind, one wall at a time.
What Soar is
Soar is not a model. It is a cognitive architecture, a fixed set of mechanisms that an agent uses to perceive, decide, act, and learn across any domain. The architecture does not change per task. The knowledge does.
Allen Newell proposed in 1990 that cognitive science needed unified theories, not isolated models. Soar is the most sustained attempt to build one. It started in 1983 and has been continuously extended for four decades.
The decision cycle
Every Soar agent runs the same top-level loop:
- Input: perception writes to working memory
- Elaboration: production rules fire in parallel waves. First situation elaboration, then operator proposals, then operator evaluations
- Decision: the decision procedure processes preferences. Rejections first, ranking only if needed
- Application: the selected operator fires its actions
- Output: motor commands and memory store commands
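The loop above can be sketched in a few lines of Python. This is a toy illustration, not Soar's API: the function and rule names (`run_cycle`, `propose_move`, `apply_move`) are invented, proposals are reduced to (operator, score) pairs, and the parallel elaboration waves are collapsed into one sequential pass.

```python
# Toy sketch of one Soar decision cycle over a dict-based working memory.
# All names are illustrative; this is not the Soar API.

def run_cycle(wm, propose_rules, apply_rules, percept):
    wm = {**wm, **percept}                       # Input: perception writes to WM
    proposals = []
    for rule in propose_rules:                   # Elaboration (sequentialized here)
        proposals += rule(wm)
    if not proposals:
        return wm, None                          # real Soar would raise an impasse
    op = max(proposals, key=lambda p: p[1])[0]   # Decision: pick the best-ranked
    for rule in apply_rules:                     # Application: operator fires
        wm = rule(wm, op)
    return wm, op                                # Output: updated WM, chosen op

# Toy domain: move toward a goal on a number line.
def propose_move(wm):
    return [("right", 1.0)] if wm["x"] < wm["goal"] else []

def apply_move(wm, op):
    return {**wm, "x": wm["x"] + 1} if op == "right" else wm

wm, op = run_cycle({"x": 0, "goal": 2}, [propose_move], [apply_move], {})
```

After one cycle the agent has selected `right` and moved one step toward the goal; repeating the call walks it to `x == 2`.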
The elaboration phase is the key insight. Rules fire in causally dependent waves: evaluation cannot fire until proposals exist. Rejection before ranking, implemented through data dependencies rather than explicit phases. The result is the same as if Filter and Attend ran sequentially, but it emerges from parallel rule firing.
Four memories
| Memory | Contents | Learning |
|---|---|---|
| Working Memory | Current situation: perceptual input, goals, intermediate results. Graph of attribute-value elements. Truth maintenance auto-retracts unsupported elements. | None (by design: it is the scratchpad) |
| Procedural Memory | Production rules (if-then). The RETE network matches incrementally: cost proportional to change, not to total rules. All matched rules fire in parallel. | Chunking + RL (the only store with automatic learning) |
| Semantic Memory | Long-term facts: graph structures in SQLite. Retrieval uses activation (recency + frequency + spreading) borrowed from ACT-R. | Manual only. No automatic learning mechanism. |
| Episodic Memory | Snapshots of working memory, stored as change-deltas. One episode per decision cycle. Retrieval by partial cue matching. | Automatic but undiscriminating. Stores everything. |
Plus the Spatial-Visual System (SVS): a 3D scene graph for spatial reasoning, with top-down filters that control what gets symbolized.
Four learning modules
| Module | What it does | Limitation |
|---|---|---|
| Chunking | When a substate resolves an impasse, backtrace the dependency chain and compile the result into a production rule. Deliberation becomes reaction. | Requires deterministic substates. Cannot chunk over RL decisions. |
| Reinforcement Learning | Adjusts numeric preferences on operators to maximize future reward. Applies in every active substate, giving natural hierarchical RL. | Global parameters are fixed at initialization; delta-bar-delta can adapt only per-rule learning rates. |
| Semantic Learning | The agent can store facts to semantic memory at any time. | No automatic mechanism. Raw store() call. Laird rates it "missing." |
| Episodic Learning | Automatically stores a snapshot at the end of every decision cycle. | No discrimination. Every cycle is recorded. No generalization. |
The impasse mechanism
When the decision procedure lacks sufficient knowledge to select an operator, Soar creates a substate, a new context where the same decision cycle runs recursively. The substate has full access to all reasoning capabilities. This single mechanism unifies:
- Planning (operator no-change impasse → look-ahead search)
- Hierarchical decomposition (operator subgoaling)
- Metacognition (reasoning about one's own reasoning)
- Deliberate evaluation (comparing operators by simulation)
Most architectures bolt these on as separate modules. Soar derives them from one mechanism.
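The tie-impasse case can be sketched as a recursive call: when preferences cannot discriminate, the same decision procedure reruns in a substate with a more expensive evaluation, here a one-step look-ahead. Everything below is illustrative (the `decide`, `apply_op` names and the integer "operators" are invented for the toy).

```python
# Toy sketch of impasse-driven substates. When operators tie, recurse
# into a substate that re-evaluates them by look-ahead simulation.
# Illustrative only; not Soar's mechanism or API.

def apply_op(state, op):
    return state + op                    # toy transition: add operator value

def decide(state, operators, evaluate, depth=0):
    prefs = {op: evaluate(state, op) for op in operators}
    best = max(prefs.values())
    tied = [op for op, v in prefs.items() if v == best]
    if len(tied) == 1 or depth > 0:      # resolved (or recursion bounded)
        return tied[0]
    # Tie impasse: the substate runs the same decision procedure,
    # but evaluates each operator by simulating its successor state.
    def lookahead(s, op):
        return evaluate(apply_op(s, op), op)
    return decide(state, tied, lookahead, depth + 1)

# At state 0 both operators score 0 and tie; the substate's look-ahead
# discovers that operator 2 reaches the better successor.
val = decide(0, [1, 2], lambda s, op: 0 if s == 0 else s)
```

The point of the sketch is that planning is not a separate module: it falls out of re-invoking the one decision procedure under an impasse.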
The pattern
Every memory has a functional forward pass. Procedural memory is the only store with automatic learning. The other stores (semantic, episodic, perceptual) have no learning mechanism that writes back to them.
Laird's own assessment: "What I feel is most missing from Soar is its ability to 'bootstrap' itself up from the architecture and a set of innate knowledge into being a fully capable agent across a breadth of tasks."
Demonstrated systems
- Rosie: learns 60+ tasks from real-time natural language instruction
- TacAir-Soar: 8,000 rules, military simulation, large-scale distributed environments
- 20+ real-world robots with real-time decision-making and spatial reasoning
- Agents that have run uninterrupted for 30 days
- Decision cycle of ~50ms with millions of knowledge elements
What Soar gives back
The RETE network is the cache that LLMs don't have. It tracks every partial match incrementally: cost proportional to what changed, not to total knowledge. Transformers recompute attention over the entire context every pass. KV-caching helps but doesn't approach RETE's efficiency. When agent frameworks bolt on memory systems like MemGPT, they're rebuilding what RETE already does natively.
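The core RETE idea, cost proportional to change, can be shown with a minimal index from attributes to the rules that test them. This is a drastic simplification (no variable binding, no beta joins); the `TinyRete` class and its methods are invented for illustration.

```python
# Sketch of RETE's key property: adding one working-memory element
# touches only the rules whose conditions mention it, not the whole
# rule base. Heavily simplified; illustrative only.
from collections import defaultdict

class TinyRete:
    def __init__(self):
        self.by_attr = defaultdict(list)   # attribute -> rules testing it
        self.partial = defaultdict(set)    # rule -> attributes matched so far
        self.needs = {}                    # rule -> attributes it requires
        self.fired = []

    def add_rule(self, name, attrs):
        self.needs[name] = set(attrs)
        for a in attrs:
            self.by_attr[a].append(name)   # index rule under each condition

    def add_wme(self, attr):
        # Only rules indexed under this attribute are updated:
        # work scales with the change, not with total rules.
        for rule in self.by_attr.get(attr, []):
            self.partial[rule].add(attr)
            if self.partial[rule] == self.needs[rule]:
                self.fired.append(rule)    # full match: rule fires

net = TinyRete()
net.add_rule("grab", ["sees-block", "hand-empty"])
net.add_rule("halt", ["battery-low"])
net.add_wme("sees-block")    # touches only "grab"
net.add_wme("hand-empty")    # completes "grab", which fires
```

Adding a thousand more rules about batteries would not slow down the two `add_wme` calls above, because neither element is indexed under those rules' conditions.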
When a rule's conditions no longer hold, truth maintenance automatically retracts the structures it created. No garbage, no stale data. LLM agents accumulate context monotonically. MemAgent had to learn a separate policy for what to forget. Soar does it architecturally. Minsky's censors and suppressors are the informal version: negative expertise that removes what shouldn't be there.
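Justification-based retraction can be sketched by having each derived element remember its supporters and retracting, transitively, whenever support breaks. The `TMS` class and method names are invented; real Soar distinguishes i-supported from o-supported elements, which this toy ignores.

```python
# Sketch of truth maintenance: derived elements record the WMEs that
# justified them and are retracted (cascading) when any supporter
# disappears. Illustrative only.

class TMS:
    def __init__(self):
        self.wm = set()
        self.support = {}              # derived element -> its supporters

    def add(self, elem):
        self.wm.add(elem)              # base element (e.g. from perception)

    def derive(self, elem, supporters):
        self.support[elem] = frozenset(supporters)
        self.wm.add(elem)

    def remove(self, elem):
        self.wm.discard(elem)
        changed = True
        while changed:                 # cascade through justification chains
            changed = False
            for d, sup in self.support.items():
                if d in self.wm and not sup <= self.wm:
                    self.wm.discard(d)
                    changed = True

tms = TMS()
tms.add("sees-door")
tms.derive("door-reachable", {"sees-door"})
tms.derive("plan-enter", {"door-reachable"})
tms.remove("sees-door")    # both derived elements retract automatically
```

Nothing had to decide to forget `plan-enter`; its justification vanished, so it did too.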
How does Soar handle Attend? Staged preferences: process rejections first. If rejection resolves the decision, skip ranking entirely. Softmax attention always computes a full distribution over all positions. Soar asks: can I eliminate most candidates before ranking? Same insight as Minsky's cross-exclusion, but with a principled two-phase protocol.
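The two-phase protocol is small enough to state directly. A minimal sketch, with an invented `select` function: rejections filter first, and the (potentially expensive) ranking step runs only if more than one candidate survives.

```python
# Sketch of staged preference processing: rejections first,
# ranking only if needed. Names are illustrative.

def select(candidates, rejections, score):
    survivors = [c for c in candidates if c not in rejections]
    if len(survivors) <= 1:
        # Rejection alone resolved the decision: skip ranking entirely.
        return survivors[0] if survivors else None
    return max(survivors, key=score)     # rank only the survivors

# Rejections leave one candidate: no scoring happens.
op = select(["a", "b", "c"], rejections={"b", "c"}, score=len)

# Two survivors remain, so ranking breaks the tie.
op2 = select(["aa", "b", "ccc"], rejections={"b"}, score=len)
```

Contrast with softmax attention, which pays the full scoring cost over every candidate even when most would be rejected outright.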
Then there's chunking, the Consolidate that the LLM stack lacks. Soar deliberates in a substate, backtraces the dependency chain, compiles the result into a production rule. Next time, it fires directly. LLMs have no equivalent. Every inference recomputes from scratch. Reflexion is the closest attempt: store a critique in natural language, condition the next attempt on it. Reflexion stores text; chunking compiles executable rules. That gap between verbal and parametric learning is the central unsolved problem.
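The deliberation-to-reaction move can be sketched as compiling a solved case into a lookup keyed by the inputs the result depended on. This stand-in uses memoization over the whole state, where real chunking backtraces to find only the relevant conditions; the `solve` function and its toy domain are invented.

```python
# Sketch of chunking as rule compilation: deliberate once in a
# "substate", then install a production that fires directly when the
# same conditions recur. Illustrative; real chunking backtraces
# dependencies rather than keying on the full state.

compiled = {}                            # condition tuple -> cached action

def solve(state):
    key = tuple(sorted(state.items()))
    if key in compiled:
        return compiled[key], "rule"     # learned chunk fires reactively
    # "Substate": deliberate (stand-in for look-ahead search).
    action = "open" if state.get("door") == "closed" else "walk"
    compiled[key] = action               # chunk: deliberation -> reaction
    return action, "deliberation"

a1, how1 = solve({"door": "closed"})     # first encounter: deliberates
a2, how2 = solve({"door": "closed"})     # second encounter: chunk fires
```

The result is the same action both times, but the second call pays no deliberation cost, which is exactly the gap the paragraph above describes: Reflexion stores a textual note to re-read, while chunking stores something executable.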
Finally, delta episodic storage. Store only changes between snapshots. Generative Agents store full natural-language episodes, expensive and redundant. Soar's delta representation keeps costs proportional to novelty. The prescription adds a surprise gate: store only the episodes that deviate from predictions.
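Delta storage plus a crude surprise gate fits in a few lines. A minimal sketch with invented names (`delta`, `record`); the gate here is just a change-count threshold standing in for a real prediction-error signal.

```python
# Sketch of delta-based episodic storage: record only the changes
# between successive working-memory snapshots, gated by a minimum
# amount of novelty. Illustrative only.

def delta(prev, curr):
    added = {k: v for k, v in curr.items() if prev.get(k) != v}
    removed = [k for k in prev if k not in curr]
    return {"added": added, "removed": removed}

def record(episodes, prev, curr, min_changes=1):
    d = delta(prev, curr)
    # Surprise gate (toy version): skip episodes with too little change.
    if len(d["added"]) + len(d["removed"]) >= min_changes:
        episodes.append(d)
    return episodes

eps = []
s1 = {"x": 0, "door": "closed"}
s2 = {"x": 1, "door": "closed"}    # only x changed: stored as a small delta
s3 = {"x": 1, "door": "closed"}    # nothing changed: not stored
record(eps, s1, s2)
record(eps, s2, s3)
```

Storage cost tracks novelty: the identical third snapshot produces no episode at all, and the second produces one the size of its single change.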