2026-04-19 16:16:13 +02:00

# Festinger — Agent0 Inference Middleware
**Status:** In progress — iterative specification
**Owner:** jenstandstad
**Location:** `plugins/festinger/`
> Named after Leon Festinger (1919–1989), the social psychologist who introduced the theory of cognitive dissonance in 1957. Festinger observed that minds, human or artificial, cannot comfortably hold contradictory beliefs simultaneously, and that the tension this creates drives resolution. This system is built on the same principle.
---
## Purpose
Festinger is an Ollama-compatible HTTP proxy that sits between Agent0's agent-zero containers and the local Ollama inference endpoint. It solves two related problems that emerge with local inference:
1. **Reasoning loops** — agents repeat the same output and cannot break out, even when the framework tells them to try something else.
2. **Stale and incoherent memory** — Agent Zero's FAISS-based memory accumulates facts without contradiction detection, causing agents to act on outdated or conflicting beliefs.
Festinger addresses both at the inference layer, transparently, without modifying agent-zero internals. The memory layer it introduces is called **Recollections** — short, structured, non-contradicting facts injected spontaneously into every prompt as context enrichment. Agents do not search for recollections; they appear automatically.
Like its namesake's theory, Festinger treats contradiction not as an error to suppress but as a signal to act on.
---
## Architecture
```
agent-zero containers
           │
           ▼
┌─────────────────────────────────────────────┐
│               Festinger Proxy               │
│                                             │
│  ┌─────────────┐    ┌──────────────────┐    │
│  │ Loop        │    │ Saliency Engine  │    │
│  │ Detector    │    │ (tokenise+score) │    │
│  └─────────────┘    └────────┬─────────┘    │
│                              │              │
│                    ┌─────────▼─────────┐    │
│                    │ SOAS              │    │
│                    │ concept vocab     │    │
│                    │ + saliency store  │    │
│                    └─────────┬─────────┘    │
│                              │              │
│          ┌───────────────────┤              │
│          │                   │              │
│  ┌───────▼──────┐   ┌────────▼─────────┐    │
│  │ Recollection │   │ Memory Writer    │    │
│  │ Engine       │   │ (cloud LLM +     │    │
│  │ (read IN →   │   │ NL → IN parser)  │    │
│  │ inject       │   └────────┬─────────┘    │
│  │ <recollec-   │            │              │
│  │ tion> block) │   ┌────────▼─────────┐    │
│  └──────────────┘   │ Conflict         │    │
│          │          │ Resolver         │    │
│          │          └────────┬─────────┘    │
│          └─────────┬─────────┘              │
│                    │                        │
│          ┌─────────▼─────────┐              │
│          │ IN table          │              │
│          │ acyclic concept   │              │
│          │ graph             │              │
│          └───────────────────┘              │
└──────────┬─────────────────────────────┬────┘
           │                             │
           ▼                             ▼ (write path only)
     Ollama (host)                 Cloud LLM API
```
---
## Data Model
### models — LLM provider configuration
Festinger uses a cloud LLM for two purposes: the saliency-triggered write path and the nightly resolution job. Each purpose can use a different model.
| Column | Type | Notes |
|--------|------|-------|
| id | SERIAL PK | |
| provider | VARCHAR | `claude` or `openai` |
| model_name | VARCHAR | e.g. `claude-opus-4-6`, `gpt-4o` |
| api_key | VARCHAR | stored encrypted at rest |
| created_at | TIMESTAMPTZ | |
### config — runtime configuration
Key-value store for settings changeable without redeployment.
| Key | Default | Purpose |
|-----|---------|---------|
| `write_model_id` | — | FK into models; used for saliency-triggered write path |
| `resolve_model_id` | — | FK into models; used for nightly resolution job |
| `saliency_read_threshold` | `0.5` | Minimum saliency to trigger recollection lookup |
| `saliency_write_threshold` | `1.2` | Minimum saliency to trigger cloud LLM write |
| `recollection_confidence_floor` | `0.6` | Minimum URD edge confidence to include in recollection |
| `recollection_recency_days` | `90` | URD edges older than this are excluded |
| `resolution_schedule` | `0 2 * * *` | Cron expression for nightly resolution job |
### SOAS — concept vocabulary and saliency
One row per token. No minimum token length — all tokens are indexed; frequency and novelty determine whether they surface. Common English words are pre-seeded at saliency 0 via a dictionary corpus. Dimensions are themselves SOAS concepts — no separate table needed.
| Column | Type | Notes |
|--------|------|-------|
| id | INT PK | auto-increment |
| token | VARCHAR | unique, lowercase normalised |
| encounter_count | INT | raw count across all intercepted prompts |
| last_seen | TIMESTAMP | for recency tracking |
| saliency | FLOAT | log-scaled encounter saliency; 0 = common English |
| novelty | FLOAT | domain-specificity score; set to 1.0 when first confirmed by cloud LLM write path, 0 for pre-seeded dictionary words |
**Saliency vs novelty** are distinct scores serving different purposes:
- `saliency` measures how frequently a concept appears — it drives read-threshold triggering
- `novelty` measures how domain-specific a concept is — a system name that rarely appears but is clearly project-specific should still be treated as important
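
The spec calls saliency "log-scaled" but does not give the formula. As a minimal sketch, assuming `log10(1 + encounter_count)` and the default thresholds from the `config` table, the scoring could look like this. The function names and the `is_dictionary_word` flag are hypothetical.

```python
import math


def saliency_score(encounter_count: int, is_dictionary_word: bool = False) -> float:
    """Log-scaled saliency. The log10(1 + n) formula is an assumption, not from the spec."""
    if is_dictionary_word or encounter_count <= 0:
        # Pre-seeded common English words stay pinned at saliency 0.
        return 0.0
    return math.log10(1 + encounter_count)


def classify_token(score: float, read_threshold: float = 0.5,
                   write_threshold: float = 1.2) -> str:
    """Map a saliency score onto the read/write thresholds from the config table."""
    if score >= write_threshold:
        return "write"
    if score >= read_threshold:
        return "read"
    return "ignore"
```

Under this assumed formula a token reaches the read threshold on its third encounter and the write threshold on its fifteenth, which keeps the expensive cloud path well behind the cheap lookup path.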
### URD table — the acyclic concept graph
Named `urd` because `IN` is a reserved word in SQL and cannot be used as a table name without quoting. All three FK columns reference SOAS, so dimensions are concepts in the same vocabulary, and new dimensions emerge from the same token space without schema changes.
| Column | Type | Notes |
|--------|------|-------|
| id | INT FK | references SOAS — the concept being placed |
| parent_id | INT FK | references SOAS — the containing concept |
| dim_id | INT FK | references SOAS — which dimension this edge belongs to |
| is_isa | BOOLEAN | true = ISA (type/classification); false = ISPART (membership/containment). Outside the index — does not affect collision detection but drives conflict resolution semantics and recollection rendering |
| confidence | FLOAT | reliability of this edge, 0.0–1.0; set by cloud LLM at write time, updated by conflict resolver |
| last_confirmed | TIMESTAMPTZ | when was this edge last corroborated by an intercepted prompt; used for recency decay in recollection injection |
| source | VARCHAR | `cloud_llm` (saliency write path), `inferred` (cue pattern), `festinger` (resolution job), `gutask` (gutask iknowthat) |
**Index structure:**
- **PK**: `(id, parent_id, dim_id)` — full triple, prevents duplicate edges
- **Unique index**: `(id, dim_id)` — one parent per concept per dimension; this is the acyclicity and contradiction-resistance mechanism
- **Root nodes**: rows where `id = parent_id = dim_id` — the named root of a dimension tree; the one allowed self-reference
**Single relation:** IN — "this concept is semantically contained within that concept, within this dimension." ISA and ISPART are not separate relations or tables; they are the same IN relation, with `is_isa` annotating which flavour the edge represents.
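
To make the index behaviour concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in for Postgres. The table is reduced to the columns the indexes care about, the foreign keys to SOAS are omitted, and `make_schema` and `try_insert` are hypothetical helper names.

```python
import sqlite3


def make_schema(conn: sqlite3.Connection) -> None:
    """Create a reduced urd table with the PK and unique index described above."""
    conn.executescript(
        """
        CREATE TABLE urd (
            id        INTEGER NOT NULL,
            parent_id INTEGER NOT NULL,
            dim_id    INTEGER NOT NULL,
            is_isa    INTEGER NOT NULL,
            PRIMARY KEY (id, parent_id, dim_id),
            UNIQUE (id, dim_id)
        );
        """
    )


def try_insert(conn: sqlite3.Connection, concept: int, parent: int,
               dim: int, is_isa: bool) -> bool:
    """Return True if the edge was stored, False if an index rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urd (id, parent_id, dim_id, is_isa) VALUES (?, ?, ?, ?)",
                (concept, parent, dim, int(is_isa)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second parent for the same concept in the same dimension is rejected by the unique index, while the same concept can still take a parent in a different dimension.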
### In-Memory Cache Layer
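The proxy maintains three groups of in-memory structures populated at startup from Postgres: the SOAS lookups, the URD indexes, and the set of pending conflicts. All read operations hit these structures only, with zero network on the hot path. Writes keep both copies in step: saliency counters are updated in memory first and flushed to Postgres asynchronously, while SOAS and URD inserts go to Postgres synchronously and update the cache once the insert succeeds.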
```python
# Defer annotation evaluation; SoasRow and UrdEdge are row types mirroring the tables above
from __future__ import annotations

# SOAS: primary lookup by token string (mirrors UNIQUE index on token)
soas_by_token: dict[str, SoasRow] = {}
# SOAS: reverse lookup by id (for pre-joining URD results)
soas_by_id: dict[int, str] = {}
# URD: recollection reads, concept_id → list of edges (tokens pre-joined)
urd_by_concept: dict[int, list[UrdEdge]] = {}
# URD: collision detection, (concept_id, dim_id) → edge
# Mirrors the Postgres UNIQUE index on (id, dim_id) exactly
urd_by_concept_dim: dict[tuple[int, int], UrdEdge] = {}
# Resolution queue: concepts with pending conflicts (for ? marker in recollections)
pending_conflicts: set[int] = set()
```
**New SOAS token flow** — ids always originate from Postgres:
```
token not in soas_by_token
→ INSERT into Postgres SOAS → Postgres returns auto-increment id
→ add to soas_by_token[token] and soas_by_id[id]
→ proceed with that id
```
**URD insert flow** — collision detected in-memory, Postgres is the safety net:
```
key = (concept_id, dim_id)
if key in urd_by_concept_dim:
    → collision detected in-memory
    → classify type (is_isa flags), route to resolution queue
    → return (no Postgres write attempted)
else:
    → INSERT into Postgres URD
    → on success: update urd_by_concept[concept_id] and urd_by_concept_dim[key]
    → on UniqueViolation (race condition): reload row, route to resolution queue
```
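
The insert flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the proxy's actual code: `UrdCache`, the reduced `UrdEdge` fields, and the `persist` callable (standing in for the Postgres INSERT) are hypothetical, and the `UniqueViolation` race path is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UrdEdge:
    id: int          # concept being placed
    parent_id: int   # containing concept
    dim_id: int      # dimension of the edge
    is_isa: bool


class UrdCache:
    def __init__(self, persist: Callable[[UrdEdge], None]):
        self.by_concept: dict[int, list[UrdEdge]] = {}
        self.by_concept_dim: dict[tuple[int, int], UrdEdge] = {}
        self.queue: list[tuple[UrdEdge, UrdEdge]] = []  # (existing, incoming)
        self.pending_conflicts: set[int] = set()
        self._persist = persist

    def insert(self, edge: UrdEdge) -> bool:
        """Return True if the edge was stored, False if it was routed to the queue."""
        key = (edge.id, edge.dim_id)
        existing = self.by_concept_dim.get(key)
        if existing is not None:
            # Collision detected in memory: no database write is attempted.
            self.queue.append((existing, edge))
            self.pending_conflicts.add(edge.id)
            return False
        self._persist(edge)  # database first, cache only after success
        self.by_concept.setdefault(edge.id, []).append(edge)
        self.by_concept_dim[key] = edge
        return True
```

Because the cache takes the persistence step as a parameter, a test can pass `list.append` or a no-op and exercise collision detection without a database.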
**Saliency update flow** — batched to avoid per-token Postgres writes:
```
every token encounter → update soas_by_token[token].encounter_count in-memory
every 30 seconds → flush encounter count deltas to Postgres in one batch UPDATE
```
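
A sketch of the delta batching, assuming a single-threaded caller. `EncounterBuffer` is a hypothetical name, and the periodic timer plus the actual batch `UPDATE` are left to the caller.

```python
from collections import Counter


class EncounterBuffer:
    """Keeps live counts in memory and tracks unflushed deltas separately."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}
        self._deltas: Counter = Counter()

    def record(self, token: str) -> None:
        # Hot path: memory only, no database round trip.
        self.counts[token] = self.counts.get(token, 0) + 1
        self._deltas[token] += 1

    def drain(self) -> list[tuple[int, str]]:
        """Return (delta, token) pairs for one batch UPDATE and reset the deltas."""
        batch = [(delta, token) for token, delta in self._deltas.items()]
        self._deltas.clear()
        return batch
```

The pairs returned by `drain()` are ordered so that they can be handed to a DB-API `executemany` call for a statement of the form `UPDATE soas SET encounter_count = encounter_count + ? WHERE token = ?`; the placeholder style depends on the driver.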
**Cache reload after nightly job** — nightly job POSTs to `/reload` endpoint on proxy:
```
proxy receives /reload
→ re-SELECT all URD rows from Postgres with pre-joined tokens
→ rebuild urd_by_concept and urd_by_concept_dim
→ rebuild pending_conflicts from resolution queue
→ SOAS dict unchanged (nightly job does not modify SOAS)
```
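
The rebuild step can be sketched as a pure function over the re-selected rows. Rows are shown as plain dicts with the URD column names; `rebuild_urd_indexes` is a hypothetical name.

```python
def rebuild_urd_indexes(rows: list[dict]) -> tuple[dict, dict]:
    """Build fresh urd_by_concept and urd_by_concept_dim indexes from URD rows."""
    by_concept: dict[int, list[dict]] = {}
    by_concept_dim: dict[tuple[int, int], dict] = {}
    for row in rows:
        by_concept.setdefault(row["id"], []).append(row)
        by_concept_dim[(row["id"], row["dim_id"])] = row
    return by_concept, by_concept_dim
```

Building new dicts and then swapping the references, rather than mutating the live ones, means a request served during the reload never sees a half-built index.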
This separation means tests can inject mock data directly into the dicts without touching Postgres, enabling full unit testing of collision detection, recollection rendering, and queue routing.
---
### Canonical Recollection Query
The recollection engine executes this query for each salient concept found in an intercepted prompt. No chain traversal — depth is always 1. The result is a flat enumeration of all edges where the concept is the subject, across all dimensions.
```sql
SELECT
    u.id,
    u.parent_id,
    u.dim_id,
    u.is_isa,
    p.token AS parent_token,
    d.token AS dim_token
FROM urd u
INNER JOIN soas p ON p.id = u.parent_id
INNER JOIN soas d ON d.id = u.dim_id
WHERE u.id = $1
  AND u.confidence >= $2
  AND u.last_confirmed >= $3
ORDER BY u.id, u.dim_id DESC;
```
`$1` = SOAS id of the concept, `$2` = confidence floor (`recollection_confidence_floor` in config), `$3` = recency cutoff timestamp (current time minus `recollection_recency_days` from config).
**Injection position:** the recollection block is prepended to the content of the existing system message if one is present. If no system message exists in the messages array, a new `{"role": "system"}` message containing only the recollection block is inserted at position 0. The system message is the highest-attention position in most instruction-tuned models — it is where grounding facts anchor most reliably.
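
A minimal sketch of that injection rule, assuming the `/api/chat` request shape where `messages` is a list of objects with `role` and `content` keys. The function name and the blank-line separator between the block and the existing system text are assumptions.

```python
import copy


def inject_recollection(messages: list[dict], block: str) -> list[dict]:
    """Return a copy of messages with the recollection block in the system message."""
    out = copy.deepcopy(messages)  # never mutate the caller's request body
    for msg in out:
        if msg.get("role") == "system":
            # Prepend to the first existing system message.
            msg["content"] = f"{block}\n\n{msg.get('content', '')}"
            return out
    # No system message present: insert a new one at position 0.
    out.insert(0, {"role": "system", "content": block})
    return out
```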
**Rendering:** iterate rows; for each row emit `[dim_token] parent_token`. If the concept also has a pending entry in the resolution queue, append `?` to that dimension token. Group by concept when multiple concepts are queried in a single block.
**Zero-hit rendering:** if a concept is above the read threshold but has no URD entries, it is a salient domain-specific term the world model has not yet encountered. Instead of silently omitting it, the recollection block emits an explicit prompt to the agent:
```
? gnommoweb: no recollection. If this is a typo, ignore.
If you know what it is, store it before proceeding:
gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'
```
This turns every unknown salient concept into an active instruction. The agent either confirms it is a typo, asks for clarification from a human or peer agent, or fills the gap itself via `gutask iknowthat`. The world model grows organically through use.
**Full example recollection block:**
```
<recollection>
gnommoweb: [glitch_university] repo [geography] ramanujan [type?] service
dobby: [agent_pool] worker [tech] python
? ramanujan: no recollection. If you know what it is, store it before proceeding:
gutask iknowthat 'ramanujan -isa <parent> in context of <dimension>'
gutask iknowthat 'ramanujan -ispart <system> in context of <dimension>'
</recollection>
```
`[type?]` signals a pending conflict on that dimension — the world model is not wrong, resolution is in progress.
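
The rendering rules can be sketched as follows. The input shapes are assumptions chosen for illustration: `edges` maps each concept token to its `(dim_token, parent_token)` rows, an empty list stands for a zero-hit concept, and `pending` holds `(concept, dim_token)` pairs with an open resolution queue entry.

```python
def render_recollection(edges: dict[str, list[tuple[str, str]]],
                        pending: frozenset = frozenset()) -> str:
    """Render a <recollection> block in the format of the example above."""
    lines = ["<recollection>"]
    for concept, rows in edges.items():
        if not rows:
            # Zero-hit path: turn the unknown concept into an instruction.
            lines.append(f"? {concept}: no recollection. "
                         "If you know what it is, store it before proceeding:")
            lines.append(f"gutask iknowthat '{concept} -isa <parent> "
                         "in context of <dimension>'")
            lines.append(f"gutask iknowthat '{concept} -ispart <system> "
                         "in context of <dimension>'")
            continue
        parts = []
        for dim, parent in rows:
            marker = "?" if (concept, dim) in pending else ""
            parts.append(f"[{dim}{marker}] {parent}")
        lines.append(f"{concept}: " + " ".join(parts))
    lines.append("</recollection>")
    return "\n".join(lines)
```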
### gutask iknowthat — Manual Write Path
`gutask iknowthat` is the highest-confidence write path into URD. It bypasses the saliency threshold and the cloud LLM entirely.
**Location:** `/Users/jenstandstad/Projects/gutasktool` (sibling of Agent0 repo). The command POSTs to Festinger's `/iknowthat` HTTP endpoint — gutasktool does not connect to Postgres directly.
**Syntax:**
```sh
gutask iknowthat 'gnommoweb -isa repo in context of glitch_university'
gutask iknowthat 'gnommoweb -ispart glitch_university in context of membership'
```
- `-isa` sets `is_isa=true`; `-ispart` sets `is_isa=false`
- `in context of <dimension>` specifies the dimension token; defaults to `type` for `-isa`, `membership` for `-ispart`
- All tokens run through the standard tokeniser (compound token rule applies)
- Inserts into SOAS if any token is new
- Inserts into URD with `confidence=1.0`, `source=gutask`
- On collision: enters resolution queue with priority flag; `gutask`-source conflicts are reviewed first at `/conflicts`
**Festinger `/iknowthat` endpoint:** accepts `POST` with JSON body `{fact: string}`. Parses, tokenises, writes to SOAS/URD, returns the inserted or conflicted result. This decouples gutasktool from the Festinger schema.
This command is the agent's direct interface to the world model. When the recollection block surfaces a zero-hit concept, `gutask iknowthat` is the prescribed response.
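
A sketch of how the fact string might be parsed, assuming single-token operands (the real endpoint runs operands through the tokeniser, so multi-word names would be merged first). `Fact` and `parse_fact` are hypothetical names.

```python
import re
from typing import NamedTuple


class Fact(NamedTuple):
    concept: str
    parent: str
    dimension: str
    is_isa: bool


_FACT_RE = re.compile(
    r"^\s*(?P<concept>\S+)\s+-(?P<rel>isa|ispart)\s+(?P<parent>\S+)"
    r"(?:\s+in\s+context\s+of\s+(?P<dim>\S+))?\s*$",
    re.IGNORECASE,
)


def parse_fact(fact: str) -> Fact:
    """Parse '<concept> -isa|-ispart <parent> [in context of <dimension>]'."""
    m = _FACT_RE.match(fact)
    if m is None:
        raise ValueError(f"unrecognised fact: {fact!r}")
    is_isa = m["rel"].lower() == "isa"
    # Default dimension: `type` for -isa, `membership` for -ispart.
    dim = m["dim"] or ("type" if is_isa else "membership")
    return Fact(m["concept"].lower(), m["parent"].lower(), dim.lower(), is_isa)
```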
---
## The IN Relation and ISA/ISPART
### Why a single operator
ISA and ISPART are both instances of a more general semantic containment relation. The dimension carries the semantic weight:
- `gnommoweb IN repo` (dimension: `type`) → gnommoweb ISA repo
- `gnommoweb IN Glitch-University` (dimension: `membership`) → gnommoweb ISPART Glitch-University
- `State IN Country` (dimension: `type`) → State ISA a country-level granularity
The `type` dimension IS the ISA relation. Every other dimension IS an ISPART relation scoped to that domain. One table, one index, one operator.
### Why this resolves the bleed
The classic bleed case: Michigan ISA State, Michigan ISPART USA, State ISPART Country.
With the single IN operator and dimensions:
```
Michigan IN State (dimension: type) is_isa: true
Michigan IN USA (dimension: geography) is_isa: false
State IN Country (dimension: type) is_isa: true
USA IN Country (dimension: type) is_isa: true
```
Two coherent chains, no collision, no ambiguity. "State ISPART Country" (class-level generalisation) becomes "State IN Country in the type dimension" — a perfectly valid ISA statement: State is a kind of country-level subdivision.
---
## Collision Semantics — The `is_isa` Flag
The unique index on `(id, dim_id)` fires when a concept already has a parent in a given dimension. The `is_isa` flag on both the existing and incoming rows determines what the collision means:
| Existing | Incoming | Interpretation | Action |
|----------|----------|---------------|--------|
| ISA | ISA | Dimension too coarse — both facts simultaneously true about the concept's nature | Trigger **dimension decomposition** |
| ISPART | ISPART | Factual contradiction — thing can only be in one place per dimension | Trigger **arbitration** (which is correct?) |
| ISA | ISPART | Dimension misclassification — these should have been in different dimensions | Flag as misclassification, suggest correct dimension |
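
The table maps directly onto a small classifier. In this sketch the enum values match the `collision_type` values of the resolution queue; treating ISPART followed by ISA the same as ISA followed by ISPART is an assumption, since the table lists only one order.

```python
from enum import Enum


class CollisionType(str, Enum):
    ISA_ISA = "isa_isa"
    ISPART_ISPART = "ispart_ispart"
    MISCLASSIFICATION = "misclassification"


# Action per collision type, as named in the table above.
ACTIONS = {
    CollisionType.ISA_ISA: "dimension decomposition",
    CollisionType.ISPART_ISPART: "arbitration",
    CollisionType.MISCLASSIFICATION: "suggest correct dimension",
}


def classify_collision(existing_is_isa: bool, incoming_is_isa: bool) -> CollisionType:
    """Classify a (concept, dimension) collision from the two is_isa flags."""
    if existing_is_isa and incoming_is_isa:
        return CollisionType.ISA_ISA
    if not existing_is_isa and not incoming_is_isa:
        return CollisionType.ISPART_ISPART
    # Mixed flags in either order (assumption for the reverse order).
    return CollisionType.MISCLASSIFICATION
```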
---
## Dimension Decomposition — Dimensions as Evolving Vocabulary
When an ISA+ISA collision occurs, the dimension is too coarse to hold both simultaneously-true facts. Example:
- Existing: `gnommoweb IN container` (dimension: `type`, is_isa: true)
- Incoming: `gnommoweb IN repo` (dimension: `type`, is_isa: true)
Both are true. The `type` dimension cannot hold them both. The conflict resolver sends this prompt to the cloud LLM:
> "gnommoweb is already `container` in dimension `type`. New fact also places gnommoweb as `repo` in `type`. If both are simultaneously true, propose two more specific dimension names to replace `type` — one where gnommoweb as `container` remains valid, one where gnommoweb as `repo` is valid. Return JSON: `{"existing_dimension": "...", "new_dimension": "..."}`. Choose from this taxonomy where possible: [...]. Create new names only if nothing fits."
The LLM might return: `{"existing_dimension": "deployment-type", "new_dimension": "artifact-type"}`.
The system then:
1. Inserts `deployment-type` and `artifact-type` into SOAS (if not present)
2. Creates root nodes for each new dimension
3. Inserts the new fact under `artifact-type`
4. Leaves the existing `type` facts untouched — no migration
Dimensions are SOAS concepts. New dimensions emerge from the same token vocabulary. The graph grows its own taxonomy under pressure from real contradictions, starting coarse and decomposing on demand.
---
## The Two Memory Operations
### Tokenisation Rules
All text — prompts, system messages, agent outputs — passes through the same tokeniser before any saliency or relationship work is done.
**Token extraction:**
1. Split on whitespace and punctuation boundaries
2. **Compound token rule**: scan for runs of consecutive tokens where each begins with a capital letter. Merge the run into a single token, joined with underscores, then lowercase. This canonicalises proper nouns and multi-word concepts into single SOAS entries.
   - `Glitch University` → `glitch_university`
   - `Agent Zero` → `agent_zero`
   - `New York City` → `new_york_city`
   - A lowercase or short token breaks the run: in `the Glitch University`, `the` breaks the run and `Glitch University` → `glitch_university`
3. Lowercase all tokens
4. Keep tokens with **5 or more characters** (strictly >4); discard shorter tokens unless they are part of a matched relationship cue pattern (see below)
5. Strip leading/trailing punctuation from each token
**Example:**
`"gnommoweb is a repo of Glitch University"`
→ tokens: `gnommoweb`, `repo`, `glitch_university`
→ relationship extracted: `gnommoweb IN repo IN dim:glitch_university` (is_isa=true, via "is a … of" pattern)
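
The rules above can be sketched as a single pass over the text. This is an illustration under assumptions: `tokenise`, `min_len`, and `keep` are hypothetical names, `keep` stands in for tokens retained because they belong to a matched cue pattern, and the "short token breaks the run" case is not modelled because no length is given for it.

```python
import re


def tokenise(text: str, min_len: int = 5,
             keep: frozenset = frozenset()) -> list[str]:
    """Split, merge capitalised runs with underscores, lowercase, filter by length."""
    tokens: list[str] = []
    run: list[str] = []
    prev_end = None

    def flush() -> None:
        if run:
            tokens.append("_".join(run).lower())
            run.clear()

    for m in re.finditer(r"\w+", text):
        word = m.group()
        gap = text[prev_end:m.start()] if prev_end is not None else ""
        if word[0].isupper():
            # Punctuation between two capitalised words ends the current run.
            if run and not gap.isspace():
                flush()
            run.append(word)
        else:
            flush()  # a lowercase token breaks the run
            tokens.append(word.lower())
        prev_end = m.end()
    flush()
    return [t for t in tokens if len(t) >= min_len or t in keep]
```

With `keep=frozenset({"repo"})` the example sentence yields `gnommoweb`, `repo`, `glitch_university`, matching the token list shown above.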
---
### Relationship Cue Patterns
Certain keyword patterns in intercepted text are direct cues to the memory layer that a semantic relationship is being expressed. Festinger scans every prompt for these patterns. When matched, the relationship is extracted and queued for insertion into the IN table, bypassing the saliency threshold, since the relationship has been made explicit.
Agents and humans interacting with Agent0 should be aware that using these patterns causes Festinger to build or update the world model.
**ISA patterns** (`is_isa = true`) — the subject is a type or instance of the object:
| Pattern | Example |
|---------|---------|
| `{X} is a {Y}` | gnommoweb is a repo |
| `{X} is an {Y}` | gnommoweb is an API |
| `{X} ISA {Y}` | gnommoweb ISA repo |
| `{X} is a kind of {Y}` | State is a kind of region |
| `{X} is a type of {Y}` | gnommoweb is a type of service |
| `{X} is an instance of {Y}` | dobby is an instance of agent |
| `{X} kind of {Y}` | gnommoweb kind of repo |
| `{X} type of {Y}` | gnommoweb type of service |
| `{X} instance of {Y}` | dobby instance of agent |
**ISPART patterns** (`is_isa = false`) — the subject is a member, part, or component of the object:
| Pattern | Example |
|---------|---------|
| `{X} is part of {Y}` | gnommoweb is part of Glitch University |
| `{X} ISPART {Y}` | gnommoweb ISPART glitch_university |
| `{X} part of {Y}` | gnommoweb part of Agent0 |
| `{X} belongs to {Y}` | gnommoweb belongs to Glitch University |
| `{X} is owned by {Y}` | gnommoweb is owned by jenstandstad |
| `{X} owned by {Y}` | gnommoweb owned by jenstandstad |
| `{X} member of {Y}` | dobby member of agent_pool |
| `{X} is a member of {Y}` | dobby is a member of agent_pool |
| `{X} runs on {Y}` | gnommoweb runs on Docker |
| `{X} hosted by {Y}` | gnommoweb hosted by ramanujan |
| `{X} deployed on {Y}` | gnommoweb deployed on Docker |
| `{X} contained in {Y}` | gnommoweb contained in agent0_stack |
**The `of {Z}` dimension modifier:**
When an ISA pattern is followed by `of {Z}`, the named entity `{Z}` becomes the dimension for the extracted edge. This allows natural language to directly specify the context in which a classification holds:
```
"gnommoweb is a repo of Glitch University"
→ gnommoweb IN repo IN dim:glitch_university (is_isa=true)
"Michigan is a state of USA"
→ michigan IN state IN dim:usa (is_isa=true)
```
Without the `of {Z}` modifier, the dimension defaults to `type` for ISA patterns and the most appropriate seed dimension for ISPART patterns (inferred by the cloud LLM during the write step, or defaulting to `membership`).
**Cue-triggered writes bypass the saliency threshold.** Explicit relationship cues are treated as high-confidence signals regardless of how many times a concept has previously appeared. The extracted triple goes directly into the write queue with `source: inferred` and a confidence score assigned by the pattern type (exact keyword cues score higher than positional inference).
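
A regex-based sketch of the cue scanner, covering the patterns from the two tables and the `of {Z}` modifier. It is a simplification under stated assumptions: operands are either a single word or a run of capitalised words, ISPART edges always default to `membership` (the cloud LLM inference is not modelled), and confidence scoring is omitted. `Triple` and `scan_cues` are hypothetical names.

```python
import re
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


ISA_CUES = ["is a kind of", "is a type of", "is an instance of",
            "kind of", "type of", "instance of", "is an", "is a", "ISA"]
ISPART_CUES = ["is a member of", "is part of", "is owned by", "member of",
               "part of", "belongs to", "owned by", "runs on", "hosted by",
               "deployed on", "contained in", "ISPART"]

# An operand is a run of capitalised words or a single word.
_TERM = r"(?:[A-Z]\w*(?:\s+[A-Z]\w*)*|\w+)"
# Longest cue first, so "is a member of" wins over "is a".
_CUE_ALT = "|".join(re.escape(c) for c in
                    sorted(ISA_CUES + ISPART_CUES, key=len, reverse=True))
_CUE_RE = re.compile(
    rf"\b(?P<x>{_TERM})\s+(?P<cue>{_CUE_ALT})\s+(?P<y>{_TERM})"
    rf"(?:\s+of\s+(?P<z>{_TERM}))?"
)


def _canon(term: str) -> str:
    """Apply the compound token rule to one operand."""
    return "_".join(term.split()).lower()


def scan_cues(text: str) -> list[Triple]:
    triples = []
    for m in _CUE_RE.finditer(text):
        is_isa = m["cue"] in ISA_CUES
        if is_isa:
            dim = _canon(m["z"]) if m["z"] else "type"
        else:
            dim = "membership"  # default; the spec allows LLM inference here
        triples.append(Triple(_canon(m["x"]), _canon(m["y"]), dim, is_isa))
    return triples
```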
---
### Writing — cloud-triggered, async
When a concept's saliency crosses the **write threshold**, Festinger queues it for background processing (cloud LLM calls must not block the prompt response path):
1. Call cloud LLM: "What is `{concept}`?" with the current dimension taxonomy as a closed list and a structured output prompt requesting `(concept, parent, dimension, is_isa)` triples
2. Parse response into IN table INSERT statements
3. Attempt inserts; route constraint violations to the conflict resolver
4. Update `novelty` score in SOAS for the concept
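
Step 2 depends on the shape of the structured output, which the spec does not fix. As a sketch, assuming the LLM returns a JSON array of objects with `concept`, `parent`, `dimension`, and `is_isa` keys, the response could be validated against the closed dimension list before any insert is attempted. `parse_triples` is a hypothetical name.

```python
import json

REQUIRED_KEYS = ("concept", "parent", "dimension", "is_isa")


def parse_triples(raw: str, taxonomy: set[str]) -> list[dict]:
    """Keep only well-formed triples whose dimension is in the closed taxonomy."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    accepted = []
    for item in data:
        if not isinstance(item, dict) or any(k not in item for k in REQUIRED_KEYS):
            continue
        if not isinstance(item["is_isa"], bool):
            continue  # reject "true"/"false" strings rather than guess
        dimension = str(item["dimension"]).lower()
        if dimension not in taxonomy:
            continue
        accepted.append({
            "concept": str(item["concept"]).lower(),
            "parent": str(item["parent"]).lower(),
            "dimension": dimension,
            "is_isa": item["is_isa"],
        })
    return accepted
```

Dropping malformed items silently keeps the background worker from failing on a single bad answer; whether rejected items should be logged or retried is left open here.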
### Reading — spontaneous prompt enrichment
On every intercepted prompt:
1. Tokenise the full prompt string, extract tokens >4 chars, normalise
2. Look up each token in SOAS, update encounter counts and saliency
3. For tokens above the **read threshold**, query the IN table for all edges involving the concept
4. Traverse each chain upward (configurable max depth)
5. Format as a `<recollection>` block and prepend to the prompt before forwarding to Ollama
Example output:
```
<recollection>
gnommoweb: [type] repo → software-artifact
[membership] Glitch-University → Agent0-infrastructure
[tech] FastAPI → Python
glitch.university: [type] platform → web-service
[membership] Agent0 → Glitch-Hunter-project
</recollection>
```
Only edges above the confidence threshold and within the recency window are included. The agent does not search for these — they appear spontaneously.
**Read and write thresholds are separate and independently tunable.** Reading (DB lookup) is cheap; writing (cloud LLM call) is expensive. The write threshold should be meaningfully higher than the read threshold.
---
## Conflict Resolution — The Nightly Therapy Model
The IN table is strict: it cannot hold contradictions. Any collision is immediately routed to the **resolution queue**, a separate table, where it waits for the nightly resolution job to process it. During this period the world model stands unchanged; the old fact continues to be served in recollections, marked with a `?` to signal pending dissonance.
### On collision (immediate, synchronous)
1. Classify the collision type by reading `is_isa` on both rows: ISA+ISA, ISPART+ISPART, or misclassification
2. Insert the rejected fact into the **resolution queue** with full context: existing edge, incoming edge, dimension, collision type, timestamp
3. Return normally — the proxy response is not blocked
### During the day (recollection engine)
Concepts with entries in the resolution queue are rendered with a `?` marker:
```
<recollection>
gnommoweb: [type] container
[type?] repo — pending resolution
</recollection>
```
The agent sees that a fact is contested. The world model is not wrong — it is incomplete. The `?` marker disappears after the nightly job resolves or dismisses the conflict.
### Nightly resolution job
Runs as a background thread inside the Festinger proxy process on the schedule set in the `config` table (`resolution_schedule`). Can also be triggered manually via `POST /resolve/run` — a corresponding button is exposed in the Festinger admin UI. Uses the model configured in `resolve_model_id`.
For each item in the queue, calls the configured LLM with both facts and the collision type, receives a structured decision, and applies it:
**ISA+ISA collision (dimension too coarse):**
- LLM outcome A — **decompose**: suggest two new dimension names. System creates new SOAS entries and root nodes, inserts both facts into their respective new dimensions, marks queue item resolved.
- LLM outcome B — **dismiss**: the incoming fact is noise or wrong. Queue item marked dismissed. World model stands.
**ISPART+ISPART collision (factual contradiction):**
- LLM outcome A — **update**: the incoming fact is more current. System removes old IN edge, inserts new one, marks queue item resolved.
- LLM outcome B — **dismiss**: the existing fact is still correct. Queue item marked dismissed.
**Misclassification (ISA+ISPART in same dimension):**
- LLM suggests the correct dimension for the incoming fact. System inserts it in the corrected dimension (no collision), marks queue item resolved.
### Resolution queue schema
| Column | Type | Notes |
|--------|------|-------|
| id | INT PK | auto-increment |
| concept_id | INT FK | references SOAS |
| existing_parent_id | INT FK | the parent currently in the IN table |
| incoming_parent_id | INT FK | the rejected parent |
| dimension_id | INT FK | the dimension where the collision occurred |
| collision_type | ENUM | `isa_isa`, `ispart_ispart`, `misclassification` |
| status | ENUM | `pending`, `resolved`, `dismissed` |
| resolution | TEXT | JSON record of what the nightly job decided and did |
| created_at | TIMESTAMP | when the collision occurred |
| resolved_at | TIMESTAMP | when the nightly job processed it |
### Properties of this model
- **IN table is always consistent** — agents never receive contradictory recollections from confirmed facts
- **Resolution is deliberate, not reactive** — the nightly job processes dissonance calmly, with full context, not under prompt-response time pressure
- **Dismissal is a first-class outcome** — not every collision is a real problem; the LLM can decide the world model is correct and the incoming fact was noise
- **The `?` marker is the fuzziness adjunct** — it surfaces uncertainty to agents without compromising the graph's integrity
- **`/conflicts` endpoint** — exposes the full queue (pending and recently resolved) for human inspection and override
---
## Graph Properties
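- **Acyclic**: the unique index on `(id, dim_id)` enforces a single parent per concept per dimension, so every concept has at most one upward chain in each dimension. The index alone does not exclude a loop such as `a → b → a`, so strict acyclicity also depends on the write path never inserting an edge whose parent already descends from the concept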
- **Shared**: one graph for all agents — recollections represent shared facts about the project, not per-agent beliefs. Agents already have per-agent memory in Agent Zero's own FAISS layer
- **Contradiction-resistant**: the index makes it structurally impossible to store conflicting facts in the same dimension. Contradictions surface as insert failures, not silent overwrites
- **Self-organising**: dimension taxonomy starts with a small seed list and decomposes on demand. No human needs to pre-define the full ontology
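
One parent per concept per dimension limits the shape of the graph, but a pair of edges such as `a → b` and `b → a` would still satisfy the unique index. A defensive guard on the write path is one way to keep each dimension a forest. The sketch below assumes a plain parent map keyed like `urd_by_concept_dim`; `would_create_cycle` is a hypothetical helper, not part of the spec.

```python
def would_create_cycle(parent_of: dict[tuple[int, int], int],
                       concept: int, parent: int, dim: int) -> bool:
    """True if adding concept → parent in dim would close a loop.

    parent_of maps (concept_id, dim_id) to parent_id.
    """
    if concept == parent:
        # Only a dimension root (id = parent_id = dim_id) may reference itself.
        return concept != dim
    seen = {concept}
    node = parent
    while node not in seen:
        seen.add(node)
        nxt = parent_of.get((node, dim))
        if nxt is None or nxt == node:
            return False  # reached the top of the chain or a root node
        node = nxt
    return True  # walked back into the concept or an existing loop
```

The walk visits each ancestor once, so its cost is bounded by the depth of the chain in that dimension.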
---
## Saliency Decay
Two decay mechanisms, operating independently:
- **SOAS recency** (`last_seen` timestamp): concepts not seen for a long time are deprioritised for recollection injection but not deleted. If gnommoweb reappears in a prompt, `last_seen` updates and its recollections resurface immediately
- **IN edge recency** (`last_confirmed` timestamp): edges not corroborated by recent prompts are given lower weight in recollection injection, making the recollection block favour currently-relevant facts
Decay does not delete knowledge — it adjusts injection priority. The graph remains intact.
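
The spec does not say how recency translates into weight. One simple option, shown here purely as an assumption, is an exponential half-life applied to `last_confirmed` and multiplied with the edge confidence. `half_life_days` is a hypothetical parameter that does not exist in the `config` table.

```python
from datetime import datetime


def recency_weight(last_confirmed: datetime, now: datetime,
                   half_life_days: float = 30.0) -> float:
    """Weight in (0, 1]; halves every half_life_days since last confirmation."""
    age_days = max((now - last_confirmed).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)


def injection_priority(confidence: float, last_confirmed: datetime,
                       now: datetime, half_life_days: float = 30.0) -> float:
    """Rank edges for injection; nothing is deleted, old edges just sort lower."""
    return confidence * recency_weight(last_confirmed, now, half_life_days)
```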
---
## Dimension Taxonomy — Seed List
A small, orthogonal set of initial dimensions. Each dimension answers a specific question about a concept. New dimensions emerge through decomposition; this list is the starting skeleton.
| Dimension | Question | is_isa |
|-----------|----------|--------|
| `type` | What kind of thing is this? | true |
| `membership` | What system or project does this belong to? | false |
| `runs-on` | What infrastructure hosts or executes this? | false |
| `tech` | What technology stack is it built with? | false |
| `owned-by` | Who is responsible for this? | false |
| `geography` | Where is this spatially or organisationally located? | false |
Root nodes for each dimension are seeded at bootstrap. The `type` dimension is expected to be the first to decompose as domain-specific concepts accumulate.
---
## Components
| Component | Description | State |
|-----------|-------------|-------|
| Proxy core | FastAPI Ollama-compatible HTTP proxy | ✅ built |
| Loop detector | Session-scoped repeat detection + mitigations | ✅ built |
| Config system | Hot-reloading YAML config | ✅ built |
| SOAS store | Concept vocabulary + saliency DB table | ✅ built |
| IN table store | Acyclic concept graph with correct indexes | ✅ built |
| Dictionary bootstrap | Pre-seed SOAS with common English at saliency 0 | ✅ built |
| Dimension bootstrap | Seed root nodes for initial dimension taxonomy | ✅ built |
| Saliency engine | Tokenise prompt, score tokens, update SOAS counts | ✅ built |
| Recollection engine | Query IN table, traverse chains, format + inject block | ✅ built |
| Memory writer | Write-threshold trigger → async cloud LLM → NL→IN parse → insert | ✅ built |
| Conflict resolver | On collision: classify type, insert into resolution queue immediately | ✅ built |
| Resolution queue | Pending/resolved/dismissed conflicts with full context | ✅ built |
| Nightly resolution job | Drain queue via cloud LLM; apply decompose/update/dismiss decisions | ✅ built |
| `/conflicts` endpoint | Expose queue (pending + recent) for human inspection and override | ✅ built |
| Persistence | Postgres from day one; English dictionary pre-loaded into SOAS at init | ✅ built |
---
## Task Breakdown
### Phase 1 — Foundation (complete)
- [x] **T01** Proxy core: FastAPI Ollama-compatible server forwarding `/api/chat` and `/api/generate`
- [x] **T02** Loop detector: session-scoped exact-match repetition detection
- [x] **T03** Mitigations: temperature boost, forbidden action injection, history truncation, circuit breaker
- [x] **T04** Hot-reload YAML config
- [x] **T05** Docker container + docker-compose service entry
### Phase 2 — Persistence Layer
- [x] **T06** Postgres service: add `postgres` container to docker-compose; connection config in `config.yaml`
- [x] **T07** `models` table: `id` SERIAL PK, `provider` VARCHAR, `model_name` VARCHAR, `api_key` VARCHAR, `created_at` TIMESTAMPTZ
- [x] **T08** `config` table: `key` VARCHAR PK, `value` TEXT, `updated_at` TIMESTAMPTZ; seed with default values for all config keys
- [x] **T09** SOAS table: `id` SERIAL PK, `token` VARCHAR UNIQUE, `encounter_count` INT default 0, `last_seen` TIMESTAMPTZ, `saliency` FLOAT default 0, `novelty` FLOAT default 0. All tokens lowercase. Unique index on `token`.
- [x] **T10** URD table: `id` INT FK → soas, `parent_id` INT FK → soas, `dim_id` INT FK → soas, `is_isa` BOOLEAN, `confidence` FLOAT, `last_confirmed` TIMESTAMPTZ, `source` VARCHAR. PK `(id, parent_id, dim_id)`. Unique index `(id, dim_id)`.
- [x] **T11** Resolution queue table: `id` SERIAL PK, `concept_id` INT FK → soas, `existing_parent_id` INT FK → soas, `incoming_parent_id` INT FK → soas, `dim_id` INT FK → soas, `collision_type` VARCHAR, `status` VARCHAR default 'pending', `resolution` JSONB, `created_at` TIMESTAMPTZ, `resolved_at` TIMESTAMPTZ
- [x] **T12** English dictionary bootstrap: bulk-load word list into SOAS with `saliency=0`, `novelty=0`, `encounter_count=0` on container init. Skip existing tokens.
- [x] **T13** Dimension bootstrap: insert SOAS entries and self-referential URD root nodes (`id = parent_id = dim_id`) for the 6 seed dimensions
### Phase 3 — Saliency Engine and Prompt Parsing
- [x] **T14** Tokeniser: split on whitespace/punctuation; apply compound token rule (consecutive capitalised tokens → single underscore-joined lowercase token); no minimum length; strip punctuation; lowercase all
- [x] **T15** Relationship cue scanner: regex/pattern scan for ISA and ISPART cue patterns; extract `(subject, parent, dimension_modifier, is_isa)` triples; handle `of {Z}` dimension modifier
- [x] **T16** SOAS lookup + update: increment `encounter_count`, update `last_seen`, recalculate `saliency` (log scale) for all extracted tokens; read thresholds from `config` table
- [x] **T17** Threshold evaluation: read `saliency_read_threshold` and `saliency_write_threshold` from config; cue-extracted triples bypass write-threshold entirely
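
The tokeniser rule in T14 and the log-scale saliency in T16 could look roughly like the sketch below. The regular expression, the function names, and the exact formula `log(encounter_count)` are assumptions; the only anchor in this document is that a first encounter yields `log(1)`.

```python
import math
import re

# A word is a run of ASCII letters/digits; any other non-whitespace run
# counts as punctuation and ends a capitalised run.
TOKEN_RE = re.compile(r"(?P<word>[A-Za-z0-9]+)|(?P<punct>[^A-Za-z0-9\s]+)")


def tokenise(text: str) -> list[str]:
    """Lowercase tokens; consecutive capitalised words become one token."""
    tokens: list[str] = []
    run: list[str] = []  # pending run of capitalised words

    def flush() -> None:
        if run:
            tokens.append("_".join(word.lower() for word in run))
            run.clear()

    for match in TOKEN_RE.finditer(text):
        if match.lastgroup == "punct":
            flush()
        elif match.group()[0].isupper():
            run.append(match.group())
        else:
            flush()
            tokens.append(match.group().lower())
    flush()
    return tokens


def saliency(encounter_count: int) -> float:
    """Log-scale saliency; 0.0 on the first encounter (assumed formula)."""
    return math.log(encounter_count) if encounter_count > 0 else 0.0


cue_tokens = tokenise("gnommoweb is a repo of Glitch University")
plain_tokens = tokenise("Please update gnommoweb to use FastAPI instead")
```

One limitation of this sketch: a capitalised sentence-initial word directly followed by a proper noun is merged into the same compound token, so a real implementation may want to treat sentence starts separately.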
### Phase 4 — Recollection Engine (Read Path)
- [x] **T16** URD query: for each above-read-threshold token, execute the canonical recollection query; filter by `confidence` floor and `last_confirmed` recency window
- [x] **T17** Recollection formatter — hit path: enumerate query rows; group by concept; render each edge as `[dim_token] parent_token`; append `?` for edges with a pending resolution queue entry
- [x] **T18** Recollection formatter — zero-hit path: for salient concepts with no URD rows, emit the `? concept: no recollection` prompt block including the `gutask iknowthat` usage hint
- [x] **T19** Prompt injection: prepend recollection block before forwarding to Ollama
- [x] **T20** Recollection config: max concepts per block, confidence floor, recency window, injection position
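
A rough sketch of the two formatter paths (hit and zero-hit) follows. The `Edge` tuple, the function name, and keying pending conflicts by `(concept, dim)` pairs are assumptions made for illustration; wrapping zero-hit lines in the same `<recollection>` tags is also an assumption.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    concept: str  # concept token, e.g. "gnommoweb"
    dim: str      # dimension token, e.g. "type"
    parent: str   # parent token, e.g. "repo"


PENDING_SUFFIX = " \u2014 conflict pending"  # wording as rendered in Test B


def format_recollections(salient, edges, pending):
    """Render one line per salient concept.

    salient: concept tokens above the read threshold, in prompt order
    edges:   Edge rows returned by the URD query
    pending: set of (concept, dim) pairs with a pending queue entry (assumed)
    """
    by_concept: dict[str, list[Edge]] = {}
    for edge in edges:
        by_concept.setdefault(edge.concept, []).append(edge)

    lines = []
    for concept in salient:
        rows = by_concept.get(concept, [])
        if not rows:
            # Zero-hit path: ask the agent to clarify or store the fact.
            lines.append(f"? {concept}: no recollection. If not a typo, store it before proceeding:")
            lines.append(f"gutask iknowthat '{concept} -isa <parent> in context of <dimension>'")
            lines.append(f"gutask iknowthat '{concept} -ispart <system> in context of <dimension>'")
            continue
        parts, has_conflict = [], False
        for edge in rows:
            flagged = (edge.concept, edge.dim) in pending
            has_conflict = has_conflict or flagged
            parts.append(f"[{edge.dim}{'?' if flagged else ''}] {edge.parent}")
        lines.append(f"{concept}: {' '.join(parts)}" + (PENDING_SUFFIX if has_conflict else ""))
    return "<recollection>\n" + "\n".join(lines) + "\n</recollection>"


pending_block = format_recollections(
    ["gnommoweb"], [Edge("gnommoweb", "type", "repo")], {("gnommoweb", "type")}
)
clean_block = format_recollections(
    ["gnommoweb"],
    [Edge("gnommoweb", "artifact-type", "repo"),
     Edge("gnommoweb", "deployment-type", "container")],
    set(),
)
zero_hit = format_recollections(["gnommoweb"], [], set())
```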
### Phase 5 — Memory Writer (Write Path)
- [x] **T21** `POST /iknowthat` endpoint: accept `{fact: string}`, parse `-isa`/`-ispart` flags and `in context of` clause, run tokeniser, upsert SOAS, insert into URD with `confidence=1.0, source=gutask`; route collisions to resolution queue with priority flag
- [x] **T22** `gutask iknowthat` command (in `/Users/jenstandstad/Projects/gutasktool`): parse fact string, POST to Festinger `/iknowthat`, surface result to agent
- [x] **T23** Write queue: async background queue for concepts crossing the write threshold; cue-extracted triples enter directly regardless of threshold
- [x] **T24** LLM client: support `claude` and `openai` providers; load provider/model/key from `models` table via `write_model_id` config key; structured prompt requesting `(concept, parent, dimension, is_isa, confidence)` triples as JSON
- [x] **T25** NL→IN parser: validate triples against SOAS and known dimensions; create new SOAS entries for unknown tokens; apply compound token rule
- [x] **T26** URD insert pipeline: check `urd_by_concept_dim` in-memory first; on miss attempt Postgres insert; on hit or UniqueViolation route to conflict resolver; set `source` field per write path
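
Parsing the fact string accepted by `POST /iknowthat` (T21) might be done with a single regular expression, as sketched here. The `Fact` tuple and `parse_fact` are hypothetical names, and the sketch only lowercases the parts rather than running the full tokeniser.

```python
import re
from typing import NamedTuple, Optional

# Shape: '<concept> -isa|-ispart <parent> [in context of <dimension>]'
FACT_RE = re.compile(
    r"^\s*(?P<concept>.+?)\s+-(?P<rel>isa|ispart)\s+(?P<parent>.+?)"
    r"(?:\s+in\s+context\s+of\s+(?P<dim>.+?))?\s*$",
    re.IGNORECASE,
)


class Fact(NamedTuple):
    concept: str
    parent: str
    dimension: Optional[str]  # None when no 'in context of' clause is given
    is_isa: bool


def parse_fact(fact: str) -> Fact:
    """Split a fact string into its parts; raise ValueError if malformed."""
    match = FACT_RE.match(fact)
    if match is None:
        raise ValueError(f"not a valid iknowthat fact: {fact!r}")
    dim = match["dim"]
    return Fact(
        concept=match["concept"].lower(),
        parent=match["parent"].lower(),
        dimension=dim.lower() if dim else None,
        is_isa=match["rel"].lower() == "isa",
    )


typed = parse_fact("gnommoweb -isa repo in context of type")
untyped = parse_fact("gnommoweb -ispart agent0")
```

How a missing `in context of` clause should be handled is not stated in this section, so the sketch simply returns `None` for the dimension.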
### Phase 6 — Conflict Resolution
- [x] **T25** Collision handler: on unique constraint violation, classify type via `is_isa` flags, insert immediately into resolution queue
- [x] **T26** Recollection engine update: check resolution queue for pending items per concept; render pending edges with `[dim?]` marker
- [x] **T27** Nightly resolution job: background thread, schedule from `config` table (`resolution_schedule`); for each `pending` queue item, call LLM configured in `resolve_model_id` with both facts and collision type, receive JSON decision (`decompose` / `update` / `dismiss`)
- [x] **T28** Resolution applicator — decompose: create new dimension SOAS entries and root nodes; insert both facts in respective new dimensions; mark queue item resolved
- [x] **T29** Resolution applicator — update: remove old IN edge, insert new fact, mark queue item resolved
- [x] **T30** Resolution applicator — dismiss: mark queue item dismissed; world model unchanged; the `[dim?]` marker disappears from recollections
- [x] **T31** `/conflicts` endpoint: list pending and recently resolved/dismissed items with full context; support human override (force-dismiss, force-resolve)
- [x] **T32** `POST /resolve/run` endpoint: manually trigger the nightly resolution job outside of its schedule
- [x] **T33** Admin UI: minimal HTML page served by Festinger at `/admin`; shows pending conflicts count, last resolution run timestamp, and a "Run resolution now" button wired to `POST /resolve/run`
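
The collision handling described in T25 and T26 can be illustrated against the in-memory dicts alone. In this sketch the function names, the queue being a plain list, and the labels `ispart_ispart` and `isa_ispart` are assumptions; only `isa_isa` appears in this document. Treating a restated identical fact as a no-op is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


def classify_collision(existing: UrdEdge, incoming: UrdEdge) -> str:
    """Derive the collision type from the two is_isa flags."""
    if existing.is_isa and incoming.is_isa:
        return "isa_isa"
    if not existing.is_isa and not incoming.is_isa:
        return "ispart_ispart"  # assumed label
    return "isa_ispart"  # assumed label


def insert_edge(incoming, urd_by_concept_dim, pending_conflicts, resolution_queue) -> bool:
    """Insert an edge, or queue a conflict if (concept, dim) is already taken."""
    key = (incoming.concept_id, incoming.dim_id)
    existing = urd_by_concept_dim.get(key)
    if existing is None:
        # Miss: the real pipeline would attempt the Postgres insert here.
        urd_by_concept_dim[key] = incoming
        return True
    if existing.parent_id == incoming.parent_id:
        return True  # same fact restated; treated as a no-op (assumption)
    resolution_queue.append({
        "concept_id": incoming.concept_id,
        "existing_parent_id": existing.parent_id,
        "incoming_parent_id": incoming.parent_id,
        "dim_id": incoming.dim_id,
        "collision_type": classify_collision(existing, incoming),
        "status": "pending",
    })
    pending_conflicts.add(incoming.concept_id)
    return False


repo_edge = UrdEdge(concept_id=101, parent_id=201, dim_id=1, is_isa=True)
cache = {(101, 1): repo_edge}
pending, queue = set(), []
inserted = insert_edge(UrdEdge(101, 202, 1, True), cache, pending, queue)
```

The existing edge is never overwritten on collision, which matches the rule that the world model stays unchanged until the resolution job decides.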
### Phase 7 — Hardening
- [ ] **T32** Latency guard: tokenisation + saliency lookup + recollection query must not add >50ms to prompt round-trip; use connection pooling
- [ ] **T33** Write path fully async: all cloud LLM calls, IN inserts, and queue operations run in background workers; proxy response never waits
- [ ] **T34** Integration test: plain prompt → compound token extraction → saliency update → recollection injection → Ollama round trip
- [ ] **T35** Integration test: cue pattern in prompt ("gnommoweb is a repo of Glitch University") → extracted triple bypasses threshold → IN insert → recollection on next prompt
- [ ] **T36** Integration test: write threshold → cloud LLM write → IN insert → recollection on next prompt
- [ ] **T37** Integration test: ISA+ISA collision → immediate queue insert → `[type?]` marker in recollection → nightly job → decompose → clean recollection across two dimensions
- [ ] **T38** Integration test: ISPART+ISPART collision → queue insert → nightly job → dismiss → world model unchanged, marker gone
---
## Resolved Design Decisions
| Question | Decision |
|----------|----------|
| Single relation or ISA+ISPART tables? | Single IN operator; `is_isa` boolean flag outside the index annotates edge type |
| ISA vs ISPART bleed | Resolved by dimension: `type` dimension = ISA; all other dimensions = ISPART |
| Shared vs per-agent graph | Shared — recollections are project-wide facts; per-agent memory remains in Agent Zero's FAISS layer |
| Saliency decay | Dual mechanism: SOAS `last_seen` for concept recency; IN `last_confirmed` for edge recency. Decay adjusts injection priority, never deletes knowledge |
| Recollection depth | Depth = 1, no chain traversal. Single flat query against URD for direct edges of each salient concept. Property-loop problem is dissolved by design — self-referential chains cannot arise without traversal. |
| Write path blocking | Fully async — cloud LLM calls queued in background, never block proxy response |
| Dimension taxonomy | Seed list of 6 orthogonal dimensions; decomposes on demand via nightly resolution job when ISA+ISA collision is queued |
| Collision semantics | ISA+ISA → dimension too coarse → decompose. ISPART+ISPART → factual contradiction → arbitrate. ISA+ISPART → misclassification → flag |
| Contradiction resistance | Structural — unique index on `(concept_id, dimension_id)` makes conflicting facts physically uninsertable in the same dimension |
| New dimensions | Emerge as SOAS tokens; no schema change needed; root nodes created at decomposition time |
| Conflict resolution timing | Immediate queue insert on collision; nightly job drains the queue via cloud LLM; outcomes are decompose / update / dismiss |
| Database | Postgres from day one — no SQLite POC. English dictionary bulk-loaded into SOAS at container init. |
| Tokenisation | Tokens lowercased; no minimum length (see Token length). Consecutive capitalised tokens merged into a single underscore-joined token (`Glitch University` → `glitch_university`). |
| Relationship cue parsing | ISA and ISPART keyword patterns in intercepted text trigger direct triple extraction, bypassing saliency threshold. `of {Z}` modifier sets the dimension. |
| Cue-triggered writes | Explicit cue patterns are high-confidence signals; extracted triples go straight to the write queue with `source: inferred`, no threshold gate. |
| Zero-hit recollection | Salient concepts with no URD entries emit a `? concept: no recollection` prompt block instructing the agent to clarify or store the fact via `gutask iknowthat`. |
| Manual write path | `gutask iknowthat 'X -isa/-ispart Y in context of Z'` inserts directly into URD with `confidence=1.0, source=gutask`, bypassing saliency threshold and cloud LLM. |
| In-memory layer | SOAS and URD cached in Python dicts at startup. Reads are zero-network. IDs always originate from Postgres. Collision detection uses `dict[(concept_id, dim_id)]` mirroring the Postgres unique index. Postgres is the safety net for race conditions. |
| Token length | No minimum — all tokens indexed. Frequency and novelty determine whether they surface. Optimize later if needed. |
| Nightly job execution | Background thread inside proxy process. Triggered by cron schedule (config table) or manually via `POST /resolve/run` + admin UI button. |
| Recollection injection | Prepended to existing system message content. If no system message exists, a new one is inserted at position 0. System message position provides the strongest grounding anchor for instruction-tuned models. |
| LLM configuration | `models` table (provider, model_name, api_key). `config` table keys `write_model_id` and `resolve_model_id` select which model each purpose uses. Supports `claude` and `openai` providers. |
| Source values | `cloud_llm` (saliency write path), `inferred` (cue pattern extraction), `festinger` (nightly resolution job), `gutask` (gutask iknowthat command). |
| gutask iknowthat interface | gutasktool at `/Users/jenstandstad/Projects/gutasktool`. Command POSTs to Festinger's `/iknowthat` endpoint — no direct Postgres access from gutasktool. |
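
The recollection injection rule above (prepend to the system message, or insert a new one at position 0) can be sketched for a chat-style request whose `messages` field is a list of `{"role", "content"}` dicts. The function name and the blank-line separator between the block and the existing system text are assumptions.

```python
import copy


def inject_recollections(messages: list[dict], block: str) -> list[dict]:
    """Return a copy of `messages` with `block` prepended to the system message."""
    out = copy.deepcopy(messages)  # never mutate the caller's request body
    if not block:
        return out
    for message in out:
        if message.get("role") == "system":
            # Prepend to the first existing system message.
            message["content"] = f"{block}\n\n{message.get('content', '')}"
            return out
    # No system message present: create one at position 0.
    out.insert(0, {"role": "system", "content": block})
    return out


no_system = [{"role": "user", "content": "hi"}]
with_system = [{"role": "system", "content": "S"}, {"role": "user", "content": "hi"}]
injected_new = inject_recollections(no_system, "R")
injected_existing = inject_recollections(with_system, "R")
```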
---
## Test Cases
The in-memory cache layer enables full unit testing without a live database. Tests pre-populate `soas_by_token`, `soas_by_id`, `urd_by_concept`, `urd_by_concept_dim`, and `pending_conflicts` directly, then exercise the logic under test and assert on the resulting state.
---
### Test A — Prompt includes a concept not in the cache
**Scenario:** An agent sends a prompt referencing `gnommoweb`. The concept exists nowhere in SOAS or URD.
**Setup:**
```python
soas_by_token = {} # empty — concept is completely unknown
urd_by_concept = {}
urd_by_concept_dim = {}
pending_conflicts = set()
```
**Input prompt:** `"Please update gnommoweb to use FastAPI instead"`
**Expected behaviour:**
1. Tokeniser extracts every token, with no minimum length: `please`, `update`, `gnommoweb`, `to`, `use`, `fastapi`, `instead`. Only `gnommoweb` and `fastapi` are tracked in this test; the remaining tokens are common English words, which the dictionary bootstrap would load with saliency 0.
2. `gnommoweb` and `fastapi` not in `soas_by_token` → both are new tokens. In a test with Postgres mocked, the mock returns `id=101` for `gnommoweb` and `id=102` for `fastapi`. Both added to SOAS dicts.
3. Saliency for both: `log(1)` — first encounter, below read threshold
4. URD lookup: skipped (below threshold) — no recollection block emitted for these concepts
**Edge case variant — above threshold:** pre-seed `soas_by_token["gnommoweb"]` with `encounter_count=50, saliency=0.9` (above read threshold) but keep `urd_by_concept` empty.
**Expected behaviour (variant):**
1. Saliency lookup: above read threshold ✓
2. URD lookup: `urd_by_concept.get(101, [])` → empty list
3. Zero-hit path triggered
4. Recollection block contains:
```
? gnommoweb: no recollection. If not a typo, store it before proceeding:
gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'
```
**Assertions:**
- `pending_conflicts` unchanged (no collision occurred)
- Resolution queue empty
- Recollection block contains `? gnommoweb`
- Prompt forwarded to Ollama with recollection block prepended
---
### Test B — Prompt includes "A ISA B" conflicting with existing world model
**Scenario:** The world model already holds `gnommoweb ISA repo` in the `type` dimension. A new prompt contains the explicit cue `"gnommoweb is a container"`, creating an ISA+ISA collision.
**Setup:**
```python
# SOAS
soas_by_token = {
    "gnommoweb": SoasRow(id=101, saliency=1.2, novelty=1.0),
    "repo":      SoasRow(id=201, saliency=0.8, novelty=0.5),
    "container": SoasRow(id=202, saliency=0.6, novelty=0.4),
    "type":      SoasRow(id=1,   saliency=0.0, novelty=0.0),  # seed dimension
}
soas_by_id = {101: "gnommoweb", 201: "repo", 202: "container", 1: "type"}
# URD — gnommoweb ISA repo IN type (existing confirmed fact)
existing_edge = UrdEdge(concept_id=101, parent_id=201, dim_id=1,
                        is_isa=True, confidence=0.9, source="cloud_llm")
urd_by_concept = {101: [existing_edge]}
urd_by_concept_dim = {(101, 1): existing_edge}
pending_conflicts = set()
```
**Input prompt:** `"gnommoweb is a container deployed on Docker"`
**Expected behaviour:**
1. Cue scanner matches `"is a"` pattern: extracts triple `(gnommoweb, container, type, is_isa=True)`
2. Collision check: `(101, 1)` found in `urd_by_concept_dim` → collision
3. Existing edge `is_isa=True`, incoming `is_isa=True` → ISA+ISA collision type
4. Resolution queue insert (mocked Postgres): `{concept_id=101, existing_parent_id=201, incoming_parent_id=202, dim_id=1, collision_type="isa_isa", status="pending"}`
5. `pending_conflicts.add(101)`
6. No URD modification — world model unchanged
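
Step 1 above can be sketched as a regex-based scan for the ISA cue. The pattern, the `Triple` tuple, and the function names are assumptions; only the `is a` cue and the `of {Z}` modifier come from this document, and ISPART cue patterns are not listed in this section, so none are guessed here. The default dimension `type` follows the triple shown in step 1.

```python
import re
from typing import NamedTuple

# A name is either a run of capitalised words or a single word (assumed rule).
NAME = r"(?:[A-Z]\w*(?: [A-Z]\w*)*|\w+)"
ISA_CUE = re.compile(
    rf"\b(?P<subject>\w+) is an? (?P<parent>{NAME})(?: of (?P<dim>{NAME}))?"
)


class Triple(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


def normalise(name: str) -> str:
    """Lowercase and underscore-join, mirroring the compound token rule."""
    return "_".join(name.lower().split())


def scan_isa_cues(text: str) -> list[Triple]:
    """Extract (subject, parent, dimension, is_isa) triples from 'X is a Y [of Z]'."""
    triples = []
    for match in ISA_CUE.finditer(text):
        dimension = match["dim"] or "type"  # no 'of Z' modifier: ISA lives in 'type'
        triples.append(Triple(
            normalise(match["subject"]),
            normalise(match["parent"]),
            normalise(dimension),
            True,
        ))
    return triples


test_b = scan_isa_cues("gnommoweb is a container deployed on Docker")
with_modifier = scan_isa_cues("gnommoweb is a repo of Glitch University")
```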
**Recollection block for this prompt:**
```
<recollection>
gnommoweb: [type?] repo — conflict pending
</recollection>
```
**Assertions:**
- `urd_by_concept_dim[(101, 1)]` still points to the original `repo` edge (unchanged)
- `pending_conflicts == {101}`
- Resolution queue has exactly one entry with `collision_type="isa_isa"` and `status="pending"`
- No Postgres URD insert attempted
- Recollection renders `[type?]` marker for `gnommoweb`
---
### Test C — Nightly job: queue processing and dimension decomposition
**Scenario:** The resolution queue contains the ISA+ISA collision from Test B. The nightly job runs, calls the cloud LLM (mocked), receives a decomposition decision, and updates the world model.
**Setup:** state as at end of Test B, plus:
```python
resolution_queue = [
    QueueEntry(id=1, concept_id=101, existing_parent_id=201, incoming_parent_id=202,
               dim_id=1, collision_type="isa_isa", status="pending")
]
```
**Mocked cloud LLM response:**
```json
{
  "decision": "decompose",
  "existing_dimension": "artifact-type",
  "new_dimension": "deployment-type",
  "reasoning": "repo describes what gnommoweb is as a software artifact; container describes how it is deployed"
}
```
**Expected behaviour:**
1. Nightly job fetches pending queue entries from Postgres
2. For entry id=1, calls cloud LLM → receives decompose decision
3. **New dimensions:** insert `artifact-type` into Postgres SOAS → returns `id=401`; insert `deployment-type` → returns `id=402`. Add both to SOAS dicts.
4. **Root nodes:** insert `(401,401,401)` and `(402,402,402)` into Postgres URD (self-referential dimension roots)
5. **Migrate existing edge:** delete `(101, 201, 1)` from URD, insert `(101, 201, 401)` — gnommoweb ISA repo IN artifact-type
6. **Insert new edge:** insert `(101, 202, 402)` — gnommoweb ISA container IN deployment-type
7. **Update in-memory cache:**
   - `soas_by_token["artifact-type"] = SoasRow(id=401, ...)`
   - `soas_by_token["deployment-type"] = SoasRow(id=402, ...)`
   - Remove `urd_by_concept_dim[(101, 1)]`; add `urd_by_concept_dim[(101, 401)]` and `urd_by_concept_dim[(101, 402)]`
   - Update `urd_by_concept[101]` to the two new edges
   - `pending_conflicts.discard(101)`
8. Mark queue entry resolved with resolution JSON
9. Signal proxy `/reload` (or nightly job updates cache directly if in-process)
**Final recollection block for gnommoweb:**
```
<recollection>
gnommoweb: [artifact-type] repo [deployment-type] container
</recollection>
```
**Assertions:**
- `urd_by_concept_dim` contains `(101, 401)` and `(101, 402)`, not `(101, 1)`
- `urd_by_concept[101]` has exactly two edges
- `pending_conflicts` does not contain `101`
- `soas_by_token` contains `artifact-type` and `deployment-type` with ids from Postgres
- Resolution queue entry has `status="resolved"` and non-null `resolution` JSON
- No `[type?]` marker in subsequent recollection for gnommoweb
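
The cache side of the decompose path (steps 3 to 8) can be exercised without Postgres, as in the sketch below. Everything here is illustrative: `apply_decompose` and the `cache` dict are hypothetical, `soas_by_token` maps straight to ids instead of `SoasRow` objects, the `next_id` callable stands in for ids returned by Postgres, and root nodes are left to the database and not mirrored in the cache.

```python
from dataclasses import dataclass, replace
from itertools import count


@dataclass(frozen=True)
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool
    source: str


def apply_decompose(entry: dict, decision: dict, cache: dict, next_id) -> None:
    """Move the existing edge and the incoming fact into two new dimensions."""
    concept = entry["concept_id"]
    existing = cache["urd_by_concept_dim"].pop((concept, entry["dim_id"]))

    new_dims = []
    for token in (decision["existing_dimension"], decision["new_dimension"]):
        dim_id = next_id()  # stands in for the Postgres SOAS insert
        cache["soas_by_token"][token] = dim_id
        cache["soas_by_id"][dim_id] = token
        new_dims.append(dim_id)

    migrated = replace(existing, dim_id=new_dims[0])
    incoming = UrdEdge(concept, entry["incoming_parent_id"], new_dims[1], True, "festinger")
    for edge in (migrated, incoming):
        cache["urd_by_concept_dim"][(concept, edge.dim_id)] = edge
    kept = [e for e in cache["urd_by_concept"].get(concept, []) if e != existing]
    cache["urd_by_concept"][concept] = kept + [migrated, incoming]
    cache["pending_conflicts"].discard(concept)
    entry["status"] = "resolved"
    entry["resolution"] = decision


repo_edge = UrdEdge(101, 201, 1, True, "cloud_llm")
cache = {
    "soas_by_token": {"gnommoweb": 101, "repo": 201, "container": 202, "type": 1},
    "soas_by_id": {101: "gnommoweb", 201: "repo", 202: "container", 1: "type"},
    "urd_by_concept": {101: [repo_edge]},
    "urd_by_concept_dim": {(101, 1): repo_edge},
    "pending_conflicts": {101},
}
entry = {"id": 1, "concept_id": 101, "existing_parent_id": 201,
         "incoming_parent_id": 202, "dim_id": 1,
         "collision_type": "isa_isa", "status": "pending"}
decision = {"decision": "decompose", "existing_dimension": "artifact-type",
            "new_dimension": "deployment-type"}
ids = count(401)
apply_decompose(entry, decision, cache, lambda: next(ids))
```

After the call, the cache state can be checked directly against the assertions listed for Test C.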