agent0/plugins/festinger/PROJECT.md
2026-04-19 16:16:13 +02:00


Festinger — Agent0 Inference Middleware

Status: In progress — iterative specification
Owner: jenstandstad
Location: plugins/festinger/

Named after Leon Festinger (1919–1989), social psychologist who introduced the theory of cognitive dissonance in 1957. Festinger observed that minds, human or artificial, cannot comfortably hold contradictory beliefs simultaneously, and that the tension this creates drives resolution. This system is built on the same principle.


Purpose

Festinger is an Ollama-compatible HTTP proxy that sits between Agent0's agent-zero containers and the local Ollama inference endpoint. It solves two related problems that emerge with local inference:

  1. Reasoning loops — agents repeat the same output and cannot break out, even when the framework tells them to try something else.
  2. Stale and incoherent memory — Agent Zero's FAISS-based memory accumulates facts without contradiction detection, causing agents to act on outdated or conflicting beliefs.

Festinger addresses both at the inference layer, transparently, without modifying agent-zero internals. The memory layer it introduces is called Recollections — short, structured, non-contradicting facts injected spontaneously into every prompt as context enrichment. Agents do not search for recollections; they appear automatically.

Like its namesake's theory, Festinger treats contradiction not as an error to suppress but as a signal to act on.


Architecture

agent-zero containers
        │
        ▼
┌───────────────────────────────────────────┐
│            Festinger Proxy               │
│                                           │
│  ┌─────────────┐    ┌──────────────────┐  │
│  │ Loop        │    │ Saliency Engine  │  │
│  │ Detector    │    │ (tokenise+score) │  │
│  └─────────────┘    └────────┬─────────┘  │
│                              │            │
│                    ┌─────────▼─────────┐  │
│                    │  SOAS             │  │
│                    │  concept vocab    │  │
│                    │  + saliency store │  │
│                    └─────────┬─────────┘  │
│                              │            │
│          ┌───────────────────┤            │
│          │                   │            │
│  ┌───────▼──────┐  ┌────────▼─────────┐  │
│  │ Recollection │  │ Memory Writer    │  │
│  │ Engine       │  │ (cloud LLM +     │  │
│  │ (read IN  →  │  │  NL → IN parser) │  │
│  │  inject      │  └────────┬─────────┘  │
│  │  <recollec-  │           │            │
│  │  tion> block)│  ┌────────▼─────────┐  │
│  └──────────────┘  │ Conflict         │  │
│          │         │ Resolver         │  │
│          │         └────────┬─────────┘  │
│          └──────────┬───────┘            │
│                     │                    │
│                    ┌▼──────────────────┐  │
│                    │  IN table         │  │
│                    │  acyclic concept  │  │
│                    │  graph            │  │
│                    └───────────────────┘  │
└───────────────────────────────────────┬──┘
                                        │
                                        ▼
                                Ollama (host)
                                        │
                                        ▼ (write path only)
                                Cloud LLM API

Data Model

models — LLM provider configuration

Festinger uses a cloud LLM for two purposes: the saliency-triggered write path and the nightly resolution job. Each purpose can use a different model.

| Column | Type | Notes |
| --- | --- | --- |
| id | SERIAL PK | |
| provider | VARCHAR | claude or openai |
| model_name | VARCHAR | e.g. claude-opus-4-6, gpt-4o |
| api_key | VARCHAR | stored encrypted at rest |
| created_at | TIMESTAMPTZ | |

config — runtime configuration

Key-value store for settings changeable without redeployment.

| Key | Default | Purpose |
| --- | --- | --- |
| write_model_id | | FK into models; used for saliency-triggered write path |
| resolve_model_id | | FK into models; used for nightly resolution job |
| saliency_read_threshold | 0.5 | Minimum saliency to trigger recollection lookup |
| saliency_write_threshold | 1.2 | Minimum saliency to trigger cloud LLM write |
| recollection_confidence_floor | 0.6 | Minimum URD edge confidence to include in recollection |
| recollection_recency_days | 90 | URD edges older than this are excluded |
| resolution_schedule | 0 2 * * * | Cron expression for nightly resolution job |

SOAS — concept vocabulary and saliency

One row per token. No minimum token length — all tokens are indexed; frequency and novelty determine whether they surface. Common English words are pre-seeded at saliency 0 via a dictionary corpus. Dimensions are themselves SOAS concepts — no separate table needed.

| Column | Type | Notes |
| --- | --- | --- |
| id | INT PK | auto-increment |
| token | VARCHAR | unique, lowercase normalised |
| encounter_count | INT | raw count across all intercepted prompts |
| last_seen | TIMESTAMP | for recency tracking |
| saliency | FLOAT | log-scaled encounter saliency; 0 = common English |
| novelty | FLOAT | domain-specificity score; set to 1.0 when first confirmed by cloud LLM write path, 0 for pre-seeded dictionary words |

Saliency vs novelty are distinct scores serving different purposes:

  • saliency measures how frequently a concept appears — it drives read-threshold triggering
  • novelty measures how domain-specific a concept is — a system name that rarely appears but is clearly project-specific should still be treated as important
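
The spec says saliency is log-scaled but does not give the formula. The sketch below is one possible reading, assuming `log10(1 + encounter_count)` and a hypothetical `is_dictionary` flag (not a column in the SOAS table) that pins pre-seeded words at 0. Under that assumption a token crosses the default read threshold (0.5) on its third encounter and the default write threshold (1.2) on its fifteenth.

```python
import math
from dataclasses import dataclass


@dataclass
class SoasRow:
    id: int
    token: str
    encounter_count: int = 0
    saliency: float = 0.0
    novelty: float = 0.0
    is_dictionary: bool = False  # hypothetical flag, not part of the spec


def record_encounter(row: SoasRow) -> float:
    """Count one encounter and recompute saliency on an assumed log10 scale."""
    row.encounter_count += 1
    if not row.is_dictionary:
        # Assumed formula; the spec only says "log-scaled".
        row.saliency = math.log10(1 + row.encounter_count)
    return row.saliency


def crosses(row: SoasRow, threshold: float) -> bool:
    """True when the row's saliency meets a read or write threshold."""
    return row.saliency >= threshold
```

How novelty feeds into the read decision is not specified, so it is left out of this sketch.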

URD table — the acyclic concept graph

Named urd (the SQL reserved word IN cannot be used as a table name). All three FK columns reference SOAS, so dimensions are concepts in the same vocabulary — new dimensions emerge from the same token space without schema changes.

| Column | Type | Notes |
| --- | --- | --- |
| id | INT FK | references SOAS; the concept being placed |
| parent_id | INT FK | references SOAS; the containing concept |
| dim_id | INT FK | references SOAS; which dimension this edge belongs to |
| is_isa | BOOLEAN | true = ISA (type/classification); false = ISPART (membership/containment). Outside the index: does not affect collision detection but drives conflict resolution semantics and recollection rendering |
| confidence | FLOAT | reliability of this edge, 0.0–1.0; set by cloud LLM at write time, updated by conflict resolver |
| last_confirmed | TIMESTAMPTZ | when this edge was last corroborated by an intercepted prompt; used for recency decay in recollection injection |
| source | VARCHAR | cloud_llm (saliency write path), inferred (cue pattern), festinger (resolution job), gutask (gutask iknowthat) |

Index structure:

  • PK: (id, parent_id, dim_id) — full triple, prevents duplicate edges
  • Unique index: (id, dim_id) — one parent per concept per dimension; this is the acyclicity and contradiction-resistance mechanism
  • Root nodes: rows where id = parent_id = dim_id — the named root of a dimension tree; the one allowed self-reference

Single relation: IN — "this concept is semantically contained within that concept, within this dimension." ISA and ISPART are not separate relations or tables; they are the same IN relation, with is_isa annotating which flavour the edge represents.

In-Memory Cache Layer

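The proxy maintains in-memory structures for three things (the SOAS vocabulary, the URD graph, and the set of pending conflicts), all populated at startup from Postgres. All read operations hit these structures only, so the hot path makes no network calls. Writes reach both layers: saliency updates land in memory first and are flushed to Postgres asynchronously, while SOAS and URD inserts are written to Postgres synchronously before the cache is updated.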

# SOAS — primary lookup by token string (mirrors UNIQUE index on token)
soas_by_token: dict[str, SoasRow]

# SOAS — reverse lookup by id (for pre-joining URD results)
soas_by_id: dict[int, str]

# URD — recollection reads: concept_id → list of edges (tokens pre-joined)
urd_by_concept: dict[int, list[UrdEdge]]

# URD — collision detection: (concept_id, dim_id) → edge
# Mirrors the Postgres UNIQUE index on (id, dim_id) exactly
urd_by_concept_dim: dict[tuple[int, int], UrdEdge]

# Resolution queue — concepts with pending conflicts (for ? marker in recollections)
pending_conflicts: set[int]

New SOAS token flow — ids always originate from Postgres:

token not in soas_by_token
  → INSERT into Postgres SOAS → Postgres returns auto-increment id
  → add to soas_by_token[token] and soas_by_id[id]
  → proceed with that id

URD insert flow — collision detected in-memory, Postgres is the safety net:

key = (concept_id, dim_id)
if key in urd_by_concept_dim:
    → collision detected in-memory
    → classify type (is_isa flags), route to resolution queue
    → return — no Postgres write attempted
else:
    → INSERT into Postgres URD
    → on success: update urd_by_concept[concept_id] and urd_by_concept_dim[key]
    → on UniqueViolation (race condition): reload row, route to resolution queue
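
The URD insert flow above could be sketched in Python roughly as follows. This is a minimal illustration, not the actual implementation: the `UrdCache` class, the field names on `UrdEdge`, and the injected `db_insert` callable (a stand-in for the Postgres INSERT) are all assumptions, and the UniqueViolation race branch is omitted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


class UrdCache:
    """In-memory URD mirror; db_insert is a hypothetical stand-in for Postgres."""

    def __init__(self, db_insert: Callable[[UrdEdge], None]) -> None:
        self.urd_by_concept: dict[int, list[UrdEdge]] = {}
        self.urd_by_concept_dim: dict[tuple[int, int], UrdEdge] = {}
        self.resolution_queue: list[tuple[UrdEdge, UrdEdge]] = []
        self._db_insert = db_insert

    def insert(self, edge: UrdEdge) -> bool:
        """Return True if stored, False if routed to the resolution queue."""
        key = (edge.concept_id, edge.dim_id)
        existing = self.urd_by_concept_dim.get(key)
        if existing is not None:
            # Collision detected in memory: no Postgres write is attempted.
            self.resolution_queue.append((existing, edge))
            return False
        # Postgres first; the dicts are updated only after it succeeds.
        self._db_insert(edge)
        self.urd_by_concept.setdefault(edge.concept_id, []).append(edge)
        self.urd_by_concept_dim[key] = edge
        return True
```

Passing the database write in as a callable keeps the collision logic testable without Postgres, which matches the testing goal stated for the cache layer.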

Saliency update flow — batched to avoid per-token Postgres writes:

every token encounter → update soas_by_token[token].encounter_count in-memory
every 30 seconds      → flush encounter count deltas to Postgres in one batch UPDATE
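
One way to realise the batching described above is to accumulate per-token deltas and drain them on a timer. The sketch below is an assumption-laden illustration: `EncounterBuffer` and `FLUSH_SQL` are hypothetical names, and the SQL assumes a DB-API driver with `%s` placeholders. The 30-second scheduling itself is left out.

```python
from collections import Counter

# Hypothetical statement; intended for cursor.executemany(FLUSH_SQL, params).
FLUSH_SQL = (
    "UPDATE soas SET encounter_count = encounter_count + %s, "
    "last_seen = now() WHERE token = %s"
)


class EncounterBuffer:
    """Accumulates encounter-count deltas between flushes."""

    def __init__(self) -> None:
        self._deltas: Counter[str] = Counter()

    def record(self, token: str) -> None:
        """Note one encounter of a token since the last flush."""
        self._deltas[token] += 1

    def drain(self) -> list[tuple[int, str]]:
        """Return (delta, token) parameter rows and reset the buffer."""
        params = [(count, token) for token, count in self._deltas.items()]
        self._deltas.clear()
        return params
```

A flush task would call `drain()` and, if the list is non-empty, hand it to the driver in a single batch.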

Cache reload after nightly job — nightly job POSTs to /reload endpoint on proxy:

proxy receives /reload
  → re-SELECT all URD rows from Postgres with pre-joined tokens
  → rebuild urd_by_concept and urd_by_concept_dim
  → rebuild pending_conflicts from resolution queue
  → SOAS dict unchanged (nightly job does not modify SOAS)

This separation means tests can inject mock data directly into the dicts without touching Postgres, enabling full unit testing of collision detection, recollection rendering, and queue routing.


Canonical Recollection Query

The recollection engine executes this query for each salient concept found in an intercepted prompt. No chain traversal — depth is always 1. The result is a flat enumeration of all edges where the concept is the subject, across all dimensions.

SELECT
    u.id,
    u.parent_id,
    u.dim_id,
    u.is_isa,
    p.token  AS parent_token,
    d.token  AS dim_token
FROM urd u
INNER JOIN soas p ON p.id = u.parent_id
INNER JOIN soas d ON d.id = u.dim_id
WHERE u.id = $1
  AND u.confidence >= $2
  AND u.last_confirmed >= $3
ORDER BY u.id, u.dim_id DESC;

$1 = SOAS id of the concept, $2 = confidence floor (from config), $3 = recency cutoff (from config).
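
As an illustration of how the three parameters might be derived, the sketch below builds them from config values. It assumes the config table stores values as text (so they are cast here) and that the recency cutoff is simply "now minus recollection_recency_days"; the function name `recollection_params` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional


def recollection_params(
    concept_id: int,
    config: Mapping[str, str],
    now: Optional[datetime] = None,
) -> tuple[int, float, datetime]:
    """Build ($1, $2, $3) for the canonical recollection query."""
    current = now if now is not None else datetime.now(timezone.utc)
    floor = float(config["recollection_confidence_floor"])
    cutoff = current - timedelta(days=int(config["recollection_recency_days"]))
    return concept_id, floor, cutoff
```

Accepting `now` as an argument keeps the cutoff calculation deterministic in tests.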

Injection position: the recollection block is prepended to the content of the existing system message if one is present. If no system message exists in the messages array, a new {"role": "system"} message containing only the recollection block is inserted at position 0. Instruction-tuned models generally give the system message high priority, so grounding facts placed there are expected to anchor more reliably than facts placed mid-conversation.
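
The injection rule can be sketched as a small pure function over an Ollama-style messages array. The function name `inject_recollection` and the blank-line separator between the block and the existing system content are assumptions.

```python
def inject_recollection(messages: list[dict], block: str) -> list[dict]:
    """Return a copy of messages with the recollection block injected."""
    out = [dict(m) for m in messages]  # shallow copies; input is not mutated
    for message in out:
        if message.get("role") == "system":
            # Prepend to the first existing system message.
            message["content"] = block + "\n\n" + message.get("content", "")
            return out
    # No system message present: insert a new one at position 0.
    out.insert(0, {"role": "system", "content": block})
    return out
```

Returning a copy rather than mutating the request in place makes it easy to log or compare the original and enriched prompts.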

Rendering: iterate rows; for each row emit [dim_token] parent_token. If the concept also has a pending entry in the resolution queue, append ? to that dimension token. Group by concept when multiple concepts are queried in a single block.

Zero-hit rendering: if a concept is above the read threshold but has no URD entries, it is a salient domain-specific term the world model has not yet encountered. Instead of silently omitting it, the recollection block emits an explicit prompt to the agent:

? gnommoweb: no recollection. If this is a typo, ignore.
  If you know what it is, store it before proceeding:
    gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
    gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'

This turns every unknown salient concept into an active instruction. The agent either confirms it is a typo, asks for clarification from a human or peer agent, or fills the gap itself via gutask iknowthat. The world model grows organically through use.

Full example recollection block:

<recollection>
gnommoweb: [glitch_university] repo  [geography] ramanujan  [type?] service
dobby: [agent_pool] worker  [tech] python
? ramanujan: no recollection. If you know what it is, store it before proceeding:
    gutask iknowthat 'ramanujan -isa <parent> in context of <dimension>'
    gutask iknowthat 'ramanujan -ispart <system> in context of <dimension>'
</recollection>

[type?] signals a pending conflict on that dimension — the world model is not wrong, resolution is in progress.
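
A rough sketch of the rendering rules, covering the hit path, the ? marker, and the zero-hit prompt. The `RenderedEdge` type and function names are hypothetical, the pending flag is modelled per edge as an assumption, and the two-space separator between edges is taken from the example block above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderedEdge:
    dim_token: str
    parent_token: str
    pending: bool = False  # assumed: conflict pending on this dimension


def render_concept(token: str, edges: list[RenderedEdge]) -> str:
    """Render one concept line, or the zero-hit prompt when it has no edges."""
    if not edges:
        return (
            f"? {token}: no recollection. If you know what it is, "
            "store it before proceeding:\n"
            f"    gutask iknowthat '{token} -isa <parent> in context of <dimension>'\n"
            f"    gutask iknowthat '{token} -ispart <system> in context of <dimension>'"
        )
    parts = [
        f"[{e.dim_token}{'?' if e.pending else ''}] {e.parent_token}"
        for e in edges
    ]
    return f"{token}: " + "  ".join(parts)


def render_block(concepts: dict[str, list[RenderedEdge]]) -> str:
    """Wrap all concept lines in a <recollection> block."""
    lines = [render_concept(token, edges) for token, edges in concepts.items()]
    return "<recollection>\n" + "\n".join(lines) + "\n</recollection>"
```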

gutask iknowthat — Manual Write Path

gutask iknowthat is the highest-confidence write path into URD. It bypasses the saliency threshold and the cloud LLM entirely.

Location: /Users/jenstandstad/Projects/gutasktool (sibling of Agent0 repo). The command POSTs to Festinger's /iknowthat HTTP endpoint — gutasktool does not connect to Postgres directly.

Syntax:

gutask iknowthat 'gnommoweb -isa repo in context of glitch_university'
gutask iknowthat 'gnommoweb -ispart glitch_university in context of membership'
  • -isa sets is_isa=true; -ispart sets is_isa=false
  • in context of <dimension> specifies the dimension token; defaults to type for -isa, membership for -ispart
  • All tokens run through the standard tokeniser (compound token rule applies)
  • Inserts into SOAS if any token is new
  • Inserts into URD with confidence=1.0, source=gutask
  • On collision: enters resolution queue with priority flag; gutask-source conflicts are reviewed first at /conflicts

Festinger /iknowthat endpoint: accepts POST with JSON body {fact: string}. Parses, tokenises, writes to SOAS/URD, returns the inserted or conflicted result. This decouples gutasktool from the Festinger schema.

This command is the agent's direct interface to the world model. When the recollection block surfaces a zero-hit concept, gutask iknowthat is the prescribed response.
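
To make the fact syntax concrete, here is a minimal parsing sketch for the string accepted by /iknowthat. It is a simplification under stated assumptions: the names `Fact` and `parse_fact` are hypothetical, only single-word operands are handled, and tokens are merely lowercased rather than run through the full tokeniser with the compound token rule.

```python
import re
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


_FACT_RE = re.compile(
    r"^\s*(?P<subject>\S+)\s+-(?P<rel>isa|ispart)\s+(?P<parent>\S+)"
    r"(?:\s+in\s+context\s+of\s+(?P<dim>\S+))?\s*$",
    re.IGNORECASE,
)


def parse_fact(fact: str) -> Fact:
    """Parse "<subject> -isa|-ispart <parent> [in context of <dimension>]"."""
    match = _FACT_RE.match(fact)
    if match is None:
        raise ValueError(f"unrecognised fact syntax: {fact!r}")
    is_isa = match["rel"].lower() == "isa"
    # Defaults from the spec: type for -isa, membership for -ispart.
    dimension = match["dim"] or ("type" if is_isa else "membership")
    return Fact(
        match["subject"].lower(),
        match["parent"].lower(),
        dimension.lower(),
        is_isa,
    )
```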


The IN Relation and ISA/ISPART

Why a single operator

ISA and ISPART are both instances of a more general semantic containment relation. The dimension carries the semantic weight:

  • gnommoweb IN repo (dimension: type) → gnommoweb ISA repo
  • gnommoweb IN Glitch-University (dimension: membership) → gnommoweb ISPART Glitch-University
  • State IN Country (dimension: type) → State ISA a country-level granularity

The type dimension IS the ISA relation. Every other dimension IS an ISPART relation scoped to that domain. One table, one index, one operator.

Why this resolves the bleed

The classic bleed case: Michigan ISA State, Michigan ISPART USA, State ISPART Country.

With the single IN operator and dimensions:

Michigan  IN  State        (dimension: type)       is_isa: true
Michigan  IN  USA          (dimension: geography)   is_isa: false
State     IN  Country      (dimension: type)        is_isa: true
USA       IN  Country      (dimension: type)        is_isa: true

Two coherent chains, no collision, no ambiguity. "State ISPART Country" (class-level generalisation) becomes "State IN Country in the type dimension" — a perfectly valid ISA statement: State is a kind of country-level subdivision.


Collision Semantics — The is_isa Flag

The unique index on (concept_id, dimension_id) fires when a concept already has a parent in a given dimension. The is_isa flag on both the existing and incoming rows determines what the collision means:

| Existing | Incoming | Interpretation | Action |
| --- | --- | --- | --- |
| ISA | ISA | Dimension too coarse: both facts simultaneously true about the concept's nature | Trigger dimension decomposition |
| ISPART | ISPART | Factual contradiction: a thing can only be in one place per dimension | Trigger arbitration (which is correct?) |
| ISA | ISPART | Dimension misclassification: these should have been in different dimensions | Flag as misclassification, suggest correct dimension |
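
The classification in the table reduces to a comparison of the two is_isa flags. A minimal sketch follows; the function name is hypothetical, the return values follow the collision_type values used by the resolution queue, and any mixed pair is treated as a misclassification regardless of order (an assumption, since the table lists only one ordering).

```python
def classify_collision(existing_is_isa: bool, incoming_is_isa: bool) -> str:
    """Map the is_isa flags of both edges to a collision_type value."""
    if existing_is_isa and incoming_is_isa:
        return "isa_isa"  # dimension too coarse: decompose
    if not existing_is_isa and not incoming_is_isa:
        return "ispart_ispart"  # factual contradiction: arbitrate
    return "misclassification"  # mixed flags: wrong dimension
```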

Dimension Decomposition — Dimensions as Evolving Vocabulary

When an ISA+ISA collision occurs, the dimension is too coarse to hold both simultaneously-true facts. Example:

  • Existing: gnommoweb IN container (dimension: type, is_isa: true)
  • Incoming: gnommoweb IN repo (dimension: type, is_isa: true)

Both are true. The type dimension cannot hold them both. The conflict resolver sends this prompt to the cloud LLM:

"gnommoweb is already container in dimension type. New fact also places gnommoweb as repo in type. If both are simultaneously true, propose two more specific dimension names to replace type — one where gnommoweb as container remains valid, one where gnommoweb as repo is valid. Return JSON: {"existing_dimension": "...", "new_dimension": "..."}. Choose from this taxonomy where possible: [...]. Create new names only if nothing fits."

The LLM might return: {"existing_dimension": "deployment-type", "new_dimension": "artifact-type"}.

The system then:

  1. Inserts deployment-type and artifact-type into SOAS (if not present)
  2. Creates root nodes for each new dimension
  3. Inserts the new fact under artifact-type
  4. Leaves the existing type facts untouched — no migration

Dimensions are SOAS concepts. New dimensions emerge from the same token vocabulary. The graph grows its own taxonomy under pressure from real contradictions, starting coarse and decomposing on demand.


The Two Memory Operations

Tokenisation Rules

All text — prompts, system messages, agent outputs — passes through the same tokeniser before any saliency or relationship work is done.

Token extraction:

  1. Split on whitespace and punctuation boundaries
  2. Compound token rule: scan for runs of consecutive tokens where each begins with a capital letter. Merge the run into a single token, joined with underscores, then lowercase. This canonicalises proper nouns and multi-word concepts into single SOAS entries.
    • Glitch University → glitch_university
    • Agent Zero → agent_zero
    • New York City → new_york_city
    • A lowercase or short token breaks the run: in "the Glitch University", "the" breaks the run, and Glitch University → glitch_university
  3. Lowercase all tokens
  4. Keep tokens with 5 or more characters (strictly >4); discard shorter tokens unless they are part of a matched relationship cue pattern (see below)
  5. Strip leading/trailing punctuation from each token

Example:
"gnommoweb is a repo of Glitch University"
→ tokens: gnommoweb, repo, glitch_university
→ relationship extracted: gnommoweb IN repo IN dim:glitch_university (is_isa=true, via "is a … of" pattern)
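
The extraction rules can be sketched as below. This is an illustration under assumptions rather than the project's tokeniser: the function name and the `keep` parameter (short tokens retained because a cue pattern matched them) are hypothetical, punctuation between two capitalised words is assumed to break a run, and the "short token breaks the run" case is not modelled.

```python
import re

_WORD_RE = re.compile(r"[A-Za-z0-9_]+")


def tokenise(
    text: str, min_len: int = 5, keep: frozenset[str] = frozenset()
) -> list[str]:
    """Split, merge capitalised runs with underscores, lowercase, filter."""
    merged: list[str] = []
    run: list[str] = []

    def flush() -> None:
        if run:
            merged.append("_".join(run).lower())
            run.clear()

    prev_end = None
    for match in _WORD_RE.finditer(text):
        gap = text[prev_end:match.start()] if prev_end is not None else ""
        if gap.strip():
            flush()  # punctuation between words ends a capitalised run
        word = match.group()
        if word[0].isupper():
            run.append(word)
        else:
            flush()
            merged.append(word.lower())
        prev_end = match.end()
    flush()
    return [t for t in merged if len(t) >= min_len or t in keep]
```

With `keep=frozenset({"repo"})`, the example sentence above yields gnommoweb, repo, glitch_university.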


Relationship Cue Patterns

Certain keyword patterns in intercepted text are direct cues to the memory layer that a semantic relationship is being expressed. Festinger scans every prompt for these patterns. When one matches, the relationship is extracted and queued for insertion into the IN table, bypassing the saliency threshold because the relationship has been made explicit.

Agents and humans interacting with Agent0 should be aware that using these patterns causes Festinger to build or update the world model.

ISA patterns (is_isa = true) — the subject is a type or instance of the object:

| Pattern | Example |
| --- | --- |
| {X} is a {Y} | gnommoweb is a repo |
| {X} is an {Y} | gnommoweb is an API |
| {X} ISA {Y} | gnommoweb ISA repo |
| {X} is a kind of {Y} | State is a kind of region |
| {X} is a type of {Y} | gnommoweb is a type of service |
| {X} is an instance of {Y} | dobby is an instance of agent |
| {X} kind of {Y} | gnommoweb kind of repo |
| {X} type of {Y} | gnommoweb type of service |
| {X} instance of {Y} | dobby instance of agent |

ISPART patterns (is_isa = false) — the subject is a member, part, or component of the object:

| Pattern | Example |
| --- | --- |
| {X} is part of {Y} | gnommoweb is part of Glitch University |
| {X} ISPART {Y} | gnommoweb ISPART glitch_university |
| {X} part of {Y} | gnommoweb part of Agent0 |
| {X} belongs to {Y} | gnommoweb belongs to Glitch University |
| {X} is owned by {Y} | gnommoweb is owned by jenstandstad |
| {X} owned by {Y} | gnommoweb owned by jenstandstad |
| {X} member of {Y} | dobby member of agent_pool |
| {X} is a member of {Y} | dobby is a member of agent_pool |
| {X} runs on {Y} | gnommoweb runs on Docker |
| {X} hosted by {Y} | gnommoweb hosted by ramanujan |
| {X} deployed on {Y} | gnommoweb deployed on Docker |
| {X} contained in {Y} | gnommoweb contained in agent0_stack |

The of {Z} dimension modifier:

When an ISA pattern is followed by of {Z}, the named entity {Z} becomes the dimension for the extracted edge. This allows natural language to directly specify the context in which a classification holds:

"gnommoweb is a repo of Glitch University"
→ gnommoweb IN repo IN dim:glitch_university  (is_isa=true)

"Michigan is a state of USA"
→ michigan IN state IN dim:usa  (is_isa=true)

Without the of {Z} modifier, the dimension defaults to type for ISA patterns and the most appropriate seed dimension for ISPART patterns (inferred by the cloud LLM during the write step, or defaulting to membership).

Cue-triggered writes bypass the saliency threshold. Explicit relationship cues are treated as high-confidence signals regardless of how many times a concept has previously appeared. The extracted triple goes directly into the write queue with source: inferred and a confidence score assigned by the pattern type (exact keyword cues score higher than positional inference).
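
A cue scanner along these lines could be sketched with ordered regular expressions. The sketch covers only a subset of the patterns listed above and makes several assumptions: the names `Cue` and `extract_cue` are hypothetical, longer phrases are tried first so that "is a member of" is not misread as "is a", the of {Z} modifier is applied to ISA cues only, and ISPART cues default to the membership dimension.

```python
import re
from typing import NamedTuple, Optional


class Cue(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


# A run of capitalised words, or a single word.
_NAME = r"[A-Z]\w*(?:\s+[A-Z]\w*)*|\w+"

# Subset of the documented cue phrases, longest first.
_PHRASES = [
    ("is a member of", False), ("is part of", False), ("belongs to", False),
    ("is an instance of", True), ("is a kind of", True), ("is a type of", True),
    ("is an", True), ("is a", True),
]


def _compile(phrase: str, is_isa: bool) -> re.Pattern:
    words = r"\s+".join(re.escape(w) for w in phrase.split())
    tail = rf"(?:\s+of\s+(?P<z>{_NAME}))?" if is_isa else ""
    return re.compile(rf"(?P<x>{_NAME})\s+{words}\s+(?P<y>{_NAME}){tail}")


_CUES = [(_compile(p, isa), isa) for p, isa in _PHRASES]


def _norm(name: str) -> str:
    return "_".join(name.split()).lower()


def extract_cue(text: str) -> Optional[Cue]:
    """Return the first matching cue in text, or None."""
    for pattern, is_isa in _CUES:
        m = pattern.search(text)
        if m:
            z = m.groupdict().get("z")
            dim = _norm(z) if z else ("type" if is_isa else "membership")
            return Cue(_norm(m["x"]), _norm(m["y"]), dim, is_isa)
    return None
```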


Writing — cloud-triggered, async

When a concept's saliency crosses the write threshold, Festinger queues it for background processing (cloud LLM calls must not block the prompt response path):

  1. Call cloud LLM: "What is {concept}?" with the current dimension taxonomy as a closed list and a structured output prompt requesting (concept, parent, dimension, is_isa) triples
  2. Parse response into IN table INSERT statements
  3. Attempt inserts; route constraint violations to the conflict resolver
  4. Update novelty score in SOAS for the concept

Reading — spontaneous prompt enrichment

On every intercepted prompt:

  1. Tokenise the full prompt string, extract tokens >4 chars, normalise
  2. Look up each token in SOAS, update encounter counts and saliency
  3. For tokens above the read threshold, query the IN table for all edges involving the concept
  4. Traverse each chain upward (configurable max depth)
  5. Format as a <recollection> block and prepend to the prompt before forwarding to Ollama

Example output:

<recollection>
gnommoweb: [type] repo → software-artifact
           [membership] Glitch-University → Agent0-infrastructure
           [tech] FastAPI → Python
glitch.university: [type] platform → web-service
                   [membership] Agent0 → Glitch-Hunter-project
</recollection>

Only edges above the confidence threshold and within the recency window are included. The agent does not search for these — they appear spontaneously.

Read and write thresholds are separate and independently tunable. Reading (DB lookup) is cheap; writing (cloud LLM call) is expensive. The write threshold should be meaningfully higher than the read threshold.


Conflict Resolution — The Nightly Therapy Model

The IN table is strict by design: it cannot hold contradictions. Any collision is immediately routed to the resolution queue (a separate table), where it waits for the nightly resolution job to process it. During this period the world model stands unchanged; the old fact continues to be served in recollections, marked with a ? to signal pending dissonance.

On collision (immediate, synchronous)

  1. Classify the collision type by reading is_isa on both rows: ISA+ISA, ISPART+ISPART, or misclassification
  2. Insert the rejected fact into the resolution queue with full context: existing edge, incoming edge, dimension, collision type, timestamp
  3. Return normally — the proxy response is not blocked

During the day (recollection engine)

Concepts with entries in the resolution queue are rendered with a ? marker:

<recollection>
gnommoweb: [type] container
           [type?] repo — pending resolution
</recollection>

The agent sees that a fact is contested. The world model is not wrong — it is incomplete. The ? marker disappears after the nightly job resolves or dismisses the conflict.

Nightly resolution job

Runs as a background thread inside the Festinger proxy process on the schedule set in the config table (resolution_schedule). Can also be triggered manually via POST /resolve/run — a corresponding button is exposed in the Festinger admin UI. Uses the model configured in resolve_model_id.

For each item in the queue, calls the configured LLM with both facts and the collision type, receives a structured decision, and applies it:

ISA+ISA collision (dimension too coarse):

  • LLM outcome A — decompose: suggest two new dimension names. System creates new SOAS entries and root nodes, inserts both facts into their respective new dimensions, marks queue item resolved.
  • LLM outcome B — dismiss: the incoming fact is noise or wrong. Queue item marked dismissed. World model stands.

ISPART+ISPART collision (factual contradiction):

  • LLM outcome A — update: the incoming fact is more current. System removes old IN edge, inserts new one, marks queue item resolved.
  • LLM outcome B — dismiss: the existing fact is still correct. Queue item marked dismissed.

Misclassification (ISA+ISPART in same dimension):

  • LLM suggests the correct dimension for the incoming fact. System inserts it in the corrected dimension (no collision), marks queue item resolved.

Resolution queue schema

| Column | Type | Notes |
| --- | --- | --- |
| id | INT PK | auto-increment |
| concept_id | INT FK | references SOAS |
| existing_parent_id | INT FK | the parent currently in the IN table |
| incoming_parent_id | INT FK | the rejected parent |
| dimension_id | INT FK | the dimension where the collision occurred |
| collision_type | ENUM | isa_isa, ispart_ispart, misclassification |
| status | ENUM | pending, resolved, dismissed |
| resolution | TEXT | JSON record of what the nightly job decided and did |
| created_at | TIMESTAMP | when the collision occurred |
| resolved_at | TIMESTAMP | when the nightly job processed it |

Properties of this model

  • IN table is always consistent — agents never receive contradictory recollections from confirmed facts
  • Resolution is deliberate, not reactive — the nightly job processes dissonance calmly, with full context, not under prompt-response time pressure
  • Dismissal is a first-class outcome — not every collision is a real problem; the LLM can decide the world model is correct and the incoming fact was noise
  • The ? marker is the fuzziness adjunct — it surfaces uncertainty to agents without compromising the graph's integrity
  • /conflicts endpoint — exposes the full queue (pending and recently resolved) for human inspection and override

Graph Properties

  • Acyclic: the unique index on (concept_id, dimension_id) enforces single-parent per concept per dimension, making each dimension a forest of trees — structurally acyclic without any runtime check
  • Shared: one graph for all agents — recollections represent shared facts about the project, not per-agent beliefs. Agents already have per-agent memory in Agent Zero's own FAISS layer
  • Contradiction-resistant: the index makes it structurally impossible to store conflicting facts in the same dimension. Contradictions surface as insert failures, not silent overwrites
  • Self-organising: dimension taxonomy starts with a small seed list and decomposes on demand. No human needs to pre-define the full ontology

Saliency Decay

Two decay mechanisms, operating independently:

  • SOAS recency (last_seen timestamp): concepts not seen for a long time are deprioritised for recollection injection but not deleted. If gnommoweb reappears in a prompt, last_seen updates and its recollections resurface immediately
  • IN edge recency (last_confirmed timestamp): edges not corroborated by recent prompts are given lower weight in recollection injection, making the recollection block favour currently-relevant facts

Decay does not delete knowledge — it adjusts injection priority. The graph remains intact.


Dimension Taxonomy — Seed List

A small, orthogonal set of initial dimensions. Each dimension answers a specific question about a concept. New dimensions emerge through decomposition; this list is the starting skeleton.

| Dimension | Question | is_isa |
| --- | --- | --- |
| type | What kind of thing is this? | true |
| membership | What system or project does this belong to? | false |
| runs-on | What infrastructure hosts or executes this? | false |
| tech | What technology stack is it built with? | false |
| owned-by | Who is responsible for this? | false |
| geography | Where is this spatially or organisationally located? | false |

Root nodes for each dimension are seeded at bootstrap. The type dimension is expected to be the first to decompose as domain-specific concepts accumulate.


Components

| Component | Description | State |
| --- | --- | --- |
| Proxy core | FastAPI Ollama-compatible HTTP proxy | built |
| Loop detector | Session-scoped repeat detection + mitigations | built |
| Config system | Hot-reloading YAML config | built |
| SOAS store | Concept vocabulary + saliency DB table | built |
| IN table store | Acyclic concept graph with correct indexes | built |
| Dictionary bootstrap | Pre-seed SOAS with common English at saliency 0 | built |
| Dimension bootstrap | Seed root nodes for initial dimension taxonomy | built |
| Saliency engine | Tokenise prompt, score tokens, update SOAS counts | built |
| Recollection engine | Query IN table, traverse chains, format + inject block | built |
| Memory writer | Write-threshold trigger → async cloud LLM → NL→IN parse → insert | built |
| Conflict resolver | On collision: classify type, insert into resolution queue immediately | built |
| Resolution queue | Pending/resolved/dismissed conflicts with full context | built |
| Nightly resolution job | Drain queue via cloud LLM; apply decompose/update/dismiss decisions | built |
| /conflicts endpoint | Expose queue (pending + recent) for human inspection and override | built |
| Persistence | Postgres from day one; English dictionary pre-loaded into SOAS at init | built |

Task Breakdown

Phase 1 — Foundation (complete)

  • T01 Proxy core: FastAPI Ollama-compatible server forwarding /api/chat and /api/generate
  • T02 Loop detector: session-scoped exact-match repetition detection
  • T03 Mitigations: temperature boost, forbidden action injection, history truncation, circuit breaker
  • T04 Hot-reload YAML config
  • T05 Docker container + docker-compose service entry

Phase 2 — Persistence Layer

  • T06 Postgres service: add postgres container to docker-compose; connection config in config.yaml
  • T07 models table: id SERIAL PK, provider VARCHAR, model_name VARCHAR, api_key VARCHAR, created_at TIMESTAMPTZ
  • T08 config table: key VARCHAR PK, value TEXT, updated_at TIMESTAMPTZ; seed with default values for all config keys
  • T09 SOAS table: id SERIAL PK, token VARCHAR UNIQUE, encounter_count INT default 0, last_seen TIMESTAMPTZ, saliency FLOAT default 0, novelty FLOAT default 0. All tokens lowercase. Unique index on token.
  • T10 URD table: id INT FK → soas, parent_id INT FK → soas, dim_id INT FK → soas, is_isa BOOLEAN, confidence FLOAT, last_confirmed TIMESTAMPTZ, source VARCHAR. PK (id, parent_id, dim_id). Unique index (id, dim_id).
  • T11 Resolution queue table: id SERIAL PK, concept_id INT FK → soas, existing_parent_id INT FK → soas, incoming_parent_id INT FK → soas, dim_id INT FK → soas, collision_type VARCHAR, status VARCHAR default 'pending', resolution JSONB, created_at TIMESTAMPTZ, resolved_at TIMESTAMPTZ
  • T12 English dictionary bootstrap: bulk-load word list into SOAS with saliency=0, novelty=0, encounter_count=0 on container init. Skip existing tokens.
  • T13 Dimension bootstrap: insert SOAS entries and self-referential URD root nodes (id = parent_id = dim_id) for the 6 seed dimensions

Phase 3 — Saliency Engine and Prompt Parsing

  • T14 Tokeniser: split on whitespace/punctuation; apply compound token rule (consecutive capitalised tokens → single underscore-joined lowercase token); no minimum length; strip punctuation; lowercase all
  • T15 Relationship cue scanner: regex/pattern scan for ISA and ISPART cue patterns; extract (subject, parent, dimension_modifier, is_isa) triples; handle of {Z} dimension modifier
  • T16 SOAS lookup + update: increment encounter_count, update last_seen, recalculate saliency (log scale) for all extracted tokens; read thresholds from config table
  • T17 Threshold evaluation: read saliency_read_threshold and saliency_write_threshold from config; cue-extracted triples bypass write-threshold entirely

Phase 4 — Recollection Engine (Read Path)

  • T16 URD query: for each above-read-threshold token, execute the canonical recollection query; filter by confidence floor and last_confirmed recency window
  • T17 Recollection formatter — hit path: enumerate query rows; group by concept; render each edge as [dim_token] parent_token; append ? for edges with a pending resolution queue entry
  • T18 Recollection formatter — zero-hit path: for salient concepts with no URD rows, emit the ? concept: no recollection prompt block including the gutask iknowthat usage hint
  • T19 Prompt injection: prepend recollection block before forwarding to Ollama
  • T20 Recollection config: max concepts per block, confidence floor, recency window, injection position
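
The two formatter paths (hit and zero-hit) could look roughly like the sketch below. The `Edge` dataclass and `format_recollections` are hypothetical names; the sketch assumes tokens are already resolved from IDs, that confidence and recency filtering happened upstream, and that both paths share one `<recollection>` wrapper, which the specification does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical in-memory view of one URD row, with tokens already resolved."""
    concept: str
    dim: str
    parent: str
    pending: bool = False   # True when a pending resolution queue entry exists


def format_recollections(salient: list[str], edges_by_concept: dict[str, list[Edge]]) -> str:
    lines = []
    for concept in salient:
        edges = edges_by_concept.get(concept, [])
        if edges:
            # Hit path: one "[dim] parent" item per edge, "?" marks a pending conflict.
            parts = [f"[{e.dim}{'?' if e.pending else ''}] {e.parent}" for e in edges]
            lines.append(f"{concept}: " + "  ".join(parts))
        else:
            # Zero-hit path: ask the agent to store the fact via gutask iknowthat.
            lines.append(f"? {concept}: no recollection. If not a typo, store it before proceeding:")
            lines.append(f"  gutask iknowthat '{concept} -isa <parent> in context of <dimension>'")
            lines.append(f"  gutask iknowthat '{concept} -ispart <system> in context of <dimension>'")
    if not lines:
        return ""   # nothing salient, nothing to inject
    return "<recollection>\n" + "\n".join(lines) + "\n</recollection>"
```

Returning an empty string when nothing is salient lets the injection step skip the prompt rewrite entirely.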

Phase 5 — Memory Writer (Write Path)

  • T21 POST /iknowthat endpoint: accept {fact: string}, parse the -isa/-ispart flag and the "in context of" clause, run tokeniser, upsert SOAS, insert into URD with confidence=1.0, source=gutask; route collisions to resolution queue with priority flag
  • T22 gutask iknowthat command (in /Users/jenstandstad/Projects/gutasktool): parse fact string, POST to Festinger /iknowthat, surface result to agent
  • T23 Write queue: async background queue for concepts crossing the write threshold; cue-extracted triples enter directly regardless of threshold
  • T24 LLM client: support claude and openai providers; load provider/model/key from models table via write_model_id config key; structured prompt requesting (concept, parent, dimension, is_isa, confidence) triples as JSON
  • T25 NL→IN parser: validate triples against SOAS and known dimensions; create new SOAS entries for unknown tokens; apply compound token rule
  • T26 URD insert pipeline: check urd_by_concept_dim in-memory first; on miss attempt Postgres insert; on hit or UniqueViolation route to conflict resolver; set source field per write path
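
For T21, parsing the fact string might be sketched as below. `FACT_RE`, `ParsedFact`, `normalise`, and `parse_fact` are hypothetical names, and the normalisation (lowercase, whitespace joined with underscores) is a simplification of the compound token rule. The sketch returns `None` for the dimension when the "in context of" clause is absent, because the specification does not say which default applies.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar: "<subject> -isa|-ispart <parent> [in context of <dimension>]"
FACT_RE = re.compile(
    r"^\s*(?P<subject>.+?)\s+-(?P<flag>isa|ispart)\s+(?P<parent>.+?)"
    r"(?:\s+in\s+context\s+of\s+(?P<dim>.+?))?\s*$",
    re.IGNORECASE,
)


class ParsedFact(NamedTuple):
    subject: str
    parent: str
    dimension: Optional[str]   # None when no "in context of" clause was given
    is_isa: bool


def normalise(phrase: str) -> str:
    """Lowercase and join words with underscores (simplified compound rule)."""
    return "_".join(phrase.lower().split())


def parse_fact(fact: str) -> Optional[ParsedFact]:
    match = FACT_RE.match(fact)
    if match is None:
        return None   # caller would reject the request
    dim = match.group("dim")
    return ParsedFact(
        subject=normalise(match.group("subject")),
        parent=normalise(match.group("parent")),
        dimension=normalise(dim) if dim else None,
        is_isa=match.group("flag").lower() == "isa",
    )
```

The endpoint would then upsert the three tokens into SOAS and attempt the URD insert; those steps are outside this sketch.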

Phase 6 — Conflict Resolution

  • T25 Collision handler: on unique constraint violation, classify type via is_isa flags, insert immediately into resolution queue
  • T26 Recollection engine update: check resolution queue for pending items per concept; render pending edges with [dim?] marker
  • T27 Nightly resolution job: background thread, schedule from config table (resolution_schedule); for each pending queue item, call LLM configured in resolve_model_id with both facts and collision type, receive JSON decision (decompose / update / dismiss)
  • T28 Resolution applicator — decompose: create new dimension SOAS entries and root nodes; insert both facts in respective new dimensions; mark queue item resolved
  • T29 Resolution applicator — update: remove old IN edge, insert new fact, mark queue item resolved
  • T30 Resolution applicator — dismiss: mark queue item dismissed; world model unchanged; the [dim?] marker disappears from recollections
  • T31 /conflicts endpoint: list pending and recently resolved/dismissed items with full context; support human override (force-dismiss, force-resolve)
  • T32 POST /resolve/run endpoint: manually trigger the nightly resolution job outside of its schedule
  • T33 Admin UI: minimal HTML page served by Festinger at /admin; shows pending conflicts count, last resolution run timestamp, and a "Run resolution now" button wired to POST /resolve/run

Phase 7 — Hardening

  • T32 Latency guard: tokenisation + saliency lookup + recollection query must not add >50ms to prompt round-trip; use connection pooling
  • T33 Write path fully async: all cloud LLM calls, IN inserts, and queue operations run in background workers; proxy response never waits
  • T34 Integration test: plain prompt → compound token extraction → saliency update → recollection injection → Ollama round trip
  • T35 Integration test: cue pattern in prompt ("gnommoweb is a repo of Glitch University") → extracted triple bypasses threshold → IN insert → recollection on next prompt
  • T36 Integration test: write threshold → cloud LLM write → IN insert → recollection on next prompt
  • T37 Integration test: ISA+ISA collision → immediate queue insert → [type?] marker in recollection → nightly job → decompose → clean recollection across two dimensions
  • T38 Integration test: ISPART+ISPART collision → queue insert → nightly job → dismiss → world model unchanged, marker gone

Resolved Design Decisions

| Question | Decision |
| --- | --- |
| Single relation or ISA+ISPART tables? | Single IN operator; is_isa boolean flag outside the index annotates edge type |
| ISA vs ISPART bleed | Resolved by dimension: type dimension = ISA; all other dimensions = ISPART |
| Shared vs per-agent graph | Shared: recollections are project-wide facts; per-agent memory remains in Agent Zero's FAISS layer |
| Saliency decay | Dual mechanism: SOAS last_seen for concept recency; IN last_confirmed for edge recency. Decay adjusts injection priority, never deletes knowledge |
| Recollection depth | Depth = 1, no chain traversal. Single flat query against URD for direct edges of each salient concept. The property-loop problem is dissolved by design: self-referential chains cannot arise without traversal. |
| Write path blocking | Fully async: cloud LLM calls are queued in the background and never block the proxy response |
| Dimension taxonomy | Seed list of 6 orthogonal dimensions; decomposes on demand via nightly resolution job when an ISA+ISA collision is queued |
| Collision semantics | ISA+ISA → dimension too coarse → decompose. ISPART+ISPART → factual contradiction → arbitrate. ISA+ISPART → misclassification → flag |
| Contradiction resistance | Structural: unique index on (concept_id, dimension_id) makes conflicting facts physically uninsertable in the same dimension |
| New dimensions | Emerge as SOAS tokens; no schema change needed; root nodes created at decomposition time |
| Conflict resolution timing | Immediate queue insert on collision; nightly job drains the queue via cloud LLM; outcomes are decompose / update / dismiss |
| Database | Postgres from day one, no SQLite POC. English dictionary bulk-loaded into SOAS at container init. |
| Tokenisation | Tokens lowercased, no minimum length (see Token length). Consecutive capitalised tokens merged into a single underscore-joined token (Glitch University → glitch_university). |
| Relationship cue parsing | ISA and ISPART keyword patterns in intercepted text trigger direct triple extraction, bypassing the saliency threshold. The "of {Z}" modifier sets the dimension. |
| Cue-triggered writes | Explicit cue patterns are high-confidence signals; extracted triples go straight to the write queue with source: inferred, no threshold gate. |
| Zero-hit recollection | Salient concepts with no URD entries emit a "? concept: no recollection" prompt block instructing the agent to clarify or store the fact via gutask iknowthat. |
| Manual write path | gutask iknowthat 'X -isa/-ispart Y in context of Z' inserts directly into URD with confidence=1.0, source=gutask, bypassing saliency threshold and cloud LLM. |
| In-memory layer | SOAS and URD cached in Python dicts at startup. Reads are zero-network. IDs always originate from Postgres. Collision detection uses dict[(concept_id, dim_id)] mirroring the Postgres unique index. Postgres is the safety net for race conditions. |
| Token length | No minimum: all tokens indexed. Frequency and novelty determine whether they surface. Optimize later if needed. |
| Nightly job execution | Background thread inside proxy process. Triggered by cron schedule (config table) or manually via POST /resolve/run + admin UI button. |
| Recollection injection | Prepended to existing system message content. If no system message exists, a new one is inserted at position 0. System message position provides the strongest grounding anchor for instruction-tuned models. |
| LLM configuration | models table (provider, model_name, api_key). config table keys write_model_id and resolve_model_id select which model each purpose uses. Supports claude and openai providers. |
| Source values | cloud_llm (saliency write path), inferred (cue pattern extraction), festinger (nightly resolution job), gutask (gutask iknowthat command). |
| gutask iknowthat interface | gutasktool at /Users/jenstandstad/Projects/gutasktool. Command POSTs to Festinger's /iknowthat endpoint; no direct Postgres access from gutasktool. |

Test Cases

The in-memory cache layer enables full unit testing without a live database. Tests pre-populate soas_by_token, soas_by_id, urd_by_concept, urd_by_concept_dim, and pending_conflicts directly, then exercise the logic under test and assert on the resulting state.


Test A — Prompt includes a concept not in the cache

Scenario: An agent sends a prompt referencing gnommoweb. The concept exists nowhere in SOAS or URD.

Setup:

soas_by_token = {}   # empty — concept is completely unknown
urd_by_concept = {}
urd_by_concept_dim = {}
pending_conflicts = set()

Input prompt: "Please update gnommoweb to use FastAPI instead"

Expected behaviour:

  1. Tokeniser extracts: gnommoweb (9 chars ✓), fastapi (7 chars ✓), please (6 chars but common English → saliency 0), update (6 chars, common → saliency 0), instead (7 chars, common → saliency 0)
  2. gnommoweb and fastapi not in soas_by_token → both are new tokens. In a test with Postgres mocked, the mock returns id=101 for gnommoweb and id=102 for fastapi. Both added to SOAS dicts.
  3. Saliency for both: log(1) — first encounter, below read threshold
  4. URD lookup: skipped (below threshold) — no recollection block emitted for these concepts

Edge case variant — above threshold: pre-seed soas_by_token["gnommoweb"] with encounter_count=50, saliency=0.9 (above read threshold) but keep urd_by_concept empty.

Expected behaviour (variant):

  1. Saliency lookup: above read threshold ✓
  2. URD lookup: urd_by_concept.get(101, []) → empty list
  3. Zero-hit path triggered
  4. Recollection block contains:
? gnommoweb: no recollection. If not a typo, store it before proceeding:
  gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
  gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'

Assertions:

  • pending_conflicts unchanged (no collision occurred)
  • Resolution queue empty
  • Recollection block contains ? gnommoweb
  • Prompt forwarded to Ollama with recollection block prepended
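
The last assertion depends on how the block is attached to the request. Below is a minimal sketch of the rule recorded under Resolved Design Decisions (prepend to the existing system message, otherwise insert a new one at position 0). It assumes a chat-style request body with a `messages` list of `{role, content}` objects; `inject_recollections` is a hypothetical name and the blank-line separator is an assumption.

```python
import copy


def inject_recollections(payload: dict, block: str) -> dict:
    """Return a copy of a chat-style request with the recollection block injected."""
    if not block:
        return payload   # nothing to inject; forward the request untouched
    out = copy.deepcopy(payload)   # never mutate the caller's request
    messages = out.setdefault("messages", [])
    for message in messages:
        if message.get("role") == "system":
            # Prepend to the first existing system message.
            message["content"] = block + "\n\n" + message.get("content", "")
            break
    else:
        # No system message found: create one at position 0.
        messages.insert(0, {"role": "system", "content": block})
    return out
```

Working on a deep copy keeps the original request available for logging or retry without the injected text.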

Test B — Prompt includes "A ISA B" conflicting with existing world model

Scenario: The world model already holds gnommoweb ISA repo in the type dimension. A new prompt contains the explicit cue "gnommoweb is a container", creating an ISA+ISA collision.

Setup:

# SOAS
soas_by_token = {
    "gnommoweb":  SoasRow(id=101, saliency=1.2, novelty=1.0),
    "repo":       SoasRow(id=201, saliency=0.8, novelty=0.5),
    "container":  SoasRow(id=202, saliency=0.6, novelty=0.4),
    "type":       SoasRow(id=1,   saliency=0.0, novelty=0.0),  # seed dimension
}
soas_by_id = {101: "gnommoweb", 201: "repo", 202: "container", 1: "type"}

# URD — gnommoweb ISA repo IN type (existing confirmed fact)
existing_edge = UrdEdge(concept_id=101, parent_id=201, dim_id=1,
                        is_isa=True, confidence=0.9, source="cloud_llm")
urd_by_concept     = {101: [existing_edge]}
urd_by_concept_dim = {(101, 1): existing_edge}
pending_conflicts  = set()

Input prompt: "gnommoweb is a container deployed on Docker"

Expected behaviour:

  1. Cue scanner matches "is a" pattern: extracts triple (gnommoweb, container, type, is_isa=True)
  2. Collision check: (101, 1) found in urd_by_concept_dim → collision
  3. Existing edge is_isa=True, incoming is_isa=True → ISA+ISA collision type
  4. Resolution queue insert (mocked Postgres): {concept_id=101, existing_parent_id=201, incoming_parent_id=202, dim_id=1, collision_type="isa_isa", status="pending"}
  5. pending_conflicts.add(101)
  6. No URD modification — world model unchanged

Recollection block for this prompt:

<recollection>
gnommoweb: [type?] repo — conflict pending
</recollection>

Assertions:

  • urd_by_concept_dim[(101, 1)] still points to the original repo edge (unchanged)
  • pending_conflicts == {101}
  • Resolution queue has exactly one entry with collision_type="isa_isa" and status="pending"
  • No Postgres URD insert attempted
  • Recollection renders [type?] marker for gnommoweb
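
Test B's flow (cue match, dict-based collision check, queue insert) can be sketched against plain in-memory structures. Everything here is illustrative: `ISA_CUE` covers only the "is a / is an" pattern, `handle_isa_cue` and `classify_collision` are hypothetical names, the queue is a plain list standing in for the Postgres table, and only the "isa_isa" label appears in this specification. The other two labels, and treating a restated identical fact as a non-collision, are assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


# Hypothetical ISA cue: "<subject> is a <parent>" or "<subject> is an <parent>".
ISA_CUE = re.compile(r"\b(?P<subject>\w+) is an? (?P<parent>\w+)", re.IGNORECASE)


def classify_collision(existing_is_isa: bool, incoming_is_isa: bool) -> str:
    if existing_is_isa and incoming_is_isa:
        return "isa_isa"
    if not existing_is_isa and not incoming_is_isa:
        return "ispart_ispart"   # label is an assumption
    return "isa_ispart"          # label is an assumption


def handle_isa_cue(prompt: str, soas_ids: dict, urd_by_concept_dim: dict,
                   pending_conflicts: set, queue: list,
                   type_dim_id: int = 1) -> Optional[UrdEdge]:
    match = ISA_CUE.search(prompt)
    if match is None:
        return None
    # Sketch only: both tokens are assumed to exist in SOAS already.
    concept_id = soas_ids[match.group("subject").lower()]
    parent_id = soas_ids[match.group("parent").lower()]
    key = (concept_id, type_dim_id)
    existing = urd_by_concept_dim.get(key)
    if existing is None:
        edge = UrdEdge(concept_id, parent_id, type_dim_id, True)
        urd_by_concept_dim[key] = edge   # no collision: the edge is stored
        return edge
    if existing.parent_id == parent_id:
        return existing   # same fact restated; assumed not to be a collision
    # Collision: queue it and leave the world model untouched.
    queue.append({
        "concept_id": concept_id,
        "existing_parent_id": existing.parent_id,
        "incoming_parent_id": parent_id,
        "dim_id": type_dim_id,
        "collision_type": classify_collision(existing.is_isa, True),
        "status": "pending",
    })
    pending_conflicts.add(concept_id)
    return None
```

Run against the Test B setup, the call leaves `urd_by_concept_dim[(101, 1)]` pointing at the repo edge and produces exactly one pending queue entry.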

Test C — Nightly job: queue processing and dimension decomposition

Scenario: The resolution queue contains the ISA+ISA collision from Test B. The nightly job runs, calls the cloud LLM (mocked), receives a decomposition decision, and updates the world model.

Setup: state as at end of Test B, plus:

resolution_queue = [
    QueueEntry(id=1, concept_id=101, existing_parent_id=201, incoming_parent_id=202,
               dim_id=1, collision_type="isa_isa", status="pending")
]

Mocked cloud LLM response:

{
  "decision": "decompose",
  "existing_dimension": "artifact-type",
  "new_dimension": "deployment-type",
  "reasoning": "repo describes what gnommoweb is as a software artifact; container describes how it is deployed"
}

Expected behaviour:

  1. Nightly job fetches pending queue entries from Postgres
  2. For entry id=1, calls cloud LLM → receives decompose decision
  3. New dimensions: insert artifact-type into Postgres SOAS → returns id=401; insert deployment-type → returns id=402. Add both to SOAS dicts.
  4. Root nodes: insert (401,401,401) and (402,402,402) into Postgres URD (self-referential dimension roots)
  5. Migrate existing edge: delete (101, 201, 1) from URD, insert (101, 201, 401) — gnommoweb ISA repo IN artifact-type
  6. Insert new edge: insert (101, 202, 402) — gnommoweb ISA container IN deployment-type
  7. Update in-memory cache:
    • soas_by_token["artifact-type"] = SoasRow(id=401, ...)
    • soas_by_token["deployment-type"] = SoasRow(id=402, ...)
    • Remove urd_by_concept_dim[(101, 1)], add [(101, 401)] and [(101, 402)]
    • Update urd_by_concept[101] to two new edges
    • pending_conflicts.discard(101)
  8. Mark queue entry resolved with resolution JSON
  9. Signal proxy /reload (or nightly job updates cache directly if in-process)

Final recollection block for gnommoweb:

<recollection>
gnommoweb: [artifact-type] repo  [deployment-type] container
</recollection>

Assertions:

  • urd_by_concept_dim contains (101, 401) and (101, 402), not (101, 1)
  • urd_by_concept[101] has exactly two edges
  • pending_conflicts does not contain 101
  • soas_by_token contains artifact-type and deployment-type with ids from Postgres
  • Resolution queue entry has status="resolved" and non-null resolution JSON
  • No [type?] marker in subsequent recollection for gnommoweb
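
The cache side of the decompose step (step 7 above) could be sketched as follows. `apply_decompose` is a hypothetical name; the new dimension IDs are passed in because they originate from Postgres, the queue entry is a plain dict, and the sketch handles only the ISA+ISA case, so the incoming edge is always created with `is_isa=True`.

```python
from dataclasses import dataclass


@dataclass
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


def apply_decompose(entry: dict, existing_dim_id: int, new_dim_id: int,
                    urd_by_concept: dict, urd_by_concept_dim: dict,
                    pending_conflicts: set) -> None:
    """Split one contested edge into two edges in two new dimensions (cache only)."""
    concept = entry["concept_id"]
    old_dim = entry["dim_id"]
    # Remove the contested edge from the (concept, dim) index.
    old = urd_by_concept_dim.pop((concept, old_dim))
    # Existing fact moves to the first new dimension.
    migrated = UrdEdge(concept, old.parent_id, existing_dim_id, old.is_isa)
    # Incoming fact lands in the second new dimension (ISA+ISA case only).
    added = UrdEdge(concept, entry["incoming_parent_id"], new_dim_id, True)
    urd_by_concept_dim[(concept, existing_dim_id)] = migrated
    urd_by_concept_dim[(concept, new_dim_id)] = added
    # Keep edges in other dimensions, replace the contested one.
    kept = [e for e in urd_by_concept.get(concept, []) if e.dim_id != old_dim]
    urd_by_concept[concept] = kept + [migrated, added]
    pending_conflicts.discard(concept)
    entry["status"] = "resolved"
```

In the Test C scenario this would be called with `existing_dim_id=401` and `new_dim_id=402` after the Postgres inserts have returned those IDs.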