glitch-university/agent0

gitprov 8ff73d32ae Adding Festinger with wordnet

2026-04-19 16:16:13 +02:00


Festinger — Agent0 Inference Middleware

Status: In progress — iterative specification
Owner: jenstandstad
Location: plugins/festinger/

Named after Leon Festinger (1919–1989), the social psychologist who introduced the theory of cognitive dissonance in 1957. Festinger observed that people cannot comfortably hold contradictory beliefs simultaneously, and that the resulting tension drives them to resolve it. This system applies the same principle to agent memory.

Purpose

Festinger is an Ollama-compatible HTTP proxy that sits between Agent0's agent-zero containers and the local Ollama inference endpoint. It solves two related problems that emerge with local inference:

Reasoning loops — agents repeat the same output and cannot break out, even when the framework tells them to try something else.
Stale and incoherent memory — Agent Zero's FAISS-based memory accumulates facts without contradiction detection, causing agents to act on outdated or conflicting beliefs.

Festinger addresses both at the inference layer, transparently, without modifying agent-zero internals. The memory layer it introduces is called Recollections — short, structured, non-contradicting facts injected spontaneously into every prompt as context enrichment. Agents do not search for recollections; they appear automatically.

Like its namesake's theory, Festinger treats contradiction not as an error to suppress but as a signal to act on.

Architecture

agent-zero containers
        │
        ▼
┌───────────────────────────────────────────┐
│            Festinger Proxy               │
│                                           │
│  ┌─────────────┐    ┌──────────────────┐  │
│  │ Loop        │    │ Saliency Engine  │  │
│  │ Detector    │    │ (tokenise+score) │  │
│  └─────────────┘    └────────┬─────────┘  │
│                              │            │
│                    ┌─────────▼─────────┐  │
│                    │  SOAS             │  │
│                    │  concept vocab    │  │
│                    │  + saliency store │  │
│                    └─────────┬─────────┘  │
│                              │            │
│          ┌───────────────────┤            │
│          │                   │            │
│  ┌───────▼──────┐  ┌────────▼─────────┐  │
│  │ Recollection │  │ Memory Writer    │  │
│  │ Engine       │  │ (cloud LLM +     │  │
│  │ (read IN  →  │  │  NL → IN parser) │  │
│  │  inject      │  └────────┬─────────┘  │
│  │  <recollec-  │           │            │
│  │  tion> block)│  ┌────────▼─────────┐  │
│  └──────────────┘  │ Conflict         │  │
│          │         │ Resolver         │  │
│          │         └────────┬─────────┘  │
│          └──────────┬───────┘            │
│                     │                    │
│                    ┌▼──────────────────┐  │
│                    │  IN table         │  │
│                    │  acyclic concept  │  │
│                    │  graph            │  │
│                    └───────────────────┘  │
└───────────────────────────────────────┬──┘
                                        │
                                        ▼
                                Ollama (host)
                                        │
                                        ▼ (write path only)
                                Cloud LLM API

Data Model

models — LLM provider configuration

Festinger uses a cloud LLM for two purposes: the saliency-triggered write path and the nightly resolution job. Each purpose can use a different model.

Column	Type	Notes
id	SERIAL PK
provider	VARCHAR	`claude` or `openai`
model_name	VARCHAR	e.g. `claude-opus-4-6`, `gpt-4o`
api_key	VARCHAR	stored encrypted at rest
created_at	TIMESTAMPTZ

config — runtime configuration

Key-value store for settings changeable without redeployment.

Key	Default	Purpose
`write_model_id`	—	FK into models; used for saliency-triggered write path
`resolve_model_id`	—	FK into models; used for nightly resolution job
`saliency_read_threshold`	`0.5`	Minimum saliency to trigger recollection lookup
`saliency_write_threshold`	`1.2`	Minimum saliency to trigger cloud LLM write
`recollection_confidence_floor`	`0.6`	Minimum URD edge confidence to include in recollection
`recollection_recency_days`	`90`	URD edges older than this are excluded
`resolution_schedule`	`0 2 * * *`	Cron expression for nightly resolution job

SOAS — concept vocabulary and saliency

One row per token. No minimum token length — all tokens are indexed; frequency and novelty determine whether they surface. Common English words are pre-seeded at saliency 0 via a dictionary corpus. Dimensions are themselves SOAS concepts — no separate table needed.

Column	Type	Notes
id	INT PK	auto-increment
token	VARCHAR	unique, lowercase normalised
encounter_count	INT	raw count across all intercepted prompts
last_seen	TIMESTAMP	for recency tracking
saliency	FLOAT	log-scaled encounter saliency; 0 = common English
novelty	FLOAT	domain-specificity score; set to 1.0 when first confirmed by cloud LLM write path, 0 for pre-seeded dictionary words

Saliency and novelty are distinct scores that serve different purposes:

saliency measures how frequently a concept appears — it drives read-threshold triggering
novelty measures how domain-specific a concept is — a system name that rarely appears but is clearly project-specific should still be treated as important

URD table — the acyclic concept graph

Named urd (the SQL reserved word IN cannot be used as a table name). All three FK columns reference SOAS, so dimensions are concepts in the same vocabulary — new dimensions emerge from the same token space without schema changes.

Column	Type	Notes
id	INT FK	references SOAS — the concept being placed
parent_id	INT FK	references SOAS — the containing concept
dim_id	INT FK	references SOAS — which dimension this edge belongs to
is_isa	BOOLEAN	true = ISA (type/classification); false = ISPART (membership/containment). Outside the index — does not affect collision detection but drives conflict resolution semantics and recollection rendering
confidence	FLOAT	reliability of this edge, 0.0–1.0; set by cloud LLM at write time, updated by conflict resolver
last_confirmed	TIMESTAMPTZ	when was this edge last corroborated by an intercepted prompt; used for recency decay in recollection injection
source	VARCHAR	`cloud_llm` (saliency write path), `inferred` (cue pattern), `festinger` (resolution job), `gutask` (gutask iknowthat)

Index structure:

PK: (id, parent_id, dim_id) — full triple, prevents duplicate edges
Unique index: (id, dim_id) — one parent per concept per dimension; this is the acyclicity and contradiction-resistance mechanism
Root nodes: rows where id = parent_id = dim_id — the named root of a dimension tree; the one allowed self-reference

Single relation: IN — "this concept is semantically contained within that concept, within this dimension." ISA and ISPART are not separate relations or tables; they are the same IN relation, with is_isa annotating which flavour the edge represents.

In-Memory Cache Layer

The proxy maintains five in-memory structures, populated at startup from Postgres. All read operations hit these structures only, so the hot path involves no network round trip. Saliency updates are applied in memory first and flushed to Postgres asynchronously; SOAS and URD inserts go to Postgres synchronously and update the in-memory structures once the insert succeeds.

# SOAS — primary lookup by token string (mirrors UNIQUE index on token)
soas_by_token: dict[str, SoasRow]

# SOAS — reverse lookup by id (for pre-joining URD results)
soas_by_id: dict[int, str]

# URD — recollection reads: concept_id → list of edges (tokens pre-joined)
urd_by_concept: dict[int, list[UrdEdge]]

# URD — collision detection: (concept_id, dim_id) → edge
# Mirrors the Postgres UNIQUE index on (id, dim_id) exactly
urd_by_concept_dim: dict[tuple[int, int], UrdEdge]

# Resolution queue — concepts with pending conflicts (for ? marker in recollections)
pending_conflicts: set[int]

New SOAS token flow — ids always originate from Postgres:

token not in soas_by_token
  → INSERT into Postgres SOAS → Postgres returns auto-increment id
  → add to soas_by_token[token] and soas_by_id[id]
  → proceed with that id

URD insert flow — collision detected in-memory, Postgres is the safety net:

key = (concept_id, dim_id)
if key in urd_by_concept_dim:
    → collision detected in-memory
    → classify type (is_isa flags), route to resolution queue
    → return — no Postgres write attempted
else:
    → INSERT into Postgres URD
    → on success: update urd_by_concept[concept_id] and urd_by_concept_dim[key]
    → on UniqueViolation (race condition): reload row, route to resolution queue
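
The URD insert flow above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the `UrdCache` class and its `resolution_queue` list are hypothetical names, the Postgres INSERT is reduced to a comment, and treating a restated identical edge as a no-op is an assumption the specification does not spell out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


@dataclass
class UrdCache:
    """Hypothetical in-memory mirror of the two URD indexes."""
    urd_by_concept: dict[int, list[UrdEdge]] = field(default_factory=dict)
    urd_by_concept_dim: dict[tuple[int, int], UrdEdge] = field(default_factory=dict)
    resolution_queue: list[tuple[UrdEdge, UrdEdge]] = field(default_factory=list)

    def insert(self, edge: UrdEdge) -> bool:
        """Return True if the edge is stored, False if it was queued as a conflict."""
        key = (edge.concept_id, edge.dim_id)
        existing = self.urd_by_concept_dim.get(key)
        if existing is not None:
            if existing.parent_id == edge.parent_id:
                # Assumption: the same fact restated is not a conflict.
                return True
            # Collision detected in memory: no Postgres write is attempted.
            self.resolution_queue.append((existing, edge))
            return False
        # A real implementation would INSERT into Postgres here and only
        # update the dicts on success (UniqueViolation -> resolution queue).
        self.urd_by_concept.setdefault(edge.concept_id, []).append(edge)
        self.urd_by_concept_dim[key] = edge
        return True
```

Because the structures are plain dicts, a test can populate them directly and exercise collision routing without a database.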

Saliency update flow — batched to avoid per-token Postgres writes:

every token encounter → update soas_by_token[token].encounter_count in-memory
every 30 seconds      → flush encounter count deltas to Postgres in one batch UPDATE
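
One way to picture the batching is a buffer that keeps the live counts and a separate set of unflushed deltas. The sketch below assumes hypothetical names (`EncounterBuffer`, `record`, `flush`) and leaves out the 30-second timer and the actual batch UPDATE; `flush` only returns the rows such an UPDATE would carry.

```python
from collections import Counter


class EncounterBuffer:
    """Hypothetical buffer: live counts plus deltas not yet written to Postgres."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}
        self.pending = Counter()

    def record(self, token: str) -> None:
        # Hot path: memory only, no database round trip.
        self.counts[token] = self.counts.get(token, 0) + 1
        self.pending[token] += 1

    def flush(self) -> list[tuple[str, int]]:
        # Called by a periodic task; returns (token, delta) rows for one batch
        # UPDATE and resets the deltas. The live counts are left untouched.
        batch = sorted(self.pending.items())
        self.pending.clear()
        return batch
```

Flushing deltas rather than absolute counts keeps the UPDATE additive, so a flush that is retried late cannot overwrite a newer total.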

Cache reload after nightly job — nightly job POSTs to /reload endpoint on proxy:

proxy receives /reload
  → re-SELECT all URD rows from Postgres with pre-joined tokens
  → rebuild urd_by_concept and urd_by_concept_dim
  → rebuild pending_conflicts from resolution queue
  → SOAS dict unchanged (nightly job does not modify SOAS)

This separation means tests can inject mock data directly into the dicts without touching Postgres, enabling full unit testing of collision detection, recollection rendering, and queue routing.

Canonical Recollection Query

The recollection engine executes this query for each salient concept found in an intercepted prompt. No chain traversal — depth is always 1. The result is a flat enumeration of all edges where the concept is the subject, across all dimensions.

SELECT
    u.id,
    u.parent_id,
    u.dim_id,
    u.is_isa,
    p.token  AS parent_token,
    d.token  AS dim_token
FROM urd u
INNER JOIN soas p ON p.id = u.parent_id
INNER JOIN soas d ON d.id = u.dim_id
WHERE u.id = $1
  AND u.confidence >= $2
  AND u.last_confirmed >= $3
ORDER BY u.id, u.dim_id DESC;

$1 = SOAS id of the concept, $2 = confidence floor (from config), $3 = recency cutoff (from config).
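
Since reads are served from memory, the same filter can be expressed over cached edges. The following is a sketch under assumptions: `CachedEdge` and `select_edges` are illustrative names, and the defaults simply mirror `recollection_confidence_floor` and `recollection_recency_days` from the config table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class CachedEdge:
    parent_token: str
    dim_token: str
    dim_id: int
    confidence: float
    last_confirmed: datetime


def select_edges(
    edges: list[CachedEdge],
    confidence_floor: float = 0.6,
    recency_days: int = 90,
    now: Optional[datetime] = None,
) -> list[CachedEdge]:
    """In-memory equivalent of the WHERE and ORDER BY clauses of the query."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=recency_days)  # plays the role of $3
    kept = [
        e for e in edges
        if e.confidence >= confidence_floor and e.last_confirmed >= cutoff
    ]
    return sorted(kept, key=lambda e: e.dim_id, reverse=True)
```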

Injection position: the recollection block is prepended to the content of the existing system message if one is present. If no system message exists in the messages array, a new {"role": "system"} message containing only the recollection block is inserted at position 0. The system message is the highest-attention position in most instruction-tuned models — it is where grounding facts anchor most reliably.
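
The injection rule can be sketched as a small pure function over the chat `messages` array. The function name `inject_recollection` and the blank-line separator between the block and the existing system text are assumptions made for this example.

```python
def inject_recollection(messages: list[dict], block: str) -> list[dict]:
    """Return a copy of messages with the recollection block injected."""
    out = [dict(m) for m in messages]  # shallow copies; the input is not mutated
    for message in out:
        if message.get("role") == "system":
            # Prepend to the existing system message.
            message["content"] = f"{block}\n\n{message.get('content', '')}"
            return out
    # No system message present: insert a new one at position 0.
    out.insert(0, {"role": "system", "content": block})
    return out
```

Returning a copy keeps the intercepted request intact, which makes it easy to log the original and the enriched prompt side by side.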

Rendering: iterate rows; for each row emit [dim_token] parent_token. If the concept also has a pending entry in the resolution queue, append ? to that dimension token. Group by concept when multiple concepts are queried in a single block.

Zero-hit rendering: if a concept is above the read threshold but has no URD entries, it is a salient domain-specific term the world model has not yet encountered. Instead of silently omitting it, the recollection block emits an explicit prompt to the agent:

? gnommoweb: no recollection. If this is a typo, ignore.
  If you know what it is, store it before proceeding:
    gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
    gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'

This turns every unknown salient concept into an active instruction. The agent either confirms it is a typo, asks for clarification from a human or peer agent, or fills the gap itself via gutask iknowthat. The world model grows organically through use.

Full example recollection block:

<recollection>
gnommoweb: [glitch_university] repo  [geography] ramanujan  [type?] service
dobby: [agent_pool] worker  [tech] python
? ramanujan: no recollection. If you know what it is, store it before proceeding:
    gutask iknowthat 'ramanujan -isa <parent> in context of <dimension>'
    gutask iknowthat 'ramanujan -ispart <system> in context of <dimension>'
</recollection>

[type?] signals a pending conflict on that dimension — the world model is not wrong, resolution is in progress.
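
A renderer for this format might look like the sketch below. It assumes the caller has already grouped edges per concept as `(dim_token, parent_token, pending)` tuples, with an empty list meaning a zero-hit concept; the function name and that input shape are illustrative, not taken from the codebase.

```python
def render_recollection(concepts: dict[str, list[tuple[str, str, bool]]]) -> str:
    """Render hit and zero-hit concepts into one <recollection> block."""
    lines = ["<recollection>"]
    for concept, edges in concepts.items():
        if not edges:
            # Zero-hit path: turn the unknown concept into an instruction.
            lines.append(
                f"? {concept}: no recollection. "
                "If you know what it is, store it before proceeding:"
            )
            lines.append(f"    gutask iknowthat '{concept} -isa <parent> in context of <dimension>'")
            lines.append(f"    gutask iknowthat '{concept} -ispart <system> in context of <dimension>'")
            continue
        # Hit path: [dim_token] parent_token, with ? on pending dimensions.
        parts = [
            f"[{dim}{'?' if pending else ''}] {parent}"
            for dim, parent, pending in edges
        ]
        lines.append(f"{concept}: " + "  ".join(parts))
    lines.append("</recollection>")
    return "\n".join(lines)
```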

gutask iknowthat — Manual Write Path

gutask iknowthat is the highest-confidence write path into URD. It bypasses the saliency threshold and the cloud LLM entirely.

Location: /Users/jenstandstad/Projects/gutasktool (sibling of Agent0 repo). The command POSTs to Festinger's /iknowthat HTTP endpoint — gutasktool does not connect to Postgres directly.

Syntax:

gutask iknowthat 'gnommoweb -isa repo in context of glitch_university'
gutask iknowthat 'gnommoweb -ispart glitch_university in context of membership'

-isa sets is_isa=true; -ispart sets is_isa=false
in context of <dimension> specifies the dimension token; defaults to type for -isa, membership for -ispart
All tokens run through the standard tokeniser (compound token rule applies)
Inserts into SOAS if any token is new
Inserts into URD with confidence=1.0, source=gutask
On collision: enters resolution queue with priority flag; gutask-source conflicts are reviewed first at /conflicts

Festinger /iknowthat endpoint: accepts POST with JSON body {fact: string}. Parses, tokenises, writes to SOAS/URD, returns the inserted or conflicted result. This decouples gutasktool from the Festinger schema.
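
Parsing the fact string could be done with a single regular expression. This sketch assumes single-word subject and parent tokens (the compound token rule is not applied here), and `Fact` and `parse_fact` are hypothetical names rather than the endpoint's real internals.

```python
import re
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


_FACT_RE = re.compile(
    r"^\s*(?P<subject>\S+)\s+-(?P<rel>isa|ispart)\s+(?P<parent>\S+)"
    r"(?:\s+in\s+context\s+of\s+(?P<dim>\S+))?\s*$"
)


def parse_fact(fact: str) -> Fact:
    """Parse "<subject> -isa|-ispart <parent> [in context of <dimension>]"."""
    m = _FACT_RE.match(fact)
    if m is None:
        raise ValueError(f"unrecognised fact: {fact!r}")
    is_isa = m["rel"] == "isa"
    # Documented defaults: type for -isa, membership for -ispart.
    dimension = m["dim"] or ("type" if is_isa else "membership")
    return Fact(m["subject"].lower(), m["parent"].lower(), dimension.lower(), is_isa)
```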

This command is the agent's direct interface to the world model. When the recollection block surfaces a zero-hit concept, gutask iknowthat is the prescribed response.

The IN Relation and ISA/ISPART

Why a single operator

ISA and ISPART are both instances of a more general semantic containment relation. The dimension carries the semantic weight:

gnommoweb IN repo (dimension: type) → gnommoweb ISA repo
gnommoweb IN Glitch-University (dimension: membership) → gnommoweb ISPART Glitch-University
State IN Country (dimension: type) → State ISA country-level subdivision

The type dimension IS the ISA relation. Every other dimension IS an ISPART relation scoped to that domain. One table, one index, one operator.

Why this resolves the bleed

The classic bleed case: Michigan ISA State, Michigan ISPART USA, State ISPART Country.

With the single IN operator and dimensions:

Michigan  IN  State        (dimension: type)       is_isa: true
Michigan  IN  USA          (dimension: geography)   is_isa: false
State     IN  Country      (dimension: type)        is_isa: true
USA       IN  Country      (dimension: type)        is_isa: true

Two coherent chains, no collision, no ambiguity. "State ISPART Country" (class-level generalisation) becomes "State IN Country in the type dimension" — a perfectly valid ISA statement: State is a kind of country-level subdivision.

Collision Semantics — The `is_isa` Flag

The unique index on (concept_id, dimension_id) fires when a concept already has a parent in a given dimension. The is_isa flag on both the existing and incoming rows determines what the collision means:

Existing	Incoming	Interpretation	Action
ISA	ISA	Dimension too coarse — both facts simultaneously true about the concept's nature	Trigger dimension decomposition
ISPART	ISPART	Factual contradiction — thing can only be in one place per dimension	Trigger arbitration (which is correct?)
ISA	ISPART	Dimension misclassification — these should have been in different dimensions	Flag as misclassification, suggest correct dimension
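
The table reduces to a two-flag classification. In the sketch below the enum values are the `collision_type` values from the resolution queue schema; treating a mixed pair as a misclassification in either order (ISA then ISPART, or ISPART then ISA) is an assumption, since the table lists only one order.

```python
from enum import Enum


class CollisionType(str, Enum):
    ISA_ISA = "isa_isa"
    ISPART_ISPART = "ispart_ispart"
    MISCLASSIFICATION = "misclassification"


def classify_collision(existing_is_isa: bool, incoming_is_isa: bool) -> CollisionType:
    """Map the is_isa flags of the two colliding rows to a collision type."""
    if existing_is_isa and incoming_is_isa:
        return CollisionType.ISA_ISA  # dimension too coarse: decompose
    if not existing_is_isa and not incoming_is_isa:
        return CollisionType.ISPART_ISPART  # factual contradiction: arbitrate
    # Assumption: mixed flags count as misclassification in either order.
    return CollisionType.MISCLASSIFICATION
```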

Dimension Decomposition — Dimensions as Evolving Vocabulary

When an ISA+ISA collision occurs, the dimension is too coarse to hold both simultaneously-true facts. Example:

Existing: gnommoweb IN container (dimension: type, is_isa: true)
Incoming: gnommoweb IN repo (dimension: type, is_isa: true)

Both are true. The type dimension cannot hold them both. The conflict resolver sends this prompt to the cloud LLM:

"gnommoweb is already container in dimension type. New fact also places gnommoweb as repo in type. If both are simultaneously true, propose two more specific dimension names to replace type — one where gnommoweb as container remains valid, one where gnommoweb as repo is valid. Return JSON: {"existing_dimension": "...", "new_dimension": "..."}. Choose from this taxonomy where possible: [...]. Create new names only if nothing fits."

The LLM might return: {"existing_dimension": "deployment-type", "new_dimension": "artifact-type"}.

The system then:

Inserts deployment-type and artifact-type into SOAS (if not present)
Creates root nodes for each new dimension
Inserts the new fact under artifact-type
Leaves the existing type facts untouched — no migration
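
The four steps can be sketched against in-memory stand-ins. Everything here is illustrative: `soas` maps token to id, `edges` maps `(id, dim_id)` to `parent_id`, and `ensure_token` imitates the Postgres auto-increment id with a local counter, which a real implementation would not do.

```python
import json


def ensure_token(soas: dict[str, int], token: str) -> int:
    """Stand-in for the Postgres INSERT that returns the auto-increment id."""
    if token not in soas:
        soas[token] = max(soas.values(), default=0) + 1
    return soas[token]


def apply_decomposition(
    reply: str,
    soas: dict[str, int],
    edges: dict[tuple[int, int], int],
    concept: str,
    incoming_parent: str,
) -> None:
    """Apply the LLM's decomposition decision for an ISA+ISA collision."""
    decision = json.loads(reply)
    existing_dim = decision["existing_dimension"]
    new_dim = decision["new_dimension"]
    if not existing_dim or not new_dim or existing_dim == new_dim:
        raise ValueError("decomposition must name two distinct dimensions")
    concept_id = ensure_token(soas, concept)
    parent_id = ensure_token(soas, incoming_parent)
    for dim in (existing_dim, new_dim):
        dim_id = ensure_token(soas, dim)
        edges.setdefault((dim_id, dim_id), dim_id)  # root: id = parent_id = dim_id
    key = (concept_id, soas[new_dim])
    if key in edges:
        raise ValueError("collision in the proposed dimension")
    edges[key] = parent_id  # existing edges under the old dimension stay untouched
```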

Dimensions are SOAS concepts. New dimensions emerge from the same token vocabulary. The graph grows its own taxonomy under pressure from real contradictions, starting coarse and decomposing on demand.

The Two Memory Operations

Tokenisation Rules

All text — prompts, system messages, agent outputs — passes through the same tokeniser before any saliency or relationship work is done.

Token extraction:

Split on whitespace and punctuation boundaries
Compound token rule: scan for runs of consecutive tokens where each begins with a capital letter. Merge the run into a single token, joined with underscores, then lowercase. This canonicalises proper nouns and multi-word concepts into single SOAS entries.
- Glitch University → glitch_university
- Agent Zero → agent_zero
- New York City → new_york_city
- A lowercase or short token breaks the run: in "the Glitch University", the lowercase "the" stays outside the run, so only Glitch University is merged → glitch_university
Lowercase all tokens
Keep tokens with 5 or more characters (strictly >4); discard shorter tokens unless they are part of a matched relationship cue pattern (see below)
Strip leading/trailing punctuation from each token

Example:
"gnommoweb is a repo of Glitch University"
→ tokens: gnommoweb, repo, glitch_university
→ relationship extracted: gnommoweb IN repo IN dim:glitch_university (is_isa=true, via "is a … of" pattern)
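
A minimal sketch of the extraction rules follows. It makes two assumptions: only a lowercase-initial token breaks a capitalised run (the "short token" case is not modelled), and the length filter is exposed as a `min_len` parameter so the caller decides which tokens to keep. Splitting on punctuation also means a hyphenated or dotted name is split into its parts.

```python
import re


def tokenise(text: str, min_len: int = 0) -> list[str]:
    """Split text into lowercase tokens, merging capitalised runs with underscores."""
    tokens: list[str] = []
    # Punctuation ends a segment, so it also ends any capitalised run.
    for segment in re.split(r"[^\w\s]+", text):
        run: list[str] = []
        for word in segment.split():
            if word[0].isupper():
                run.append(word)  # extend the current capitalised run
                continue
            if run:
                tokens.append("_".join(run).lower())
                run = []
            tokens.append(word.lower())
        if run:
            tokens.append("_".join(run).lower())
    return [t for t in tokens if len(t) >= min_len]
```

With `min_len=5` the example sentence yields only gnommoweb and glitch_university; a token such as repo would have to be retained separately when it is part of a matched relationship cue.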

Relationship Cue Patterns

Certain keyword patterns in intercepted text are direct cues to the memory layer that a semantic relationship is being expressed. The proxy scans every prompt for these patterns. When one matches, the relationship is extracted and queued for insertion into the IN table, bypassing the saliency threshold because the relationship has been stated explicitly.

Agents and humans interacting with Agent0 should be aware that using these patterns causes the proxy to build or update the world model.

ISA patterns (is_isa = true) — the subject is a type or instance of the object:

Pattern	Example
`{X} is a {Y}`	gnommoweb is a repo
`{X} is an {Y}`	gnommoweb is an API
`{X} ISA {Y}`	gnommoweb ISA repo
`{X} is a kind of {Y}`	State is a kind of region
`{X} is a type of {Y}`	gnommoweb is a type of service
`{X} is an instance of {Y}`	dobby is an instance of agent
`{X} kind of {Y}`	gnommoweb kind of repo
`{X} type of {Y}`	gnommoweb type of service
`{X} instance of {Y}`	dobby instance of agent

ISPART patterns (is_isa = false) — the subject is a member, part, or component of the object:

Pattern	Example
`{X} is part of {Y}`	gnommoweb is part of Glitch University
`{X} ISPART {Y}`	gnommoweb ISPART glitch_university
`{X} part of {Y}`	gnommoweb part of Agent0
`{X} belongs to {Y}`	gnommoweb belongs to Glitch University
`{X} is owned by {Y}`	gnommoweb is owned by jenstandstad
`{X} owned by {Y}`	gnommoweb owned by jenstandstad
`{X} member of {Y}`	dobby member of agent_pool
`{X} is a member of {Y}`	dobby is a member of agent_pool
`{X} runs on {Y}`	gnommoweb runs on Docker
`{X} hosted by {Y}`	gnommoweb hosted by ramanujan
`{X} deployed on {Y}`	gnommoweb deployed on Docker
`{X} contained in {Y}`	gnommoweb contained in agent0_stack

The of {Z} dimension modifier:

When an ISA pattern is followed by of {Z}, the named entity {Z} becomes the dimension for the extracted edge. This allows natural language to directly specify the context in which a classification holds:

"gnommoweb is a repo of Glitch University"
→ gnommoweb IN repo IN dim:glitch_university  (is_isa=true)

"Michigan is a state of USA"
→ michigan IN state IN dim:usa  (is_isa=true)

Without the of {Z} modifier, the dimension defaults to type for ISA patterns and the most appropriate seed dimension for ISPART patterns (inferred by the cloud LLM during the write step, or defaulting to membership).

Cue-triggered writes bypass the saliency threshold. Explicit relationship cues are treated as high-confidence signals regardless of how many times a concept has previously appeared. The extracted triple goes directly into the write queue with source: inferred and a confidence score assigned by the pattern type (exact keyword cues score higher than positional inference).
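
A cue scanner for a subset of these patterns could be written as an ordered list of regular expressions. The sketch assumes its input is already normalised (lowercased, with compound tokens merged), covers only some of the listed patterns, and returns the first match; `Cue` and `extract_cue` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Cue(NamedTuple):
    subject: str
    parent: str
    dimension: str
    is_isa: bool


# Order matters: specific ISPART cues such as "is a member of" must be tried
# before the generic ISA cue "is a", which would otherwise capture "member".
_PATTERNS = [
    (re.compile(r"\b(\w+) (?:is a )?member of (\w+)"), False),
    (re.compile(r"\b(\w+) (?:is )?part of (\w+)"), False),
    (re.compile(r"\b(\w+) ispart (\w+)"), False),
    (re.compile(r"\b(\w+) belongs to (\w+)"), False),
    (re.compile(r"\b(\w+) (?:is an? )?(?:kind|type|instance) of (\w+)"), True),
    (re.compile(r"\b(\w+) (?:is an?|isa) (\w+)(?: of (\w+))?"), True),
]


def extract_cue(text: str) -> Optional[Cue]:
    """Return the first relationship cue found in normalised text, if any."""
    for pattern, is_isa in _PATTERNS:
        m = pattern.search(text)
        if m is None:
            continue
        subject, parent, *rest = m.groups()
        modifier = rest[0] if rest else None  # the optional "of {Z}" dimension
        dimension = modifier or ("type" if is_isa else "membership")
        return Cue(subject, parent, dimension, is_isa)
    return None
```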

Writing — cloud-triggered, async

When a concept's saliency crosses the write threshold, Festinger queues it for background processing (cloud LLM calls must not block the prompt response path):

Call cloud LLM: "What is {concept}?" with the current dimension taxonomy as a closed list and a structured output prompt requesting (concept, parent, dimension, is_isa) triples
Parse response into IN table INSERT statements
Attempt inserts; route constraint violations to the conflict resolver
Update novelty score in SOAS for the concept

Reading — spontaneous prompt enrichment

On every intercepted prompt:

Tokenise the full prompt string, extract tokens >4 chars, normalise
Look up each token in SOAS, update encounter counts and saliency
For tokens above the read threshold, query the IN table for all edges involving the concept
Traverse each chain upward (configurable max depth)
Format as a <recollection> block and prepend to the prompt before forwarding to Ollama

Example output:

<recollection>
gnommoweb: [type] repo → software-artifact
           [membership] Glitch-University → Agent0-infrastructure
           [tech] FastAPI → Python
glitch.university: [type] platform → web-service
                   [membership] Agent0 → Glitch-Hunter-project
</recollection>

Only edges above the confidence threshold and within the recency window are included. The agent does not search for these — they appear spontaneously.

Read and write thresholds are separate and independently tunable. Reading (DB lookup) is cheap; writing (cloud LLM call) is expensive. The write threshold should be meaningfully higher than the read threshold.

Conflict Resolution — The Nightly Therapy Model

The IN table is strict by construction: it cannot hold contradictions. Any collision is immediately routed to the resolution queue (a separate table), where it waits for the nightly resolution job. Until then the world model stands unchanged; the old fact continues to be served in recollections, marked with a ? to signal pending dissonance.

On collision (immediate, synchronous)

Classify the collision type by reading is_isa on both rows: ISA+ISA, ISPART+ISPART, or misclassification
Insert the rejected fact into the resolution queue with full context: existing edge, incoming edge, dimension, collision type, timestamp
Return normally — the proxy response is not blocked

During the day (recollection engine)

Concepts with entries in the resolution queue are rendered with a ? marker:

<recollection>
gnommoweb: [type] container
           [type?] repo — pending resolution
</recollection>

The agent sees that a fact is contested. The world model is not wrong — it is incomplete. The ? marker disappears after the nightly job resolves or dismisses the conflict.

Nightly resolution job

Runs as a background thread inside the Festinger proxy process on the schedule set in the config table (resolution_schedule). Can also be triggered manually via POST /resolve/run — a corresponding button is exposed in the Festinger admin UI. Uses the model configured in resolve_model_id.

For each item in the queue, calls the configured LLM with both facts and the collision type, receives a structured decision, and applies it:

ISA+ISA collision (dimension too coarse):

LLM outcome A — decompose: suggest two new dimension names. System creates new SOAS entries and root nodes, inserts both facts into their respective new dimensions, marks queue item resolved.
LLM outcome B — dismiss: the incoming fact is noise or wrong. Queue item marked dismissed. World model stands.

ISPART+ISPART collision (factual contradiction):

LLM outcome A — update: the incoming fact is more current. System removes old IN edge, inserts new one, marks queue item resolved.
LLM outcome B — dismiss: the existing fact is still correct. Queue item marked dismissed.

Misclassification (ISA+ISPART in same dimension):

LLM suggests the correct dimension for the incoming fact. System inserts it in the corrected dimension (no collision), marks queue item resolved.

Resolution queue schema

Column	Type	Notes
id	INT PK	auto-increment
concept_id	INT FK	references SOAS
existing_parent_id	INT FK	the parent currently in the IN table
incoming_parent_id	INT FK	the rejected parent
dimension_id	INT FK	the dimension where the collision occurred
collision_type	ENUM	`isa_isa`, `ispart_ispart`, `misclassification`
status	ENUM	`pending`, `resolved`, `dismissed`
resolution	TEXT	JSON record of what the nightly job decided and did
created_at	TIMESTAMP	when the collision occurred
resolved_at	TIMESTAMP	when the nightly job processed it

Properties of this model

IN table is always consistent — agents never receive contradictory recollections from confirmed facts
Resolution is deliberate, not reactive — the nightly job processes dissonance calmly, with full context, not under prompt-response time pressure
Dismissal is a first-class outcome — not every collision is a real problem; the LLM can decide the world model is correct and the incoming fact was noise
The ? marker is the fuzziness adjunct — it surfaces uncertainty to agents without compromising the graph's integrity
/conflicts endpoint — exposes the full queue (pending and recently resolved) for human inspection and override

Graph Properties

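Acyclic: the unique index on (concept_id, dimension_id) enforces a single parent per concept per dimension, so each concept has exactly one upward chain in a dimension. The index alone does not rule out a loop such as a IN b, b IN a; as long as inserts never place a concept under one of its own descendants, each dimension is a forest of trees rooted at its self-referencing root node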
Shared: one graph for all agents — recollections represent shared facts about the project, not per-agent beliefs. Agents already have per-agent memory in Agent Zero's own FAISS layer
Contradiction-resistant: the index makes it structurally impossible to store conflicting facts in the same dimension. Contradictions surface as insert failures, not silent overwrites
Self-organising: dimension taxonomy starts with a small seed list and decomposes on demand. No human needs to pre-define the full ontology
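
A single-parent index bounds each concept to one parent per dimension, but a parent chain can still loop back on itself (a IN b, then b IN a). A cheap ancestor walk before insert would close that gap. The sketch below is an illustration only: `parent_of` is a hypothetical per-dimension map from concept id to parent id, and root rows (id = parent_id = dim_id) are assumed to be seeded separately rather than passed through this check.

```python
def would_create_cycle(parent_of: dict[int, int], concept_id: int, new_parent_id: int) -> bool:
    """True if placing concept_id under new_parent_id closes a loop.

    parent_of holds the edges of ONE dimension: concept id -> parent id.
    """
    seen: set[int] = set()
    node = new_parent_id
    while node not in seen:
        if node == concept_id:
            return True  # the proposed parent is a descendant of the concept
        seen.add(node)
        parent = parent_of.get(node)
        if parent is None or parent == node:
            return False  # reached an unplaced concept or the self-referencing root
        node = parent
    return False  # walked into an older loop that does not involve concept_id
```

Because each concept has at most one parent per dimension, the walk follows a single chain and costs no more than the depth of the tree.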

Saliency Decay

Two decay mechanisms, operating independently:

SOAS recency (last_seen timestamp): concepts not seen for a long time are deprioritised for recollection injection but not deleted. If gnommoweb reappears in a prompt, last_seen updates and its recollections resurface immediately
IN edge recency (last_confirmed timestamp): edges not corroborated by recent prompts are given lower weight in recollection injection, making the recollection block favour currently-relevant facts

Decay does not delete knowledge — it adjusts injection priority. The graph remains intact.

Dimension Taxonomy — Seed List

A small, orthogonal set of initial dimensions. Each dimension answers a specific question about a concept. New dimensions emerge through decomposition; this list is the starting skeleton.

Dimension	Question	is_isa
`type`	What kind of thing is this?	true
`membership`	What system or project does this belong to?	false
`runs-on`	What infrastructure hosts or executes this?	false
`tech`	What technology stack is it built with?	false
`owned-by`	Who is responsible for this?	false
`geography`	Where is this spatially or organisationally located?	false

Root nodes for each dimension are seeded at bootstrap. The type dimension is expected to be the first to decompose as domain-specific concepts accumulate.

Components

Component	Description	State
Proxy core	FastAPI Ollama-compatible HTTP proxy	✅ built
Loop detector	Session-scoped repeat detection + mitigations	✅ built
Config system	Hot-reloading YAML config	✅ built
SOAS store	Concept vocabulary + saliency DB table	✅ built
IN table store	Acyclic concept graph with correct indexes	✅ built
Dictionary bootstrap	Pre-seed SOAS with common English at saliency 0	✅ built
Dimension bootstrap	Seed root nodes for initial dimension taxonomy	✅ built
Saliency engine	Tokenise prompt, score tokens, update SOAS counts	✅ built
Recollection engine	Query IN table, traverse chains, format + inject block	✅ built
Memory writer	Write-threshold trigger → async cloud LLM → NL→IN parse → insert	✅ built
Conflict resolver	On collision: classify type, insert into resolution queue immediately	✅ built
Resolution queue	Pending/resolved/dismissed conflicts with full context	✅ built
Nightly resolution job	Drain queue via cloud LLM; apply decompose/update/dismiss decisions	✅ built
`/conflicts` endpoint	Expose queue (pending + recent) for human inspection and override	✅ built
Persistence	Postgres from day one; English dictionary pre-loaded into SOAS at init	✅ built

Task Breakdown

Phase 1 — Foundation (complete)

T01 Proxy core: FastAPI Ollama-compatible server forwarding /api/chat and /api/generate
T02 Loop detector: session-scoped exact-match repetition detection
T03 Mitigations: temperature boost, forbidden action injection, history truncation, circuit breaker
T04 Hot-reload YAML config
T05 Docker container + docker-compose service entry

Phase 2 — Persistence Layer

T06 Postgres service: add postgres container to docker-compose; connection config in config.yaml
T07 models table: id SERIAL PK, provider VARCHAR, model_name VARCHAR, api_key VARCHAR, created_at TIMESTAMPTZ
T08 config table: key VARCHAR PK, value TEXT, updated_at TIMESTAMPTZ; seed with default values for all config keys
T09 SOAS table: id SERIAL PK, token VARCHAR UNIQUE, encounter_count INT default 0, last_seen TIMESTAMPTZ, saliency FLOAT default 0, novelty FLOAT default 0. All tokens lowercase. Unique index on token.
T10 URD table: id INT FK → soas, parent_id INT FK → soas, dim_id INT FK → soas, is_isa BOOLEAN, confidence FLOAT, last_confirmed TIMESTAMPTZ, source VARCHAR. PK (id, parent_id, dim_id). Unique index (id, dim_id).
T11 Resolution queue table: id SERIAL PK, concept_id INT FK → soas, existing_parent_id INT FK → soas, incoming_parent_id INT FK → soas, dim_id INT FK → soas, collision_type VARCHAR, status VARCHAR default 'pending', resolution JSONB, created_at TIMESTAMPTZ, resolved_at TIMESTAMPTZ
T12 English dictionary bootstrap: bulk-load word list into SOAS with saliency=0, novelty=0, encounter_count=0 on container init. Skip existing tokens.
T13 Dimension bootstrap: insert SOAS entries and self-referential URD root nodes (id = parent_id = dim_id) for the 6 seed dimensions

Phase 3 — Saliency Engine and Prompt Parsing

T14 Tokeniser: split on whitespace/punctuation; apply compound token rule (consecutive capitalised tokens → single underscore-joined lowercase token); no minimum length; strip punctuation; lowercase all
T15 Relationship cue scanner: regex/pattern scan for ISA and ISPART cue patterns; extract (subject, parent, dimension_modifier, is_isa) triples; handle of {Z} dimension modifier
T16 SOAS lookup + update: increment encounter_count, update last_seen, recalculate saliency (log scale) for all extracted tokens; read thresholds from config table
T17 Threshold evaluation: read saliency_read_threshold and saliency_write_threshold from config; cue-extracted triples bypass write-threshold entirely

Phase 4 — Recollection Engine (Read Path)

T16 URD query: for each above-read-threshold token, execute the canonical recollection query; filter by confidence floor and last_confirmed recency window
T17 Recollection formatter — hit path: enumerate query rows; group by concept; render each edge as [dim_token] parent_token; append ? for edges with a pending resolution queue entry
T18 Recollection formatter — zero-hit path: for salient concepts with no URD rows, emit the ? concept: no recollection prompt block including the gutask iknowthat usage hint
T19 Prompt injection: prepend recollection block before forwarding to Ollama
T20 Recollection config: max concepts per block, confidence floor, recency window, injection position
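
The two formatter paths (T17 hit path, T18 zero-hit path) could look roughly like this. The function name `format_recollection` and its arguments are hypothetical; the output strings follow the recollection examples given in the Test Cases section. The `<recollection>` wrapper and the trailing "conflict pending" note shown in Test B are left out to keep the sketch short.

```python
def format_recollection(concept, edges, pending_dims=frozenset()):
    """Render one concept's recollection line.

    edges: list of (dim_token, parent_token) pairs for the concept.
    pending_dims: dimension tokens with a pending resolution queue entry.
    """
    if not edges:
        # Zero-hit path: salient concept with no URD rows.
        return (
            f"? {concept}: no recollection. If not a typo, store it before proceeding:\n"
            f"  gutask iknowthat '{concept} -isa <parent> in context of <dimension>'\n"
            f"  gutask iknowthat '{concept} -ispart <system> in context of <dimension>'"
        )
    parts = []
    for dim, parent in edges:
        # A '?' inside the brackets marks an edge whose dimension has a pending conflict.
        marker = f"[{dim}?]" if dim in pending_dims else f"[{dim}]"
        parts.append(f"{marker} {parent}")
    return f"{concept}: " + "  ".join(parts)


print(format_recollection("gnommoweb", [("type", "repo")], {"type"}))
print(format_recollection("gnommoweb", []))
```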

Phase 5 — Memory Writer (Write Path)

T21 POST /iknowthat endpoint: accept {fact: string}, parse -isa/-ispart flags and in context of clause, run tokeniser, upsert SOAS, insert into URD with confidence=1.0, source=gutask; route collisions to resolution queue with priority flag
T22 gutask iknowthat command (in /Users/jenstandstad/Projects/gutasktool): parse fact string, POST to Festinger /iknowthat, surface result to agent
T23 Write queue: async background queue for concepts crossing the write threshold; cue-extracted triples enter directly regardless of threshold
T24 LLM client: support claude and openai providers; load provider/model/key from models table via write_model_id config key; structured prompt requesting (concept, parent, dimension, is_isa, confidence) triples as JSON
T25 NL→IN parser: validate triples against SOAS and known dimensions; create new SOAS entries for unknown tokens; apply compound token rule
T26 URD insert pipeline: check urd_by_concept_dim in-memory first; on miss attempt Postgres insert; on hit or UniqueViolation route to conflict resolver; set source field per write path
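
A possible shape for the fact-string parsing in T21 is sketched below. The names `Triple`, `parse_fact`, and `_normalise` are hypothetical, and two behaviours are assumptions rather than specified facts: the `in context of` clause is treated as optional, and multi-word operands are lowercased and underscore-joined as a stand-in for running the real tokeniser.

```python
import re
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    parent: str
    dimension: Optional[str]  # None when no 'in context of' clause was given (assumption)
    is_isa: bool


_FACT = re.compile(
    r"^\s*(?P<subject>.+?)\s+-(?P<flag>isa|ispart)\s+(?P<parent>.+?)"
    r"(?:\s+in\s+context\s+of\s+(?P<dimension>.+?))?\s*$",
    re.IGNORECASE,
)


def _normalise(text: str) -> str:
    # Stand-in for the tokeniser: lowercase and join words with underscores.
    return "_".join(text.lower().split())


def parse_fact(fact: str) -> Triple:
    """Parse 'X -isa|-ispart Y [in context of Z]' into a Triple."""
    match = _FACT.match(fact)
    if match is None:
        raise ValueError(f"expected 'X -isa|-ispart Y [in context of Z]', got {fact!r}")
    dimension = match.group("dimension")
    return Triple(
        subject=_normalise(match.group("subject")),
        parent=_normalise(match.group("parent")),
        dimension=_normalise(dimension) if dimension else None,
        is_isa=match.group("flag").lower() == "isa",
    )


print(parse_fact("gnommoweb -isa repo in context of type"))
```

Rejecting malformed strings with an error keeps the endpoint from writing half-parsed facts at `confidence=1.0`.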

Phase 6 — Conflict Resolution

T25 Collision handler: on unique constraint violation, classify type via is_isa flags, insert immediately into resolution queue
T26 Recollection engine update: check resolution queue for pending items per concept; render pending edges with [dim?] marker
T27 Nightly resolution job: background thread, schedule from config table (resolution_schedule); for each pending queue item, call LLM configured in resolve_model_id with both facts and collision type, receive JSON decision (decompose / update / dismiss)
T28 Resolution applicator — decompose: create new dimension SOAS entries and root nodes; insert both facts in respective new dimensions; mark queue item resolved
T29 Resolution applicator — update: remove old IN edge, insert new fact, mark queue item resolved
T30 Resolution applicator — dismiss: mark queue item dismissed; world model unchanged; the [dim?] marker disappears from recollections
T31 /conflicts endpoint: list pending and recently resolved/dismissed items with full context; support human override (force-dismiss, force-resolve)
T32 POST /resolve/run endpoint: manually trigger the nightly resolution job outside of its schedule
T33 Admin UI: minimal HTML page served by Festinger at /admin; shows pending conflicts count, last resolution run timestamp, and a "Run resolution now" button wired to POST /resolve/run
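
The collision handler (T25) and the in-memory check that precedes it can be illustrated with the sketch below. Only the `"isa_isa"` collision type string appears in this specification; `"ispart_ispart"` and `"isa_ispart"` are assumed names, as are the `Edge` class, the return values, and the rule that an identical incoming edge counts as a confirmation instead of a collision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


def classify_collision(existing_is_isa: bool, incoming_is_isa: bool) -> str:
    """Derive the collision type from the two is_isa flags."""
    if existing_is_isa and incoming_is_isa:
        return "isa_isa"
    if not existing_is_isa and not incoming_is_isa:
        return "ispart_ispart"  # assumed label
    return "isa_ispart"  # assumed label


def insert_edge(edge, urd_by_concept_dim, pending_conflicts, queue) -> str:
    """Insert an edge, or queue a conflict if (concept, dim) is already taken."""
    key = (edge.concept_id, edge.dim_id)
    existing = urd_by_concept_dim.get(key)
    if existing is None:
        urd_by_concept_dim[key] = edge
        return "inserted"
    if existing == edge:
        # Assumption: restating a known fact is a confirmation, not a conflict.
        return "confirmed"
    queue.append({
        "concept_id": edge.concept_id,
        "existing_parent_id": existing.parent_id,
        "incoming_parent_id": edge.parent_id,
        "dim_id": edge.dim_id,
        "collision_type": classify_collision(existing.is_isa, edge.is_isa),
        "status": "pending",
    })
    pending_conflicts.add(edge.concept_id)
    return "queued"  # the existing edge is left untouched


urd = {(101, 1): Edge(101, 201, 1, True)}
pending, queue = set(), []
print(insert_edge(Edge(101, 202, 1, True), urd, pending, queue), queue)
```

In the real pipeline the Postgres unique index remains the safety net for races; this sketch only covers the dict-level check.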

Phase 7 — Hardening

T32 Latency guard: tokenisation + saliency lookup + recollection query must not add >50ms to prompt round-trip; use connection pooling
T33 Write path fully async: all cloud LLM calls, IN inserts, and queue operations run in background workers; proxy response never waits
T34 Integration test: plain prompt → compound token extraction → saliency update → recollection injection → Ollama round trip
T35 Integration test: cue pattern in prompt ("gnommoweb is a repo of Glitch University") → extracted triple bypasses threshold → IN insert → recollection on next prompt
T36 Integration test: write threshold → cloud LLM write → IN insert → recollection on next prompt
T37 Integration test: ISA+ISA collision → immediate queue insert → [type?] marker in recollection → nightly job → decompose → clean recollection across two dimensions
T38 Integration test: ISPART+ISPART collision → queue insert → nightly job → dismiss → world model unchanged, marker gone

Resolved Design Decisions

Question	Decision
Single relation or ISA+ISPART tables?	Single IN operator; `is_isa` boolean flag outside the index annotates edge type
ISA vs ISPART bleed	Resolved by dimension: `type` dimension = ISA; all other dimensions = ISPART
Shared vs per-agent graph	Shared — recollections are project-wide facts; per-agent memory remains in Agent Zero's FAISS layer
Saliency decay	Dual mechanism: SOAS `last_seen` for concept recency; IN `last_confirmed` for edge recency. Decay adjusts injection priority, never deletes knowledge
Recollection depth	Depth = 1, no chain traversal. Single flat query against URD for direct edges of each salient concept. Property-loop problem is dissolved by design — self-referential chains cannot arise without traversal.
Write path blocking	Fully async — cloud LLM calls queued in background, never block proxy response
Dimension taxonomy	Seed list of 6 orthogonal dimensions; decomposes on demand via nightly resolution job when ISA+ISA collision is queued
Collision semantics	ISA+ISA → dimension too coarse → decompose. ISPART+ISPART → factual contradiction → arbitrate. ISA+ISPART → misclassification → flag
Contradiction resistance	Structural — unique index on `(concept_id, dimension_id)` makes conflicting facts physically uninsertable in the same dimension
New dimensions	Emerge as SOAS tokens; no schema change needed; root nodes created at decomposition time
Conflict resolution timing	Immediate queue insert on collision; nightly job drains the queue via cloud LLM; outcomes are decompose / update / dismiss
Database	Postgres from day one — no SQLite POC. English dictionary bulk-loaded into SOAS at container init.
Tokenisation	All tokens lowercased, with no minimum length (see Token length below). Consecutive capitalised tokens merged into single underscore-joined token (`Glitch University` → `glitch_university`).
Relationship cue parsing	ISA and ISPART keyword patterns in intercepted text trigger direct triple extraction, bypassing saliency threshold. `of {Z}` modifier sets the dimension.
Cue-triggered writes	Explicit cue patterns are high-confidence signals; extracted triples go straight to the write queue with `source: inferred`, no threshold gate.
Zero-hit recollection	Salient concepts with no URD entries emit a `? concept: no recollection` prompt block instructing the agent to clarify or store the fact via `gutask iknowthat`.
Manual write path	`gutask iknowthat 'X -isa/-ispart Y in context of Z'` inserts directly into URD with `confidence=1.0, source=gutask`, bypassing saliency threshold and cloud LLM.
In-memory layer	SOAS and URD cached in Python dicts at startup. Reads are zero-network. IDs always originate from Postgres. Collision detection uses `dict[(concept_id, dim_id)]` mirroring the Postgres unique index. Postgres is the safety net for race conditions.
Token length	No minimum — all tokens indexed. Frequency and novelty determine whether they surface. Optimize later if needed.
Nightly job execution	Background thread inside proxy process. Triggered by cron schedule (config table) or manually via `POST /resolve/run` + admin UI button.
Recollection injection	Prepended to existing system message content. If no system message exists, a new one is inserted at position 0. System message position provides the strongest grounding anchor for instruction-tuned models.
LLM configuration	`models` table (provider, model_name, api_key). `config` table keys `write_model_id` and `resolve_model_id` select which model each purpose uses. Supports `claude` and `openai` providers.
Source values	`cloud_llm` (saliency write path), `inferred` (cue pattern extraction), `festinger` (nightly resolution job), `gutask` (gutask iknowthat command).
gutask iknowthat interface	gutasktool at `/Users/jenstandstad/Projects/gutasktool`. Command POSTs to Festinger's `/iknowthat` endpoint — no direct Postgres access from gutasktool.
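
The recollection injection decision above can be sketched against a chat-style message list of `{"role", "content"}` dicts, as used by Ollama's chat endpoint. The function name `inject_recollection` and the blank-line separator between the block and the existing system text are assumptions.

```python
def inject_recollection(messages, block):
    """Return a copy of messages with the recollection block in the system message."""
    if not block:
        return [dict(m) for m in messages]
    out = [dict(m) for m in messages]  # shallow copies; the caller's list is not mutated
    for message in out:
        if message.get("role") == "system":
            # Prepend to the first existing system message.
            message["content"] = f"{block}\n\n{message.get('content', '')}"
            return out
    # No system message present: insert a new one at position 0.
    out.insert(0, {"role": "system", "content": block})
    return out


chat = [{"role": "user", "content": "Please update gnommoweb"}]
print(inject_recollection(chat, "gnommoweb: [type] repo"))
```

Returning a copy keeps the intercepted request intact, which is convenient if the proxy also needs the unmodified prompt for loop detection or logging.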

Test Cases

The in-memory cache layer enables full unit testing without a live database. Tests pre-populate soas_by_token, soas_by_id, urd_by_concept, urd_by_concept_dim, and pending_conflicts directly, then exercise the logic under test and assert on the resulting state.
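
The setup snippets in the tests below use `SoasRow`, `UrdEdge`, and `QueueEntry` without defining them. A minimal set of definitions that makes those snippets runnable might look like this; field names follow the table definitions in T09 to T11, while the default values are assumptions for test convenience.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoasRow:
    id: int
    encounter_count: int = 0
    saliency: float = 0.0
    novelty: float = 0.0


@dataclass
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool
    confidence: float = 1.0  # assumed default
    source: str = "gutask"   # assumed default


@dataclass
class QueueEntry:
    id: int
    concept_id: int
    existing_parent_id: int
    incoming_parent_id: int
    dim_id: int
    collision_type: str
    status: str = "pending"
    resolution: Optional[dict] = None


# Pre-populating the caches directly, with no database involved.
soas_by_token = {"gnommoweb": SoasRow(id=101, saliency=1.2, novelty=1.0)}
soas_by_id = {101: "gnommoweb"}
existing_edge = UrdEdge(concept_id=101, parent_id=201, dim_id=1,
                        is_isa=True, confidence=0.9, source="cloud_llm")
urd_by_concept = {101: [existing_edge]}
urd_by_concept_dim = {(101, 1): existing_edge}
pending_conflicts: set[int] = set()
```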

Test A — Prompt includes a concept not in the cache

Scenario: An agent sends a prompt referencing gnommoweb. The concept exists nowhere in SOAS or URD.

Setup:

soas_by_token = {}   # empty — concept is completely unknown
urd_by_concept = {}
urd_by_concept_dim = {}
pending_conflicts = set()

Input prompt: "Please update gnommoweb to use FastAPI instead"

Expected behaviour:

Tokeniser extracts: gnommoweb (9 chars), fastapi (7 chars), please (6 chars, common English → saliency 0), update (6 chars, common → saliency 0), instead (7 chars, common → saliency 0). With no minimum token length, the short tokens to and use are indexed too (common → saliency 0).
gnommoweb and fastapi not in soas_by_token → both are new tokens. In a test with Postgres mocked, the mock returns id=101 for gnommoweb and id=102 for fastapi. Both added to SOAS dicts.
Saliency for both: log(1) — first encounter, below read threshold
URD lookup: skipped (below threshold) — no recollection block emitted for these concepts

Edge case variant — above threshold: pre-seed soas_by_token["gnommoweb"] with encounter_count=50, saliency=0.9 (above read threshold) but keep urd_by_concept empty.

Expected behaviour (variant):

Saliency lookup: above read threshold ✓
URD lookup: urd_by_concept.get(101, []) → empty list
Zero-hit path triggered
Recollection block contains:

? gnommoweb: no recollection. If not a typo, store it before proceeding:
  gutask iknowthat 'gnommoweb -isa <parent> in context of <dimension>'
  gutask iknowthat 'gnommoweb -ispart <system> in context of <dimension>'

Assertions:

pending_conflicts unchanged (no collision occurred)
Resolution queue empty
Recollection block contains ? gnommoweb
Prompt forwarded to Ollama with recollection block prepended

Test B — Prompt includes "A ISA B" conflicting with existing world model

Scenario: The world model already holds gnommoweb ISA repo in the type dimension. A new prompt contains the explicit cue "gnommoweb is a container", creating an ISA+ISA collision.

Setup:

# SOAS
soas_by_token = {
    "gnommoweb":  SoasRow(id=101, saliency=1.2, novelty=1.0),
    "repo":       SoasRow(id=201, saliency=0.8, novelty=0.5),
    "container":  SoasRow(id=202, saliency=0.6, novelty=0.4),
    "type":       SoasRow(id=1,   saliency=0.0, novelty=0.0),  # seed dimension
}
soas_by_id = {101: "gnommoweb", 201: "repo", 202: "container", 1: "type"}

# URD — gnommoweb ISA repo IN type (existing confirmed fact)
existing_edge = UrdEdge(concept_id=101, parent_id=201, dim_id=1,
                        is_isa=True, confidence=0.9, source="cloud_llm")
urd_by_concept     = {101: [existing_edge]}
urd_by_concept_dim = {(101, 1): existing_edge}
pending_conflicts  = set()

Input prompt: "gnommoweb is a container deployed on Docker"

Expected behaviour:

Cue scanner matches "is a" pattern: extracts triple (gnommoweb, container, type, is_isa=True)
Collision check: (101, 1) found in urd_by_concept_dim → collision
Existing edge is_isa=True, incoming is_isa=True → ISA+ISA collision type
Resolution queue insert (mocked Postgres): {concept_id=101, existing_parent_id=201, incoming_parent_id=202, dim_id=1, collision_type="isa_isa", status="pending"}
pending_conflicts.add(101)
No URD modification — world model unchanged

Recollection block for this prompt:

<recollection>
gnommoweb: [type?] repo — conflict pending
</recollection>

Assertions:

urd_by_concept_dim[(101, 1)] still points to the original repo edge (unchanged)
pending_conflicts == {101}
Resolution queue has exactly one entry with collision_type="isa_isa" and status="pending"
No Postgres URD insert attempted
Recollection renders [type?] marker for gnommoweb
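
The cue match in step 1 of Test B could be produced by a scanner along these lines. This is only a sketch of the single "is a" / "is an" ISA cue with the `of {Z}` modifier. The regular expression, the function name `scan_isa_cues`, and the fallback to the `type` dimension when no modifier is present are assumptions, and the full cue pattern set (including ISPART cues) is not reproduced here.

```python
import re

# Assumed pattern: "<subject> is a|an <parent>" with an optional "of <Z>" modifier,
# where Z is either a run of capitalised words or a single word.
_ISA_CUE = re.compile(
    r"\b(?P<subject>\w+)\s+is\s+an?\s+(?P<parent>\w+)"
    r"(?:\s+of\s+(?P<dim>[A-Z]\w*(?:\s+[A-Z]\w*)*|\w+))?"
)


def _normalise(text: str) -> str:
    # Lowercase and underscore-join, mirroring the compound token rule.
    return "_".join(text.lower().split())


def scan_isa_cues(text: str, default_dim: str = "type"):
    """Return (subject, parent, dimension, is_isa) tuples for each ISA cue found."""
    triples = []
    for match in _ISA_CUE.finditer(text):
        dim = match.group("dim")
        triples.append((
            _normalise(match.group("subject")),
            _normalise(match.group("parent")),
            _normalise(dim) if dim else default_dim,
            True,
        ))
    return triples


print(scan_isa_cues("gnommoweb is a container deployed on Docker"))
```

A pattern this loose also fires on ordinary sentences such as "This is a test", so a production scanner would likely need a stop list or a SOAS lookup before trusting the subject.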

Test C — Nightly job: queue processing and dimension decomposition

Scenario: The resolution queue contains the ISA+ISA collision from Test B. The nightly job runs, calls the cloud LLM (mocked), receives a decomposition decision, and updates the world model.

Setup: state as at end of Test B, plus:

resolution_queue = [
    QueueEntry(id=1, concept_id=101, existing_parent_id=201, incoming_parent_id=202,
               dim_id=1, collision_type="isa_isa", status="pending")
]

Mocked cloud LLM response:

{
  "decision": "decompose",
  "existing_dimension": "artifact-type",
  "new_dimension": "deployment-type",
  "reasoning": "repo describes what gnommoweb is as a software artifact; container describes how it is deployed"
}

Expected behaviour:

Nightly job fetches pending queue entries from Postgres
For entry id=1, calls cloud LLM → receives decompose decision
New dimensions: insert artifact-type into Postgres SOAS → returns id=401; insert deployment-type → returns id=402. Add both to SOAS dicts.
Root nodes: insert (401,401,401) and (402,402,402) into Postgres URD (self-referential dimension roots)
Migrate existing edge: delete (101, 201, 1) from URD, insert (101, 201, 401) — gnommoweb ISA repo IN artifact-type
Insert new edge: insert (101, 202, 402) — gnommoweb ISA container IN deployment-type
Update in-memory cache:
- soas_by_token["artifact-type"] = SoasRow(id=401, ...)
- soas_by_token["deployment-type"] = SoasRow(id=402, ...)
- Remove urd_by_concept_dim[(101, 1)], add [(101, 401)] and [(101, 402)]
- Update urd_by_concept[101] to two new edges
- pending_conflicts.discard(101)
Mark queue entry resolved with resolution JSON
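No separate /reload signal is needed when the nightly job runs as a background thread inside the proxy process, since it updates the cache directly; an out-of-process job would have to signal the proxy to reload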

Final recollection block for gnommoweb:

<recollection>
gnommoweb: [artifact-type] repo  [deployment-type] container
</recollection>

Assertions:

urd_by_concept_dim contains (101, 401) and (101, 402), not (101, 1)
urd_by_concept[101] has exactly two edges
pending_conflicts does not contain 101
soas_by_token contains artifact-type and deployment-type with ids from Postgres
Resolution queue entry has status="resolved" and non-null resolution JSON
No [type?] marker in subsequent recollection for gnommoweb
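
The cache-side half of the decompose step in Test C can be exercised with a sketch like the one below. The function `apply_decompose`, the `allocate_id` callback standing in for the mocked Postgres SOAS insert, and the simplification of storing bare ids in `soas_by_token` are all assumptions. Root nodes are returned as `(id, parent_id, dim_id)` tuples for the caller to persist.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UrdEdge:
    concept_id: int
    parent_id: int
    dim_id: int
    is_isa: bool


def apply_decompose(entry, decision, allocate_id, soas_by_token,
                    urd_by_concept, urd_by_concept_dim, pending_conflicts):
    """Apply a 'decompose' decision to the in-memory caches; return new root nodes."""
    concept = entry["concept_id"]
    dim_ids = []
    for key in ("existing_dimension", "new_dimension"):
        token = decision[key]
        dim_id = allocate_id(token)    # stands in for the Postgres SOAS insert
        soas_by_token[token] = dim_id  # simplification: id only, not a full SOAS row
        dim_ids.append(dim_id)
    existing_dim, new_dim = dim_ids
    roots = [(d, d, d) for d in dim_ids]  # self-referential dimension roots

    # Move the existing edge into its new dimension and add the incoming fact.
    old_edge = urd_by_concept_dim.pop((concept, entry["dim_id"]))
    migrated = replace(old_edge, dim_id=existing_dim)
    added = UrdEdge(concept, entry["incoming_parent_id"], new_dim, is_isa=True)
    urd_by_concept_dim[(concept, existing_dim)] = migrated
    urd_by_concept_dim[(concept, new_dim)] = added
    kept = [e for e in urd_by_concept.get(concept, []) if e != old_edge]
    urd_by_concept[concept] = kept + [migrated, added]

    pending_conflicts.discard(concept)
    entry["status"] = "resolved"
    entry["resolution"] = decision
    return roots


old = UrdEdge(101, 201, 1, True)
soas, by_concept, by_dim, pending = {}, {101: [old]}, {(101, 1): old}, {101}
entry = {"concept_id": 101, "existing_parent_id": 201,
         "incoming_parent_id": 202, "dim_id": 1, "status": "pending"}
decision = {"decision": "decompose", "existing_dimension": "artifact-type",
            "new_dimension": "deployment-type"}
ids = {"artifact-type": 401, "deployment-type": 402}
roots = apply_decompose(entry, decision, ids.__getitem__, soas, by_concept, by_dim, pending)
print(sorted(by_dim), roots)
```

Discarding the concept from `pending_conflicts` follows the step listed above; if a concept could hold pending conflicts in several dimensions at once, tracking pending items by `(concept_id, dim_id)` would be safer.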
