ADR-0051: Agent Memory — Persistent Shared Memory for AI Agents

Status: Accepted
Date: 2026-04-18
Depends on: ADR-0009 (Provenance Always On), ADR-0011 (DID Identity), ADR-0018 (Agent Runtime), ADR-0023 (Federation), ADR-0037 (Hypothesis-Driven Loops)
Target milestone: M10

Context

AI agents today suffer from persistent amnesia. Every session starts from scratch — the agent re-discovers project structure, re-learns user preferences, and re-derives conclusions it already reached yesterday. Current approaches to agent memory are inadequate:

Approach	Problem
Markdown files (CLAUDE.md, MEMORY.md)	Flat text, no relational queries, no provenance, local to one agent
Vector databases	Embedding similarity only — no structured reasoning, no temporal queries, no trust chain
Conversation history	Dies with the session, can't be shared, fills context windows
System prompts	Static, no learning, no cross-agent sharing

The root issue: memory is treated as a file, not as a service. What agents need is a persistent, structured, queryable, shareable knowledge store with provenance — a brain.

Trails already has every primitive required:

- Knowledge graph (Oxigraph) for structured relational storage
- PROV-O (ADR-0009) for tracking who learned what and when
- Temporal queries for "what did we know at time T?"
- DIDs (ADR-0011) for agent identity
- Federation (ADR-0023) for cross-instance knowledge sharing
- Cedar policies (ADR-0006) for access control on knowledge
- MCP surface for tool exposure to any MCP-capable client
- Cost envelopes (ADR-0012) for tracking the cost of learning

What's missing is an opinionated memory protocol — a set of capabilities and an ontology that turns the raw graph into a usable agent memory.

Use cases driving this ADR

Cross-agent continuity: Claude Code learns a fact in the morning; Copilot uses it in the afternoon
Multi-agent collaboration: Agent A and Agent B working on the same codebase share a single knowledge store instead of diverging
Federated knowledge: A manufacturer's agent and a notified body's agent share regulatory knowledge across organizational boundaries (submission-access use case)
Auditable reasoning: Every fact has a provenance chain — which agent learned it, from what source, with what confidence
Temporal reasoning: "What requirements changed since the NB last reviewed?" is a graph query, not a log grep

Decision

1. Brain Ontology

Introduce a brain: namespace (https://trails.dev/ns/brain/v1#) with the following classes:

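@prefix brain: <https://trails.dev/ns/brain/v1#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix prov:  <http://www.w3.org/ns/prov#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .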

# ── Core classes ─────────────────────────────────────────

brain:Fact a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "A learned piece of knowledge with provenance and confidence." .

brain:Correction a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "A fact that supersedes a previous fact (retraction or update)." .

brain:Link a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "An explicit relationship between two facts." .

brain:Topic a owl:Class ;
    rdfs:comment "A subject area for organizing facts." .

brain:Source a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "Origin of a fact: file, URL, conversation, observation." .

# ── Properties ───────────────────────────────────────────

brain:content a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "The natural-language content of the fact." .

brain:confidence a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:float ;
    rdfs:comment "Confidence score [0.0, 1.0]." .

brain:topic a owl:ObjectProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range brain:Topic .

brain:source a owl:ObjectProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range brain:Source .

brain:supersedes a owl:ObjectProperty ;
    rdfs:domain brain:Correction ;
    rdfs:range brain:Fact ;
    rdfs:comment "The prior fact this correction replaces." .

brain:relatedTo a owl:ObjectProperty ;
    rdfs:domain brain:Link ;
    rdfs:range brain:Fact .

brain:relationship a owl:DatatypeProperty ;
    rdfs:domain brain:Link ;
    rdfs:range xsd:string ;
    rdfs:comment "Nature of the link: causes, requires, contradicts, supports, etc." .

brain:scope a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "Visibility scope: local (this instance), shared (federated), private (agent-only)." .

brain:staleness a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "Decay policy: stable, hourly, daily, weekly, volatile." .
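To illustrate how the classes and properties above combine on a single node, the following sketch renders one brain:Fact as Turtle from Python. The helper name `fact_to_turtle`, the example IRIs, and the `did:example:` identifier are assumptions made for illustration; they are not part of the Trails API.

```python
from datetime import datetime, timezone

BRAIN_NS = "https://trails.dev/ns/brain/v1#"


def _lit(value: str) -> str:
    """Escape a Python string as a double-quoted Turtle string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def fact_to_turtle(iri, content, confidence, topic_iri, agent_did,
                   generated_at, scope="local", staleness="weekly"):
    """Render a single brain:Fact with its PROV-O attribution as Turtle.

    Hypothetical helper: it only shows which triples a fact would carry.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    lines = [
        f"@prefix brain: <{BRAIN_NS}> .",
        "@prefix prov:  <http://www.w3.org/ns/prov#> .",
        "@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .",
        "",
        f"<{iri}> a brain:Fact ;",
        f"    brain:content {_lit(content)} ;",
        f'    brain:confidence "{confidence}"^^xsd:float ;',
        f"    brain:topic <{topic_iri}> ;",
        f"    brain:scope {_lit(scope)} ;",
        f"    brain:staleness {_lit(staleness)} ;",
        f"    prov:wasAttributedTo <{agent_did}> ;",
        f'    prov:generatedAtTime "{generated_at.isoformat()}"^^xsd:dateTime .',
    ]
    return "\n".join(lines)


example = fact_to_turtle(
    iri="https://trails.dev/brain/my-project/fact/1",
    content='Run "make test" before pushing',
    confidence=0.9,
    topic_iri="https://trails.dev/brain/my-project/topic/build",
    agent_did="did:example:agent-a",
    generated_at=datetime(2026, 4, 18, 9, 0, tzinfo=timezone.utc),
)
```

The confidence range check mirrors the [0.0, 1.0] constraint stated in the brain:confidence comment; a real implementation would more likely enforce it with SHACL.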

2. Fact Lifecycle

                    memory.learn
                        │
                        ▼
    ┌─────────────────────────────────┐
    │         brain:Fact              │
    │  content, confidence, topic,   │
    │  source, scope, staleness      │
    │  prov:wasAttributedTo (DID)    │
    │  prov:generatedAtTime          │
    ├─────────────────────────────────┤
    │                                 │
    │  memory.correct ──► Correction   │
    │  (supersedes original,          │
    │   records reason + new value)   │
    │                                 │
    │  memory.forget ──► Retraction    │
    │  (marks as retracted, keeps     │
    │   audit trail, never hard-      │
    │   deletes)                      │
    │                                 │
    │  memory.link ──► Link            │
    │  (typed relationship to         │
    │   another fact or external IRI) │
    │                                 │
    └─────────────────────────────────┘

Key principle: facts are never hard-deleted. A memory.forget marks the fact as retracted with a prov:wasInvalidatedBy trail. This ensures the audit chain is complete — you can always answer "what did the agent believe at time T, and why did it stop believing it?"
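The soft-delete principle can be sketched with a small in-memory model. This is an assumption-laden illustration, not the Trails implementation: `MemoryStore`, its method signatures, and the `urn:fact:` IRIs are hypothetical, and a correction is modeled here as a new fact carrying a `supersedes` pointer instead of a separate brain:Correction node.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Fact:
    iri: str
    content: str
    confidence: float
    agent_did: str
    generated_at: datetime
    invalidated_at: Optional[datetime] = None   # stands in for prov:wasInvalidatedBy
    invalidation_reason: Optional[str] = None
    supersedes: Optional[str] = None            # stands in for brain:supersedes


class MemoryStore:
    """Hypothetical store: facts are only ever marked, never removed."""

    def __init__(self):
        self._facts = {}
        self._counter = 0

    def learn(self, content, confidence, agent_did, at, supersedes=None):
        self._counter += 1
        fact = Fact(f"urn:fact:{self._counter}", content, confidence,
                    agent_did, at, supersedes=supersedes)
        self._facts[fact.iri] = fact
        return fact

    def forget(self, fact_iri, reason, at):
        # Soft delete: record when and why, keep the node for the audit trail.
        fact = self._facts[fact_iri]
        fact.invalidated_at = at
        fact.invalidation_reason = reason
        return fact

    def correct(self, fact_iri, new_content, reason, new_confidence, agent_did, at):
        # Retract the old fact, then store the replacement pointing back at it.
        self.forget(fact_iri, reason, at)
        return self.learn(new_content, new_confidence, agent_did, at,
                          supersedes=fact_iri)

    def believed_at(self, t):
        """Facts that existed at time t and had not yet been invalidated."""
        return [f for f in self._facts.values()
                if f.generated_at <= t
                and (f.invalidated_at is None or f.invalidated_at > t)]

    def history(self):
        return list(self._facts.values())


t0 = datetime(2026, 4, 18, 9, 0, tzinfo=timezone.utc)
store = MemoryStore()
old = store.learn("Tests run with pytest", 0.8, "did:example:agent-a", t0)
new = store.correct(old.iri, "Tests run with tox", "wrong runner", 0.9,
                    "did:example:agent-a", t0 + timedelta(hours=2))
```

Because the original fact keeps its invalidation timestamp and reason, `believed_at` can answer both halves of the audit question: what was believed at a given time, and why a belief ended.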

3. Capability Surface (10 MCP Tools)

Write capabilities

Capability	Description	Key parameters
`memory.learn`	Store a new fact with provenance	`content`, `confidence`, `topic`, `source`, `scope`, `staleness`
`memory.correct`	Supersede a previous fact	`fact_iri`, `new_content`, `reason`, `new_confidence`
`memory.forget`	Retract a fact (soft delete)	`fact_iri`, `reason`
`memory.link`	Create typed relationship between facts	`from_iri`, `to_iri`, `relationship`

Read capabilities

Capability	Description	Key parameters
`memory.recall`	Context-aware retrieval	`context`, `topic`, `min_confidence`, `max_age`, `limit`
`memory.query`	Structured SPARQL query	`sparql`
`memory.explain`	Provenance chain for a fact	`fact_iri`, `depth`
`memory.diff`	What changed since timestamp?	`since`, `topic`, `include_corrections`

Meta capabilities

Capability	Description	Key parameters
`memory.status`	Graph stats: size, staleness, agents	—
`memory.compact`	Merge low-confidence, prune stale	`min_confidence`, `max_staleness`, `dry_run`

4. Recall Strategy

memory.recall is the primary read path. It combines multiple signals, not just text similarity:

Score(fact) = w₁ · text_relevance(context, fact.content)
            + w₂ · topic_match(query_topic, fact.topic)
            + w₃ · recency(fact.generatedAtTime)
            + w₄ · confidence(fact.confidence)
            + w₅ · graph_proximity(context_iris, fact)

Phase 1 (this ADR): ORM-based filtering with Q combinators — topic match, confidence threshold, recency sort. No embedding similarity yet.

Phase 2 (future): Hybrid retrieval combining SPARQL graph traversal with vector similarity from trails.vector (ADR-0019).
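The weighted score above can be sketched in plain Python. Everything concrete here is an assumption: the weight values, the token-overlap stand-in for text_relevance, the seven-day half-life for recency, and the dictionary shape of a fact. The graph_proximity term is passed in as a number because this ADR does not define how it is computed.

```python
from datetime import datetime, timedelta, timezone

# Illustrative weights w1..w5; the ADR does not fix their values.
WEIGHTS = {"text": 0.3, "topic": 0.2, "recency": 0.2, "confidence": 0.2, "graph": 0.1}


def text_relevance(context, content):
    """Fraction of context tokens that also appear in the fact content."""
    ctx = set(context.lower().split())
    if not ctx:
        return 0.0
    return len(ctx & set(content.lower().split())) / len(ctx)


def recency(generated_at, now, half_life=timedelta(days=7)):
    """Exponential decay: 1.0 for a brand-new fact, 0.5 after one half-life."""
    age = max((now - generated_at).total_seconds(), 0.0)
    return 0.5 ** (age / half_life.total_seconds())


def score(fact, context, query_topic, now, graph_proximity=0.0, weights=WEIGHTS):
    """Weighted sum of the five recall signals for one fact."""
    return (
        weights["text"] * text_relevance(context, fact["content"])
        + weights["topic"] * (1.0 if fact["topic"] == query_topic else 0.0)
        + weights["recency"] * recency(fact["generated_at"], now)
        + weights["confidence"] * fact["confidence"]
        + weights["graph"] * graph_proximity
    )


now = datetime(2026, 4, 18, tzinfo=timezone.utc)
fresh = {"content": "deploy script lives in tools/deploy.sh", "topic": "deploy",
         "confidence": 0.9, "generated_at": now - timedelta(hours=1)}
old = {"content": "lint config uses ruff", "topic": "lint",
       "confidence": 0.9, "generated_at": now - timedelta(days=60)}
ranked = sorted([old, fresh],
                key=lambda f: score(f, "where is the deploy script", "deploy", now),
                reverse=True)
```

Keeping each signal in [0, 1] means the weights alone control the trade-off between relevance, freshness, and confidence.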

5. Federated Memory

Facts with scope: "shared" are visible to federated peers. The sharing model follows ADR-0023:

Instance A (Claude Code)           Instance B (Copilot)
┌──────────────────────┐    ┌──────────────────────┐
│  memory.learn(...)    │    │  memory.recall(...)    │
│  scope: "shared"     │    │                       │
│                      │    │  → local facts        │
│  /.well-known/       │◄───│  → SERVICE query to A │
│    brain/sparql      │    │  → merged results     │
└──────────────────────┘    └──────────────────────┘

Cedar policy gates all federated recall. A peer can only see facts that the local instance's policy allows. The DID of the requesting agent is the Cedar principal.

Cost attribution: Federated recall charges the requesting instance's cost envelope (ADR-0012). The serving instance meters the query cost and returns it in the response.
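The gating and cost flow described above can be sketched as two functions, one per side of the exchange. This is a simplified model under stated assumptions: `is_permitted` is a plain Python callable standing in for a Cedar policy decision, the per-fact cost is invented, and the function names and the envelope dictionary are hypothetical.

```python
def serve_federated_recall(facts, requester_did, is_permitted, cost_per_fact=1):
    """Serving instance: expose only shared facts the policy allows.

    The requester's DID plays the role of the policy principal.
    """
    visible = [f for f in facts
               if f["scope"] == "shared" and is_permitted(requester_did, f)]
    # The serving side meters the query and reports the cost in its response.
    return {"facts": visible, "cost": cost_per_fact * len(visible)}


def recall_with_peer(local_facts, peer_response, envelope):
    """Requesting instance: charge its own envelope, then merge results."""
    envelope["spent"] += peer_response["cost"]
    merged = {f["iri"]: f for f in peer_response["facts"]}
    # Local facts win if both sides hold a fact with the same IRI.
    merged.update({f["iri"]: f for f in local_facts})
    return list(merged.values())


instance_a = [
    {"iri": "a:1", "scope": "shared", "topic": "regulatory"},
    {"iri": "a:2", "scope": "private", "topic": "regulatory"},
    {"iri": "a:3", "scope": "shared", "topic": "internal"},
]


def policy(did, fact):
    # Stand-in rule: only this peer, only regulatory facts.
    return did == "did:example:copilot" and fact["topic"] == "regulatory"


response = serve_federated_recall(instance_a, "did:example:copilot", policy)
envelope = {"spent": 0}
results = recall_with_peer([{"iri": "b:1", "scope": "local", "topic": "build"}],
                           response, envelope)
```

Filtering on the serving side keeps policy evaluation next to the data, so a peer never receives facts it is not allowed to see.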

6. Staleness and Compaction

Facts carry a staleness policy:

Policy	TTL	Compact behavior
`stable`	∞	Never auto-pruned
`weekly`	7d	Flagged after 7d, pruned on compact
`daily`	24h	Flagged after 24h
`hourly`	1h	Flagged after 1h
`volatile`	0	Pruned on next compact

memory.compact performs:

1. Remove retracted facts older than retention period (default 30d)
2. Merge duplicate facts (same content, different agents) into a single fact with combined provenance
3. Prune facts below confidence threshold (default 0.3)
4. Prune stale facts past their TTL
5. Return a report of what was pruned (or would be, in dry_run mode)
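The compaction steps can be sketched over a list of dictionaries. This is a minimal sketch, not the Trails implementation: the fact fields (`retracted_at`, `agents`, and so on) and the function signature are assumptions, and every policy with a finite TTL is treated as prunable once the TTL has passed.

```python
from datetime import datetime, timedelta, timezone

# TTL per staleness policy, taken from the table above; None means never stale.
TTL = {
    "stable": None,
    "weekly": timedelta(days=7),
    "daily": timedelta(hours=24),
    "hourly": timedelta(hours=1),
    "volatile": timedelta(0),
}


def compact(facts, now, min_confidence=0.3, retention=timedelta(days=30), dry_run=True):
    """Return (resulting facts, report). With dry_run the input is returned unchanged."""
    report = {"retracted": [], "merged": [], "low_confidence": [], "stale": []}
    kept, live = [], []

    # Step 1: drop retracted facts whose retention period has expired.
    for fact in facts:
        retracted_at = fact.get("retracted_at")
        if retracted_at is None:
            live.append(dict(fact))          # copy so the input is never mutated
        elif now - retracted_at > retention:
            report["retracted"].append(fact["iri"])
        else:
            kept.append(fact)                # still inside the audit window

    # Step 2: merge duplicates (same content), combining the attributed agents.
    by_content = {}
    for fact in live:
        survivor = by_content.get(fact["content"])
        if survivor is None:
            by_content[fact["content"]] = fact
        else:
            survivor["agents"] = sorted(set(survivor["agents"]) | set(fact["agents"]))
            report["merged"].append(fact["iri"])

    # Steps 3 and 4: prune low-confidence facts, then facts past their TTL.
    for fact in by_content.values():
        ttl = TTL[fact["staleness"]]
        if fact["confidence"] < min_confidence:
            report["low_confidence"].append(fact["iri"])
        elif ttl is not None and now - fact["generated_at"] >= ttl:
            report["stale"].append(fact["iri"])
        else:
            kept.append(fact)

    # Step 5: the report is returned either way; dry_run skips applying it.
    return (list(facts) if dry_run else kept), report
```

Defaulting `dry_run` to true in this sketch is a deliberate safety choice, so that a caller has to opt in before anything is pruned.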

7. Integration with Existing ADRs

ADR	Integration
ADR-0009 (Provenance)	Every `memory.learn` emits `prov:wasGeneratedBy`, `prov:wasAttributedTo`, `prov:generatedAtTime`
ADR-0011 (DID Identity)	Agent identity is the DID; facts are attributed to the learning agent's DID
ADR-0012 (Cost)	Learning and recall operations are cost-tracked; federated recall charges the requester
ADR-0018 (Agent Runtime)	Session can auto-persist learned facts to brain on session end
ADR-0023 (Federation)	Shared-scope facts queryable via `SERVICE` keyword; Cedar-gated
ADR-0037 (Hypothesis)	Hypothesis conclusions can be persisted as brain facts with confidence scores

8. MCP Configuration

Any MCP-capable client (Claude Code, Copilot, custom agents) connects to a Trails brain via trails.toml:

[project]
name = "my-brain"
base_iri = "https://trails.dev/brain/my-project/"

[graph]
backend = "oxigraph"
mode = "persistent"
path = ".trails/brain.db"

[brain]
default_scope = "local"          # local | shared | private
default_staleness = "weekly"     # stable | weekly | daily | hourly | volatile
compact_on_startup = false       # run compaction when server starts
max_facts = 100000               # soft limit, warns when exceeded

Claude Code settings.json:

{
  "mcpServers": {
    "trails-memory": {
      "command": "trails",
      "args": ["server", "--transport", "stdio"],
      "cwd": "/path/to/memory-instance"
    }
  }
}

9. Framework Integrations

The memory capabilities are transport-agnostic. Six integration adapters ship with the example:

Adapter	Target frameworks	Mechanism
MCP (built-in)	Claude Code, Copilot, Cursor, Zed	`trails server --transport stdio/sse`
Python SDK	Any Python: LlamaIndex, AutoGen, custom	`from integrations.sdk import memory`
LangChain	LangChain, LangGraph, CrewAI	`@tool` objects via `get_langchain_tools()`
OpenAI	GPT-4o, Codex, Assistants API	JSON schema via `get_openai_tools()`
Anthropic	Claude API (tool_use)	JSON schema via `get_anthropic_tools()`
HTTP	Any language (curl, Go, Rust, TS)	REST client via `BrainClient(url)`

The pattern: capabilities are defined once via @capability, then projected to each framework's native tool format through thin adapters (~100-150 LOC each). No duplication — a new capability automatically appears in all adapters.
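The define-once, project-many pattern can be sketched as a registry plus two thin projection functions. The `@capability` signature, the registry, and the bodies of `get_openai_tools()` and `get_anthropic_tools()` below are assumptions written for illustration; this ADR names those functions but does not show their implementation. Dots in capability names are replaced with underscores because both target tool formats restrict tool names to letters, digits, underscores, and hyphens.

```python
import inspect

REGISTRY = {}
_JSON_TYPES = {str: "string", float: "number", int: "integer", bool: "boolean"}


def capability(name, description):
    """Hypothetical decorator: derive a JSON schema from the function signature."""
    def decorator(fn):
        properties, required = {}, []
        for pname, param in inspect.signature(fn).parameters.items():
            properties[pname] = {"type": _JSON_TYPES.get(param.annotation, "string")}
            if param.default is inspect.Parameter.empty:
                required.append(pname)
        REGISTRY[name] = {
            "fn": fn,
            "description": description,
            "schema": {"type": "object", "properties": properties, "required": required},
        }
        return fn
    return decorator


def _tool_name(name):
    return name.replace(".", "_")


def get_openai_tools():
    """Project every registered capability to the OpenAI function-tool shape."""
    return [{"type": "function",
             "function": {"name": _tool_name(n), "description": c["description"],
                          "parameters": c["schema"]}}
            for n, c in REGISTRY.items()]


def get_anthropic_tools():
    """Project every registered capability to the Anthropic tool shape."""
    return [{"name": _tool_name(n), "description": c["description"],
             "input_schema": c["schema"]}
            for n, c in REGISTRY.items()]


@capability("memory.learn", "Store a new fact with provenance")
def learn(content: str, confidence: float, topic: str = "general"):
    return {"content": content, "confidence": confidence, "topic": topic}
```

Registering a second function with `@capability` makes it appear in both projections with no adapter changes, which is the property the paragraph above describes.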

Non-goals

Embedding-based similarity search — deferred to Phase 2 integration with trails.vector (ADR-0019). Phase 1 uses ORM-based filtering.
Automatic fact extraction from conversations — the agent decides what to learn; the brain doesn't eavesdrop on chat history.
Replacing CLAUDE.md or project instructions — the brain complements static instructions rather than replacing them. Static rules go in CLAUDE.md; learned knowledge goes in the brain.
Multi-tenant brain — one brain instance per project/workspace. Cross-project knowledge sharing happens via federation, not shared tenancy.
Bayesian belief networks — confidence is a scalar float estimated by the learning agent, not a probabilistic graphical model.

Consequences

Positive

Agents stop being amnesiac: knowledge persists across sessions and tools
Cross-agent collaboration becomes possible: Claude Code and Copilot share a brain
Audit trail for agent knowledge: every fact has provenance — essential for regulated industries
Federation enables cross-org knowledge sharing: manufacturer and NB agents share regulatory knowledge under policy control
Progressive enhancement: start with memory.learn/memory.recall, add federation and policies when needed
Trails differentiator: we are not aware of another framework that offers relational + temporal + provenance + federated agent memory

Negative

Graph size growth: agents may over-learn, creating noise. Mitigation: staleness policies + compaction + confidence thresholds.
Recall latency: graph queries are slower than key-value lookups. Mitigation: indexed SPARQL, topic-based partitioning.
Complexity for simple use cases: a single-agent notepad doesn't need a KG. Mitigation: memory.learn/memory.recall can be used without understanding the ontology.

Neutral

Brain facts are regular KG nodes — they participate in SHACL validation, temporal queries, and all other Trails features without special casing.
The brain: namespace is a convention, not a hard requirement. Users can define their own memory ontology and still use the capabilities.

Revisit conditions

When trails.vector (ADR-0019) is stable: integrate embedding similarity into memory.recall scoring
When multi-modal nodes land: support image/PDF facts (e.g., "I saw this error screenshot")
When a second brain consumer exists beyond Claude Code: validate the federation sharing model

Alternatives considered

Vector-only memory (Chroma, Pinecone): Rejected — no relational queries, no provenance, no temporal reasoning, no federation. Good for similarity search, bad for structured knowledge.
Conversation log persistence: Rejected — chat history is noisy, fills up fast, and can't be queried relationally. The brain should store conclusions, not conversations.
File-based memory (current MEMORY.md approach): Rejected as sole solution — works for small projects but doesn't scale, can't be shared across agents, and has no query capability beyond grep.
SQLite/PostgreSQL tables: Rejected — loses the RDF/SPARQL advantage (federation, ontology alignment, PROV-O integration). Would require reimplementing everything Trails already provides.

Open questions

Q: Should memory.recall support natural-language queries in Phase 1? Recommendation: Yes, via simple keyword extraction + Q(content__icontains=...). Full semantic search deferred to Phase 2.
Q: Should the brain auto-compact on startup? Recommendation: Configurable, default off. Users who run many short sessions may want it; long-running servers should compact on a schedule.
Q: How does the brain interact with trails.agent.Session (ADR-0018)? Recommendation: Session gets an optional brain parameter. When set, the session auto-persists key conclusions (hypotheses above confidence threshold, user corrections) to the brain on session.end().
Q: Should facts have a verified_by field for human confirmation? Recommendation: Yes, add in Phase 1. A boolean brain:humanVerified property lets agents distinguish between self-learned and human-confirmed knowledge.
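For the first open question, the recommended Phase 1 approach (keyword extraction followed by a case-insensitive substring match) could look roughly like the sketch below. It uses plain Python lists in place of the Trails ORM, so the `any(...)` filter only approximates what an OR of `Q(content__icontains=...)` clauses would do; the stopword list, function names, and fact shape are assumptions.

```python
import re
from datetime import datetime, timezone

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"the", "and", "for", "what", "which", "how", "does", "with",
             "this", "that", "are", "was", "where", "when"}


def extract_keywords(query):
    """Lowercase, tokenize, and drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z0-9_]+", query.lower())
    return [t for t in tokens if len(t) > 2 and t not in STOPWORDS]


def recall_by_keywords(facts, query, min_confidence=0.0, limit=10):
    """Keep facts whose content contains any keyword, newest first."""
    keywords = extract_keywords(query)
    hits = [f for f in facts
            if f["confidence"] >= min_confidence
            and any(k in f["content"].lower() for k in keywords)]
    hits.sort(key=lambda f: f["generated_at"], reverse=True)
    return hits[:limit]


facts = [
    {"content": "The build command is `make all`", "confidence": 0.9,
     "generated_at": datetime(2026, 4, 10, tzinfo=timezone.utc)},
    {"content": "API keys live in the vault", "confidence": 0.7,
     "generated_at": datetime(2026, 4, 15, tzinfo=timezone.utc)},
    {"content": "User prefers tabs", "confidence": 0.9,
     "generated_at": datetime(2026, 4, 17, tzinfo=timezone.utc)},
]
hits = recall_by_keywords(facts, "What is the build command for the API?")
```

Substring matching is crude (a short keyword can match inside an unrelated word), which is consistent with the recommendation to defer full semantic search to Phase 2.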