ADR-0051: Agent Memory - Persistent Shared Memory for AI Agents
- Status: Accepted
- Date: 2026-04-18
- Depends on: ADR-0009 (Provenance Always On), ADR-0011 (DID Identity), ADR-0018 (Agent Runtime), ADR-0023 (Federation), ADR-0037 (Hypothesis-Driven Loops)
- Target milestone: M10
Context
AI agents today suffer from persistent amnesia. Every session starts from scratch — the agent re-discovers project structure, re-learns user preferences, and re-derives conclusions it already reached yesterday. Current approaches to agent memory are inadequate:
| Approach | Problem |
|---|---|
| Markdown files (CLAUDE.md, MEMORY.md) | Flat text, no relational queries, no provenance, local to one agent |
| Vector databases | Embedding similarity only — no structured reasoning, no temporal queries, no trust chain |
| Conversation history | Dies with the session, can't be shared, fills context windows |
| System prompts | Static, no learning, no cross-agent sharing |
The root issue: memory is treated as a file, not as a service. What agents need is a persistent, structured, queryable, shareable knowledge store with provenance — a brain.
Trails already has every primitive required:
- Knowledge graph (Oxigraph) for structured relational storage
- PROV-O (ADR-0009) for tracking who learned what and when
- Temporal queries for "what did we know at time T?"
- DIDs (ADR-0011) for agent identity
- Federation (ADR-0023) for cross-instance knowledge sharing
- Cedar policies (ADR-0006) for access control on knowledge
- MCP surface for tool exposure to any MCP-capable client
- Cost envelopes (ADR-0012) for tracking the cost of learning
What's missing is an opinionated memory protocol — a set of capabilities and an ontology that turns the raw graph into a usable agent memory.
Use cases driving this ADR
- Cross-agent continuity: Claude Code learns a fact in the morning; Copilot uses it in the afternoon
- Multi-agent collaboration: Agent A and Agent B working on the same codebase share a single knowledge store instead of diverging
- Federated knowledge: A manufacturer's agent and a notified body's agent share regulatory knowledge across organizational boundaries (submission-access use case)
- Auditable reasoning: Every fact has a provenance chain — which agent learned it, from what source, with what confidence
- Temporal reasoning: "What requirements changed since the NB last reviewed?" is a graph query, not a log grep
Decision
1. Brain Ontology
Introduce a brain: namespace (https://trails.dev/ns/brain/v1#) with the following classes and properties:
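@prefix brain: <https://trails.dev/ns/brain/v1#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .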
# ── Core classes ─────────────────────────────────────────
brain:Fact a owl:Class ;
rdfs:subClassOf prov:Entity ;
rdfs:comment "A learned piece of knowledge with provenance and confidence." .
brain:Correction a owl:Class ;
rdfs:subClassOf prov:Entity ;
rdfs:comment "A fact that supersedes a previous fact (retraction or update)." .
brain:Link a owl:Class ;
rdfs:subClassOf prov:Entity ;
rdfs:comment "An explicit relationship between two facts." .
brain:Topic a owl:Class ;
rdfs:comment "A subject area for organizing facts." .
brain:Source a owl:Class ;
rdfs:subClassOf prov:Entity ;
rdfs:comment "Origin of a fact: file, URL, conversation, observation." .
# ── Properties ───────────────────────────────────────────
brain:content a owl:DatatypeProperty ;
rdfs:domain brain:Fact ;
rdfs:range xsd:string ;
rdfs:comment "The natural-language content of the fact." .
brain:confidence a owl:DatatypeProperty ;
rdfs:domain brain:Fact ;
rdfs:range xsd:float ;
rdfs:comment "Confidence score [0.0, 1.0]." .
brain:topic a owl:ObjectProperty ;
rdfs:domain brain:Fact ;
rdfs:range brain:Topic .
brain:source a owl:ObjectProperty ;
rdfs:domain brain:Fact ;
rdfs:range brain:Source .
brain:supersedes a owl:ObjectProperty ;
rdfs:domain brain:Correction ;
rdfs:range brain:Fact ;
rdfs:comment "The prior fact this correction replaces." .
brain:relatedTo a owl:ObjectProperty ;
rdfs:domain brain:Link ;
rdfs:range brain:Fact .
brain:relationship a owl:DatatypeProperty ;
rdfs:domain brain:Link ;
rdfs:range xsd:string ;
rdfs:comment "Nature of the link: causes, requires, contradicts, supports, etc." .
brain:scope a owl:DatatypeProperty ;
rdfs:domain brain:Fact ;
rdfs:range xsd:string ;
rdfs:comment "Visibility scope: local (this instance), shared (federated), private (agent-only)." .
brain:staleness a owl:DatatypeProperty ;
rdfs:domain brain:Fact ;
rdfs:range xsd:string ;
rdfs:comment "Decay policy: stable, hourly, daily, weekly, volatile." .
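The constraints that the ontology states only in rdfs:comment strings (the confidence range and the allowed scope and staleness values) could be enforced on the client side before a fact is written. The following is a minimal sketch under that assumption; the `Fact` dataclass, its field names, and the `agent_did` attribute are hypothetical and not part of the Trails API. The defaults mirror the `default_scope` and `default_staleness` values shown in the configuration section.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed values copied from the rdfs:comment strings in the ontology.
SCOPES = {"local", "shared", "private"}
STALENESS = {"stable", "hourly", "daily", "weekly", "volatile"}


@dataclass(frozen=True)
class Fact:
    """Hypothetical in-memory mirror of a brain:Fact node."""
    content: str
    confidence: float
    topic: str
    source: str
    agent_did: str                       # would map to prov:wasAttributedTo
    scope: str = "local"
    staleness: str = "weekly"
    generated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )                                    # would map to prov:generatedAtTime

    def __post_init__(self):
        # Reject values the ontology comments describe as out of range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0.0, 1.0]")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope!r}")
        if self.staleness not in STALENESS:
            raise ValueError(f"unknown staleness: {self.staleness!r}")
```

Validating early keeps malformed facts out of the graph even when SHACL validation is not configured.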
2. Fact Lifecycle
memory.learn
│
▼
┌─────────────────────────────────┐
│ brain:Fact │
│ content, confidence, topic, │
│ source, scope, staleness │
│ prov:wasAttributedTo (DID) │
│ prov:generatedAtTime │
├─────────────────────────────────┤
│ │
│ memory.correct ──► Correction │
│ (supersedes original, │
│ records reason + new value) │
│ │
│ memory.forget ──► Retraction │
│ (marks as retracted, keeps │
│ audit trail, never hard- │
│ deletes) │
│ │
│ memory.link ──► Link │
│ (typed relationship to │
│ another fact or external IRI) │
│ │
└─────────────────────────────────┘
Key principle: facts are never hard-deleted. A memory.forget marks the fact as retracted with a prov:wasInvalidatedBy trail. This ensures the audit chain is complete — you can always answer "what did the agent believe at time T, and why did it stop believing it?"
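The soft-delete principle can be illustrated with a small in-memory model. This is a sketch only: `SoftDeleteStore`, `Record`, and `believed_at` are hypothetical names, and the real implementation would record invalidation in the graph through prov:wasInvalidatedBy rather than in Python attributes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Record:
    iri: str
    content: str
    learned_at: datetime
    invalidated_at: Optional[datetime] = None   # set by forget/correct
    invalidation_reason: Optional[str] = None
    supersedes: Optional[str] = None            # IRI of the corrected fact


class SoftDeleteStore:
    """Hypothetical store: records are invalidated, never removed."""

    def __init__(self):
        self._records = {}

    def learn(self, iri, content, at, supersedes=None):
        self._records[iri] = Record(iri, content, at, supersedes=supersedes)
        return iri

    def forget(self, iri, reason, at):
        # Mark as retracted and keep the record for the audit trail.
        rec = self._records[iri]
        rec.invalidated_at = at
        rec.invalidation_reason = reason

    def correct(self, iri, new_iri, new_content, reason, at):
        # A correction retracts the old fact and links the new one to it.
        self.forget(iri, reason, at)
        return self.learn(new_iri, new_content, at, supersedes=iri)

    def believed_at(self, t):
        """Contents of all facts that were valid at time t."""
        return sorted(
            r.content for r in self._records.values()
            if r.learned_at <= t
            and (r.invalidated_at is None or r.invalidated_at > t)
        )


def at_hour(hour):
    return datetime(2026, 4, 18, hour, tzinfo=timezone.utc)


store = SoftDeleteStore()
store.learn("f1", "CI uses Python 3.11", at_hour(9))
store.correct("f1", "f2", "CI uses Python 3.12", "runner upgraded", at_hour(12))
print(store.believed_at(at_hour(10)))   # belief before the correction
print(store.believed_at(at_hour(13)))   # belief after the correction
```

Because the original record survives with its invalidation time and reason, both halves of the audit question (what was believed, and why it stopped being believed) remain answerable.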
3. Capability Surface (10 MCP Tools)
Write capabilities
| Capability | Description | Key parameters |
|---|---|---|
| memory.learn | Store a new fact with provenance | content, confidence, topic, source, scope, staleness |
| memory.correct | Supersede a previous fact | fact_iri, new_content, reason, new_confidence |
| memory.forget | Retract a fact (soft delete) | fact_iri, reason |
| memory.link | Create typed relationship between facts | from_iri, to_iri, relationship |
Read capabilities
| Capability | Description | Key parameters |
|---|---|---|
| memory.recall | Context-aware retrieval | context, topic, min_confidence, max_age, limit |
| memory.query | Structured SPARQL query | sparql |
| memory.explain | Provenance chain for a fact | fact_iri, depth |
| memory.diff | What changed since timestamp? | since, topic, include_corrections |
Meta capabilities
| Capability | Description | Key parameters |
|---|---|---|
| memory.status | Graph stats: size, staleness, agents | (none) |
| memory.compact | Merge low-confidence, prune stale | min_confidence, max_staleness, dry_run |
4. Recall Strategy
memory.recall is the primary read path. It combines multiple signals, not just text similarity:
Score(fact) = w₁ · text_relevance(context, fact.content)
+ w₂ · topic_match(query_topic, fact.topic)
+ w₃ · recency(fact.generatedAtTime)
+ w₄ · confidence(fact.confidence)
+ w₅ · graph_proximity(context_iris, fact)
Phase 1 (this ADR): ORM-based filtering with Q combinators — topic match, confidence threshold, recency sort. No embedding similarity yet.
Phase 2 (future): Hybrid retrieval combining SPARQL graph traversal with vector similarity from trails.vector (ADR-0019).
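A rough sketch of the Phase 1 behaviour and of the scoring formula follows. It works on plain dictionaries instead of the Trails ORM, so `recall_phase1`, `score`, the weight defaults, and the exponential half-life used for recency are all assumptions made for illustration. The w₁ (text relevance) and w₅ (graph proximity) terms are left out because this ADR does not define them.

```python
from datetime import datetime, timedelta, timezone


def recall_phase1(facts, topic=None, min_confidence=0.0, max_age=None,
                  limit=10, now=None):
    """Phase 1 style recall: topic match, confidence threshold, recency sort."""
    now = now or datetime.now(timezone.utc)
    hits = [
        f for f in facts
        if (topic is None or f["topic"] == topic)
        and f["confidence"] >= min_confidence
        and (max_age is None or now - f["generated_at"] <= max_age)
    ]
    hits.sort(key=lambda f: f["generated_at"], reverse=True)  # newest first
    return hits[:limit]


def score(fact, query_topic, now, w_topic=0.4, w_recency=0.3, w_conf=0.3,
          half_life=timedelta(days=7)):
    """Weighted sum of the topic, recency and confidence terms of the formula."""
    topic_match = 1.0 if fact["topic"] == query_topic else 0.0
    age = now - fact["generated_at"]
    recency = 0.5 ** (age / half_life)  # assumed decay: halves every half_life
    return w_topic * topic_match + w_recency * recency + w_conf * fact["confidence"]


now = datetime(2026, 4, 18, tzinfo=timezone.utc)
facts = [
    {"content": "a", "topic": "build", "confidence": 0.9,
     "generated_at": now - timedelta(days=1)},
    {"content": "b", "topic": "build", "confidence": 0.2,
     "generated_at": now},
    {"content": "c", "topic": "deploy", "confidence": 0.8,
     "generated_at": now - timedelta(days=30)},
]
hits = recall_phase1(facts, topic="build", min_confidence=0.5, now=now)
```

The filter corresponds to what Phase 1 promises; the score function shows how the remaining signals could be folded in later without changing the call site.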
5. Cross-Agent Sharing via Federation
Facts with scope: "shared" are visible to federated peers. The sharing model follows ADR-0023:
Instance A (Claude Code) Instance B (Copilot)
┌──────────────────────┐ ┌──────────────────────┐
│ memory.learn(...) │ │ memory.recall(...) │
│ scope: "shared" │ │ │
│ │ │ → local facts │
│ /.well-known/ │◄───│ → SERVICE query to A │
│ brain/sparql │ │ → merged results │
└──────────────────────┘ └──────────────────────┘
Cedar policy gates all federated recall. A peer can only see facts that the local instance's policy allows. The DID of the requesting agent is the Cedar principal.
Cost attribution: Federated recall charges the requesting instance's cost envelope (ADR-0012). The serving instance meters the query cost and returns it in the response.
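The two rules above (policy-gated visibility and requester-pays cost attribution) might look roughly like this. Everything here is a hypothetical stand-in: `is_permitted` replaces a real Cedar evaluation, the per-fact cost unit is invented, and the real exchange would happen through a SPARQL SERVICE query rather than a Python call.

```python
def serve_federated_recall(facts, requester_did, is_permitted, cost_per_fact=1):
    """Serving side: expose only shared facts the policy allows, plus metered cost."""
    visible = [
        f for f in facts
        if f["scope"] == "shared" and is_permitted(requester_did, f)
    ]
    return {"facts": visible, "cost": cost_per_fact * len(visible)}


def recall_with_peer(local_facts, peer_response, envelope):
    """Requesting side: charge the requester's envelope, then merge by IRI."""
    envelope["spent"] += peer_response["cost"]
    seen, merged = set(), []
    for f in local_facts + peer_response["facts"]:
        if f["iri"] not in seen:          # local copy wins on duplicates
            seen.add(f["iri"])
            merged.append(f)
    return merged


# Instance A holds three facts; only one is both shared and permitted.
peer_facts = [
    {"iri": "a:1", "scope": "shared", "topic": "regulatory"},
    {"iri": "a:2", "scope": "local", "topic": "regulatory"},
    {"iri": "a:3", "scope": "shared", "topic": "secret"},
]


def allow(did, fact):
    # Stand-in policy: one known peer, and never the "secret" topic.
    return did == "did:example:b" and fact["topic"] != "secret"


response = serve_federated_recall(peer_facts, "did:example:b", allow)
envelope = {"spent": 0}
merged = recall_with_peer(
    [{"iri": "b:1", "scope": "local", "topic": "regulatory"}], response, envelope
)
```

Filtering on the serving side means a peer never receives facts it is not allowed to see, so the requester does not have to be trusted to discard them.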
6. Staleness and Compaction
Facts carry a staleness policy:
| Policy | TTL | Compact behavior |
|---|---|---|
| stable | ∞ | Never auto-pruned |
| weekly | 7d | Flagged after 7d, pruned on compact |
| daily | 24h | Flagged after 24h |
| hourly | 1h | Flagged after 1h |
| volatile | 0 | Pruned on next compact |
memory.compact performs:
1. Remove retracted facts older than retention period (default 30d)
2. Merge duplicate facts (same content, different agents) into a single fact with combined provenance
3. Prune facts below confidence threshold (default 0.3)
4. Prune stale facts past their TTL
5. Return a report of what was pruned (or would be, in dry_run mode)
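The pruning steps can be sketched as a single pass over the facts. This is an illustration under stated assumptions: facts are plain dictionaries, `compact` and the report keys are hypothetical, every policy past its TTL is treated as prunable, and step 2 (merging duplicates) is omitted for brevity.

```python
from datetime import datetime, timedelta, timezone

# TTLs from the staleness table; None means never auto-pruned.
TTL = {
    "stable": None,
    "weekly": timedelta(days=7),
    "daily": timedelta(hours=24),
    "hourly": timedelta(hours=1),
    "volatile": timedelta(0),
}


def compact(facts, now, min_confidence=0.3, retention=timedelta(days=30),
            dry_run=True):
    """Return (facts_after, report); with dry_run the input is returned as is."""
    report = {"retracted": [], "low_confidence": [], "stale": []}
    kept = []
    for f in facts:
        ttl = TTL[f["staleness"]]
        retracted_at = f.get("retracted_at")
        if retracted_at is not None:
            # Retracted facts are only dropped once the retention period ends.
            bucket = "retracted" if now - retracted_at > retention else None
        elif f["confidence"] < min_confidence:
            bucket = "low_confidence"
        elif ttl is not None and now - f["generated_at"] > ttl:
            bucket = "stale"
        else:
            bucket = None
        if bucket:
            report[bucket].append(f["iri"])
        else:
            kept.append(f)
    return (facts if dry_run else kept), report


now = datetime(2026, 4, 18, tzinfo=timezone.utc)


def days_ago(n):
    return now - timedelta(days=n)


facts = [
    {"iri": "f1", "confidence": 0.9, "staleness": "stable", "generated_at": days_ago(400)},
    {"iri": "f2", "confidence": 0.1, "staleness": "stable", "generated_at": days_ago(1)},
    {"iri": "f3", "confidence": 0.9, "staleness": "weekly", "generated_at": days_ago(8)},
    {"iri": "f4", "confidence": 0.9, "staleness": "stable", "generated_at": days_ago(90),
     "retracted_at": days_ago(40)},
    {"iri": "f5", "confidence": 0.9, "staleness": "stable", "generated_at": days_ago(90),
     "retracted_at": days_ago(5)},
]
unchanged, report = compact(facts, now)            # dry run: nothing removed
kept, _ = compact(facts, now, dry_run=False)
```

Running with `dry_run=True` first gives the same report as a real run, which makes it possible to review what would be pruned before anything is removed.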
7. Integration with Existing ADRs
| ADR | Integration |
|---|---|
| ADR-0009 (Provenance) | Every memory.learn emits prov:wasGeneratedBy, prov:wasAttributedTo, prov:generatedAtTime |
| ADR-0011 (DID Identity) | Agent identity is the DID; facts are attributed to the learning agent's DID |
| ADR-0012 (Cost) | Learning and recall operations are cost-tracked; federated recall charges the requester |
| ADR-0018 (Agent Runtime) | Session can auto-persist learned facts to brain on session end |
| ADR-0023 (Federation) | Shared-scope facts queryable via SERVICE keyword; Cedar-gated |
| ADR-0037 (Hypothesis) | Hypothesis conclusions can be persisted as brain facts with confidence scores |
8. MCP Configuration
Any MCP-capable client (Claude Code, Copilot, custom agents) can connect to a Trails brain. The brain instance itself is configured in trails.toml:
[project]
name = "my-brain"
base_iri = "https://trails.dev/brain/my-project/"
[graph]
backend = "oxigraph"
mode = "persistent"
path = ".trails/brain.db"
[brain]
default_scope = "local" # local | shared | private
default_staleness = "weekly" # stable | weekly | daily | hourly | volatile
compact_on_startup = false # run compaction when server starts
max_facts = 100000 # soft limit, warns when exceeded
Claude Code settings.json:
{
"mcpServers": {
"trails-memory": {
"command": "trails",
"args": ["server", "--transport", "stdio"],
"cwd": "/path/to/memory-instance"
}
}
}
9. Framework Integrations
The memory capabilities are transport-agnostic. Six integration adapters ship with the example:
| Adapter | Target frameworks | Mechanism |
|---|---|---|
| MCP (built-in) | Claude Code, Copilot, Cursor, Zed | trails server --transport stdio/sse |
| Python SDK | Any Python: LlamaIndex, AutoGen, custom | from integrations.sdk import memory |
| LangChain | LangChain, LangGraph, CrewAI | @tool objects via get_langchain_tools() |
| OpenAI | GPT-4o, Codex, Assistants API | JSON schema via get_openai_tools() |
| Anthropic | Claude API (tool_use) | JSON schema via get_anthropic_tools() |
| HTTP | Any language (curl, Go, Rust, TS) | REST client via BrainClient(url) |
The pattern: capabilities are defined once via @capability, then projected to each framework's native tool format through thin adapters (~100-150 LOC each). No duplication — a new capability automatically appears in all adapters.
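The define-once, project-many pattern could be sketched as follows. The `capability` decorator here is a hypothetical stand-in for the real @capability, and `to_openai_tools` / `to_anthropic_tools` are illustrative names rather than the shipped `get_openai_tools()` / `get_anthropic_tools()`. The output shapes follow the publicly documented tool formats of the two APIs; replacing the dot in the tool name is an assumption made because those APIs restrict the characters allowed in tool names.

```python
import inspect

_REGISTRY = {}


def capability(name, description):
    """Hypothetical decorator: register a function and derive a JSON schema."""
    def wrap(fn):
        params = inspect.signature(fn).parameters.values()
        schema = {
            "type": "object",
            # Only str and float are distinguished in this sketch.
            "properties": {
                p.name: {"type": "number" if p.annotation is float else "string"}
                for p in params
            },
            "required": [
                p.name for p in params if p.default is inspect.Parameter.empty
            ],
        }
        _REGISTRY[name] = {"name": name, "description": description,
                           "schema": schema, "fn": fn}
        return fn
    return wrap


def to_openai_tools():
    # Project every registered capability to the OpenAI function-tool shape.
    return [
        {"type": "function",
         "function": {"name": c["name"].replace(".", "_"),
                      "description": c["description"],
                      "parameters": c["schema"]}}
        for c in _REGISTRY.values()
    ]


def to_anthropic_tools():
    # Same registry, projected to the Anthropic tool_use shape.
    return [
        {"name": c["name"].replace(".", "_"),
         "description": c["description"],
         "input_schema": c["schema"]}
        for c in _REGISTRY.values()
    ]


@capability("memory.learn", "Store a new fact with provenance")
def learn(content: str, confidence: float, topic: str = "general"):
    return {"content": content, "confidence": confidence, "topic": topic}


openai_tools = to_openai_tools()
anthropic_tools = to_anthropic_tools()
```

Because both projections read from the same registry, registering one more capability makes it appear in every adapter without touching the adapter code.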
Non-goals
- Embedding-based similarity search: deferred to Phase 2 integration with trails.vector (ADR-0019). Phase 1 uses ORM-based filtering.
- Automatic fact extraction from conversations: the agent decides what to learn; the brain doesn't eavesdrop on chat history.
- Replacing CLAUDE.md or project instructions: the brain complements static instructions rather than replacing them. Static rules go in CLAUDE.md; learned knowledge goes in the brain.
- Multi-tenant brain: one brain instance per project/workspace. Cross-project knowledge sharing happens via federation, not shared tenancy.
- Bayesian belief networks: confidence is a scalar float estimated by the learning agent, not a probabilistic graphical model.
Consequences
Positive
- Agents stop being amnesiac: knowledge persists across sessions and tools
- Cross-agent collaboration becomes possible: Claude Code and Copilot share a brain
- Audit trail for agent knowledge: every fact has provenance — essential for regulated industries
- Federation enables cross-org knowledge sharing: manufacturer and NB agents share regulatory knowledge under policy control
- Progressive enhancement: start with memory.learn/memory.recall, add federation and policies when needed
- Trails differentiator: no other framework offers relational + temporal + provenance + federated agent memory
Negative
- Graph size growth: agents may over-learn, creating noise. Mitigation: staleness policies + compaction + confidence thresholds.
- Recall latency: graph queries are slower than key-value lookups. Mitigation: indexed SPARQL, topic-based partitioning.
- Complexity for simple use cases: a single-agent notepad doesn't need a KG. Mitigation: memory.learn/memory.recall can be used without understanding the ontology.
Neutral
- Brain facts are regular KG nodes — they participate in SHACL validation, temporal queries, and all other Trails features without special casing.
- The brain: namespace is a convention, not a hard requirement. Users can define their own memory ontology and still use the capabilities.
Revisit conditions
- When trails.vector (ADR-0019) is stable: integrate embedding similarity into memory.recall scoring
- When multi-modal nodes land: support image/PDF facts (e.g., "I saw this error screenshot")
- When a second brain consumer exists beyond Claude Code: validate the federation sharing model
Alternatives considered
- Vector-only memory (Chroma, Pinecone): Rejected — no relational queries, no provenance, no temporal reasoning, no federation. Good for similarity search, bad for structured knowledge.
- Conversation log persistence: Rejected — chat history is noisy, fills up fast, and can't be queried relationally. The brain should store conclusions, not conversations.
- File-based memory (current MEMORY.md approach): Rejected as sole solution — works for small projects but doesn't scale, can't be shared across agents, and has no query capability beyond grep.
- SQLite/PostgreSQL tables: Rejected — loses the RDF/SPARQL advantage (federation, ontology alignment, PROV-O integration). Would require reimplementing everything Trails already provides.
Open questions
- Q: Should memory.recall support natural-language queries in Phase 1? Recommendation: Yes, via simple keyword extraction + Q(content__icontains=...). Full semantic search deferred to Phase 2.
- Q: Should the brain auto-compact on startup? Recommendation: Configurable, default off. Users who run many short sessions may want it; long-running servers should compact on a schedule.
- Q: How does the brain interact with trails.agent.Session (ADR-0018)? Recommendation: Session gets an optional brain parameter. When set, the session auto-persists key conclusions (hypotheses above confidence threshold, user corrections) to the brain on session.end().
- Q: Should facts have a verified_by field for human confirmation? Recommendation: Yes, add in Phase 1. A boolean brain:humanVerified property lets agents distinguish between self-learned and human-confirmed knowledge.