ADR-0051: Agent Memory — Persistent Shared Memory for AI Agents

  • Status: Accepted
  • Date: 2026-04-18
  • Depends on: ADR-0009 (Provenance Always On), ADR-0011 (DID Identity), ADR-0018 (Agent Runtime), ADR-0023 (Federation), ADR-0037 (Hypothesis-Driven Loops)
  • Target milestone: M10

Context

AI agents today suffer from persistent amnesia. Every session starts from scratch — the agent re-discovers project structure, re-learns user preferences, and re-derives conclusions it already reached yesterday. Current approaches to agent memory are inadequate:

| Approach | Problem |
| --- | --- |
| Markdown files (CLAUDE.md, MEMORY.md) | Flat text, no relational queries, no provenance, local to one agent |
| Vector databases | Embedding similarity only: no structured reasoning, no temporal queries, no trust chain |
| Conversation history | Dies with the session, can't be shared, fills context windows |
| System prompts | Static, no learning, no cross-agent sharing |

The root issue: memory is treated as a file, not as a service. What agents need is a persistent, structured, queryable, shareable knowledge store with provenance — a brain.

Trails already has every primitive required:

  • Knowledge graph (Oxigraph) for structured relational storage
  • PROV-O (ADR-0009) for tracking who learned what and when
  • Temporal queries for "what did we know at time T?"
  • DIDs (ADR-0011) for agent identity
  • Federation (ADR-0023) for cross-instance knowledge sharing
  • Cedar policies (ADR-0006) for access control on knowledge
  • MCP surface for tool exposure to any MCP-capable client
  • Cost envelopes (ADR-0012) for tracking the cost of learning

What's missing is an opinionated memory protocol — a set of capabilities and an ontology that turns the raw graph into a usable agent memory.

Use cases driving this ADR

  1. Cross-agent continuity: Claude Code learns a fact in the morning; Copilot uses it in the afternoon
  2. Multi-agent collaboration: Agent A and Agent B working on the same codebase share a single knowledge store instead of diverging
  3. Federated knowledge: A manufacturer's agent and a notified body's agent share regulatory knowledge across organizational boundaries (submission-access use case)
  4. Auditable reasoning: Every fact has a provenance chain — which agent learned it, from what source, with what confidence
  5. Temporal reasoning: "What requirements changed since the NB last reviewed?" is a graph query, not a log grep

Decision

1. Brain Ontology

Introduce a brain: namespace (https://trails.dev/ns/brain/v1#) with the following classes:

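@prefix brain: <https://trails.dev/ns/brain/v1#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix prov:  <http://www.w3.org/ns/prov#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .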

# ── Core classes ─────────────────────────────────────────

brain:Fact a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "A learned piece of knowledge with provenance and confidence." .

brain:Correction a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "A fact that supersedes a previous fact (retraction or update)." .

brain:Link a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "An explicit relationship between two facts." .

brain:Topic a owl:Class ;
    rdfs:comment "A subject area for organizing facts." .

brain:Source a owl:Class ;
    rdfs:subClassOf prov:Entity ;
    rdfs:comment "Origin of a fact: file, URL, conversation, observation." .

# ── Properties ───────────────────────────────────────────

brain:content a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "The natural-language content of the fact." .

brain:confidence a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:float ;
    rdfs:comment "Confidence score [0.0, 1.0]." .

brain:topic a owl:ObjectProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range brain:Topic .

brain:source a owl:ObjectProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range brain:Source .

brain:supersedes a owl:ObjectProperty ;
    rdfs:domain brain:Correction ;
    rdfs:range brain:Fact ;
    rdfs:comment "The prior fact this correction replaces." .

brain:relatedTo a owl:ObjectProperty ;
    rdfs:domain brain:Link ;
    rdfs:range brain:Fact .

brain:relationship a owl:DatatypeProperty ;
    rdfs:domain brain:Link ;
    rdfs:range xsd:string ;
    rdfs:comment "Nature of the link: causes, requires, contradicts, supports, etc." .

brain:scope a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "Visibility scope: local (this instance), shared (federated), private (agent-only)." .

brain:staleness a owl:DatatypeProperty ;
    rdfs:domain brain:Fact ;
    rdfs:range xsd:string ;
    rdfs:comment "Decay policy: stable, hourly, daily, weekly, volatile." .

2. Fact Lifecycle

                    memory.learn
    ┌─────────────────────────────────┐
    │         brain:Fact              │
    │  content, confidence, topic,   │
    │  source, scope, staleness      │
    │  prov:wasAttributedTo (DID)    │
    │  prov:generatedAtTime          │
    ├─────────────────────────────────┤
    │                                 │
    │  memory.correct ──► Correction   │
    │  (supersedes original,          │
    │   records reason + new value)   │
    │                                 │
    │  memory.forget ──► Retraction    │
    │  (marks as retracted, keeps     │
    │   audit trail, never hard-      │
    │   deletes)                      │
    │                                 │
    │  memory.link ──► Link            │
    │  (typed relationship to         │
    │   another fact or external IRI) │
    │                                 │
    └─────────────────────────────────┘

Key principle: facts are never hard-deleted. A memory.forget marks the fact as retracted with a prov:wasInvalidatedBy trail. This ensures the audit chain is complete — you can always answer "what did the agent believe at time T, and why did it stop believing it?"
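
The soft-delete principle can be illustrated with a small in-memory sketch. The `MemoryStore` class and its method names are hypothetical stand-ins for the graph-backed implementation; the point is only that `forget` records an invalidation instead of removing the record, so a "believed at time T" question stays answerable.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class StoredFact:
    iri: str
    content: str
    generated_at: datetime
    invalidated_at: Optional[datetime] = None   # set by forget(), never cleared
    invalidation_reason: Optional[str] = None


class MemoryStore:
    """Hypothetical in-memory model of the fact lifecycle (not the Trails API)."""

    def __init__(self) -> None:
        self._facts: Dict[str, StoredFact] = {}

    def learn(self, iri: str, content: str, at: datetime) -> None:
        self._facts[iri] = StoredFact(iri, content, at)

    def forget(self, iri: str, reason: str, at: datetime) -> None:
        # Soft delete: mark the fact as retracted and keep it for the audit trail.
        fact = self._facts[iri]
        fact.invalidated_at = at
        fact.invalidation_reason = reason

    def believed_at(self, t: datetime) -> List[StoredFact]:
        # A fact was believed at t if it existed by then and was not yet retracted.
        return [
            f for f in self._facts.values()
            if f.generated_at <= t and (f.invalidated_at is None or t < f.invalidated_at)
        ]


store = MemoryStore()
store.learn("fact/1", "CI runs on every push", datetime(2026, 4, 1))
store.forget("fact/1", "CI now runs nightly only", datetime(2026, 4, 10))
```

After the `forget` call the record is still present with its reason attached, so both halves of the audit question (what was believed, and why it stopped) can be answered from the same entry.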

3. Capability Surface (10 MCP Tools)

Write capabilities

| Capability | Description | Key parameters |
| --- | --- | --- |
| memory.learn | Store a new fact with provenance | content, confidence, topic, source, scope, staleness |
| memory.correct | Supersede a previous fact | fact_iri, new_content, reason, new_confidence |
| memory.forget | Retract a fact (soft delete) | fact_iri, reason |
| memory.link | Create typed relationship between facts | from_iri, to_iri, relationship |

Read capabilities

| Capability | Description | Key parameters |
| --- | --- | --- |
| memory.recall | Context-aware retrieval | context, topic, min_confidence, max_age, limit |
| memory.query | Structured SPARQL query | sparql |
| memory.explain | Provenance chain for a fact | fact_iri, depth |
| memory.diff | What changed since timestamp? | since, topic, include_corrections |

Meta capabilities

| Capability | Description | Key parameters |
| --- | --- | --- |
| memory.status | Graph stats: size, staleness, agents | |
| memory.compact | Merge low-confidence, prune stale | min_confidence, max_staleness, dry_run |
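
As an illustration of how a client might invoke these capabilities, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0). It assumes the MCP tool names are identical to the capability names and that parameters are passed as plain JSON values; the `tool_call` helper and all argument values are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP tools/call request for the given tool and arguments."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Parameter names follow the tables above; the values are made up for illustration.
learn_request = tool_call(
    "memory.learn",
    content="Tests run with pytest",
    confidence=0.9,
    topic="testing",
    source="README.md",
    scope="local",
    staleness="weekly",
)
recall_request = tool_call(
    "memory.recall", context="how do I run the tests", min_confidence=0.5, limit=5
)
```

In practice an MCP client library would produce these messages; the sketch only shows how the parameter tables map onto the wire format.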

4. Recall Strategy

memory.recall is the primary read path. It combines multiple signals, not just text similarity:

Score(fact) = w₁ · text_relevance(context, fact.content)
            + w₂ · topic_match(query_topic, fact.topic)
            + w₃ · recency(fact.generatedAtTime)
            + w₄ · confidence(fact.confidence)
            + w₅ · graph_proximity(context_iris, fact)

Phase 1 (this ADR): ORM-based filtering with Q combinators — topic match, confidence threshold, recency sort. No embedding similarity yet.

Phase 2 (future): Hybrid retrieval combining SPARQL graph traversal with vector similarity from trails.vector (ADR-0019).
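
The scoring formula can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the weight values, token overlap as the text-relevance signal, exponential decay with a 7-day half-life for recency, and `1 / (1 + hops)` for graph proximity. The ADR fixes only the five signals, not how each is computed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Hypothetical weights w1..w5; they sum to 1 so the score stays in [0, 1].
WEIGHTS: Tuple[float, float, float, float, float] = (0.35, 0.20, 0.15, 0.20, 0.10)


@dataclass
class Candidate:
    content: str
    topic: str
    confidence: float
    generated_at: datetime
    graph_hops: Optional[int] = None  # shortest path from the context IRIs, if any


def text_relevance(context: str, content: str) -> float:
    """Fraction of context tokens that also occur in the fact content."""
    wanted = set(context.lower().split())
    if not wanted:
        return 0.0
    return len(wanted & set(content.lower().split())) / len(wanted)


def recency(generated_at: datetime, now: datetime, half_life_days: float = 7.0) -> float:
    """Exponential decay: 1.0 for a brand-new fact, 0.5 after one half-life."""
    age_days = max((now - generated_at).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)


def graph_proximity(hops: Optional[int]) -> float:
    """Closer facts score higher; facts with no known path score 0."""
    return 0.0 if hops is None else 1.0 / (1.0 + hops)


def score(fact: Candidate, context: str, query_topic: str, now: datetime,
          weights: Tuple[float, float, float, float, float] = WEIGHTS) -> float:
    w1, w2, w3, w4, w5 = weights
    return (
        w1 * text_relevance(context, fact.content)
        + w2 * (1.0 if fact.topic == query_topic else 0.0)
        + w3 * recency(fact.generated_at, now)
        + w4 * fact.confidence
        + w5 * graph_proximity(fact.graph_hops)
    )
```

A Phase 1 implementation would likely apply the topic and confidence filters in the query itself and use a function like this only to order the remaining candidates.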

5. Cross-Agent Sharing via Federation

Facts with scope: "shared" are visible to federated peers. The sharing model follows ADR-0023:

Instance A (Claude Code)           Instance B (Copilot)
┌──────────────────────┐    ┌──────────────────────┐
│  memory.learn(...)    │    │  memory.recall(...)    │
│  scope: "shared"     │    │                       │
│                      │    │  → local facts        │
│  /.well-known/       │◄───│  → SERVICE query to A │
│    brain/sparql      │    │  → merged results     │
└──────────────────────┘    └──────────────────────┘

Cedar policy gates all federated recall. A peer can only see facts that the local instance's policy allows. The DID of the requesting agent is the Cedar principal.

Cost attribution: Federated recall charges the requesting instance's cost envelope (ADR-0012). The serving instance meters the query cost and returns it in the response.
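
A federated recall could be expressed as a SPARQL query with a SERVICE clause against the peer's endpoint. The sketch below only builds the query string; the `federated_recall_query` helper and the selected variables are hypothetical, and the Cedar check and cost metering performed by the serving instance are not shown.

```python
BRAIN_NS = "https://trails.dev/ns/brain/v1#"


def federated_recall_query(peer_base_url: str, min_confidence: float = 0.0,
                           limit: int = 20) -> str:
    """Build a SPARQL query that reads shared-scope facts from a federated peer."""
    endpoint = peer_base_url.rstrip("/") + "/.well-known/brain/sparql"
    # float()/int() coercion keeps arbitrary strings out of the query text.
    return f"""PREFIX brain: <{BRAIN_NS}>
SELECT ?fact ?content ?confidence WHERE {{
  SERVICE <{endpoint}> {{
    ?fact a brain:Fact ;
          brain:content ?content ;
          brain:confidence ?confidence ;
          brain:scope "shared" .
    FILTER(?confidence >= {float(min_confidence)})
  }}
}}
ORDER BY DESC(?confidence)
LIMIT {int(limit)}"""


query = federated_recall_query("https://instance-a.example/", min_confidence=0.5, limit=10)
```

Filtering on `brain:scope "shared"` in the query is a convenience for the requester only; the serving instance's policy remains the authority on what a peer may see.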

6. Staleness and Compaction

Facts carry a staleness policy:

| Policy | TTL | Compact behavior |
| --- | --- | --- |
| stable | | Never auto-pruned |
| weekly | 7d | Flagged after 7d, pruned on compact |
| daily | 24h | Flagged after 24h |
| hourly | 1h | Flagged after 1h |
| volatile | 0 | Pruned on next compact |

memory.compact performs:

  1. Remove retracted facts older than retention period (default 30d)
  2. Merge duplicate facts (same content, different agents) into a single fact with combined provenance
  3. Prune facts below confidence threshold (default 0.3)
  4. Prune stale facts past their TTL
  5. Return a report of what was pruned (or would be, in dry_run mode)
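
The compaction steps can be sketched over an in-memory list. This is an illustration under stated assumptions, not the Trails implementation: every fact past its TTL is treated as prunable, duplicates are detected by identical content and only reported (provenance merging is not modeled), the `max_staleness` parameter is omitted, and the `CompactFact` type is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

# TTL per staleness policy, taken from the table above; None means never auto-pruned.
TTL: Dict[str, Optional[timedelta]] = {
    "stable": None,
    "weekly": timedelta(days=7),
    "daily": timedelta(hours=24),
    "hourly": timedelta(hours=1),
    "volatile": timedelta(0),
}


@dataclass
class CompactFact:
    iri: str
    content: str
    confidence: float
    staleness: str
    generated_at: datetime
    retracted_at: Optional[datetime] = None


def compact(facts: List[CompactFact], now: datetime, min_confidence: float = 0.3,
            retention: timedelta = timedelta(days=30),
            dry_run: bool = True) -> Tuple[List[CompactFact], Dict[str, List[str]]]:
    """Return (facts kept, report); with dry_run=True nothing is removed."""
    report: Dict[str, List[str]] = {
        "expired_retractions": [], "duplicates": [], "low_confidence": [], "stale": [],
    }
    first_seen: Dict[str, str] = {}
    for fact in facts:
        if fact.retracted_at is not None:
            # Step 1: retracted facts are dropped only after the retention period.
            if now - fact.retracted_at > retention:
                report["expired_retractions"].append(fact.iri)
            continue
        if fact.content in first_seen:
            # Step 2: a later copy of the same content counts as a duplicate.
            report["duplicates"].append(fact.iri)
            continue
        first_seen[fact.content] = fact.iri
        ttl = TTL[fact.staleness]
        if fact.confidence < min_confidence:
            report["low_confidence"].append(fact.iri)      # step 3
        elif ttl is not None and now - fact.generated_at > ttl:
            report["stale"].append(fact.iri)               # step 4
    pruned = {iri for iris in report.values() for iri in iris}
    kept = list(facts) if dry_run else [f for f in facts if f.iri not in pruned]
    return kept, report                                     # step 5
```

Defaulting `dry_run` to `True` in this sketch is a deliberate choice: a caller has to opt in before anything is removed, and the same report is produced either way.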

7. Integration with Existing ADRs

| ADR | Integration |
| --- | --- |
| ADR-0009 (Provenance) | Every memory.learn emits prov:wasGeneratedBy, prov:wasAttributedTo, prov:generatedAtTime |
| ADR-0011 (DID Identity) | Agent identity is the DID; facts are attributed to the learning agent's DID |
| ADR-0012 (Cost) | Learning and recall operations are cost-tracked; federated recall charges the requester |
| ADR-0018 (Agent Runtime) | Session can auto-persist learned facts to brain on session end |
| ADR-0023 (Federation) | Shared-scope facts queryable via SERVICE keyword; Cedar-gated |
| ADR-0037 (Hypothesis) | Hypothesis conclusions can be persisted as brain facts with confidence scores |

8. MCP Configuration

Any MCP-capable client (Claude Code, Copilot, custom agents) connects to a Trails brain via trails.toml:

[project]
name = "my-brain"
base_iri = "https://trails.dev/brain/my-project/"

[graph]
backend = "oxigraph"
mode = "persistent"
path = ".trails/brain.db"

[brain]
default_scope = "local"          # local | shared | private
default_staleness = "weekly"     # stable | weekly | daily | hourly | volatile
compact_on_startup = false       # run compaction when server starts
max_facts = 100000               # soft limit, warns when exceeded

Claude Code settings.json:

{
  "mcpServers": {
    "trails-memory": {
      "command": "trails",
      "args": ["server", "--transport", "stdio"],
      "cwd": "/path/to/memory-instance"
    }
  }
}

9. Framework Integrations

The memory capabilities are transport-agnostic. Six integration adapters ship with the example:

| Adapter | Target frameworks | Mechanism |
| --- | --- | --- |
| MCP (built-in) | Claude Code, Copilot, Cursor, Zed | trails server --transport stdio/sse |
| Python SDK | Any Python: LlamaIndex, AutoGen, custom | from integrations.sdk import memory |
| LangChain | LangChain, LangGraph, CrewAI | @tool objects via get_langchain_tools() |
| OpenAI | GPT-4o, Codex, Assistants API | JSON schema via get_openai_tools() |
| Anthropic | Claude API (tool_use) | JSON schema via get_anthropic_tools() |
| HTTP | Any language (curl, Go, Rust, TS) | REST client via BrainClient(url) |

The pattern: capabilities are defined once via @capability, then projected to each framework's native tool format through thin adapters (~100-150 LOC each). No duplication — a new capability automatically appears in all adapters.
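
The define-once, project-many pattern can be sketched as a small registry. This is a hypothetical reimplementation for illustration: the real `@capability` decorator and the adapters' internals are not shown in this ADR, and only the names `get_openai_tools` and `get_anthropic_tools` are borrowed from the table above. Dots in capability names are replaced with underscores, assuming the target APIs do not accept dots in tool names.

```python
import inspect
from typing import Any, Callable, Dict, List

_REGISTRY: Dict[str, Callable[..., Any]] = {}
_JSON_TYPES = {str: "string", float: "number", int: "integer", bool: "boolean"}


def capability(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a function under a capability name (hypothetical decorator)."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        _REGISTRY[name] = fn
        return fn
    return register


def _input_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Derive a JSON Schema object from the function's annotated signature."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for pname, param in inspect.signature(fn).parameters.items():
        properties[pname] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(pname)
    return {"type": "object", "properties": properties, "required": required}


def get_openai_tools() -> List[Dict[str, Any]]:
    """Project every registered capability to the OpenAI function-tool format."""
    return [
        {"type": "function",
         "function": {"name": name.replace(".", "_"),
                      "description": inspect.getdoc(fn) or "",
                      "parameters": _input_schema(fn)}}
        for name, fn in _REGISTRY.items()
    ]


def get_anthropic_tools() -> List[Dict[str, Any]]:
    """Project every registered capability to the Anthropic tool format."""
    return [
        {"name": name.replace(".", "_"),
         "description": inspect.getdoc(fn) or "",
         "input_schema": _input_schema(fn)}
        for name, fn in _REGISTRY.items()
    ]


@capability("memory.learn")
def learn(content: str, confidence: float, topic: str, scope: str = "local") -> str:
    """Store a new fact with provenance."""
    return f"learned: {content}"
```

Because both projections read from the same registry, registering a second capability makes it appear in both tool lists with no adapter changes, which is the property described above.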

Non-goals

  • Embedding-based similarity search — deferred to Phase 2 integration with trails.vector (ADR-0019). Phase 1 uses ORM-based filtering.
  • Automatic fact extraction from conversations — the agent decides what to learn; the brain doesn't eavesdrop on chat history.
  • Replacing CLAUDE.md or project instructions: the brain complements static instructions rather than replacing them. Static rules go in CLAUDE.md; learned knowledge goes in the brain.
  • Multi-tenant brain — one brain instance per project/workspace. Cross-project knowledge sharing happens via federation, not shared tenancy.
  • Bayesian belief networks — confidence is a scalar float estimated by the learning agent, not a probabilistic graphical model.

Consequences

Positive

  • Agents stop being amnesiac: knowledge persists across sessions and tools
  • Cross-agent collaboration becomes possible: Claude Code and Copilot share a brain
  • Audit trail for agent knowledge: every fact has provenance — essential for regulated industries
  • Federation enables cross-org knowledge sharing: manufacturer and NB agents share regulatory knowledge under policy control
  • Progressive enhancement: start with memory.learn/memory.recall, add federation and policies when needed
  • Trails differentiator: to our knowledge, no other framework combines relational, temporal, provenance-aware, and federated agent memory

Negative

  • Graph size growth: agents may over-learn, creating noise. Mitigation: staleness policies + compaction + confidence thresholds.
  • Recall latency: graph queries are slower than key-value lookups. Mitigation: indexed SPARQL, topic-based partitioning.
  • Complexity for simple use cases: a single-agent notepad doesn't need a KG. Mitigation: memory.learn/memory.recall can be used without understanding the ontology.

Neutral

  • Brain facts are regular KG nodes — they participate in SHACL validation, temporal queries, and all other Trails features without special casing.
  • The brain: namespace is a convention, not a hard requirement. Users can define their own memory ontology and still use the capabilities.

Revisit conditions

  • When trails.vector (ADR-0019) is stable: integrate embedding similarity into memory.recall scoring
  • When multi-modal nodes land: support image/PDF facts (e.g., "I saw this error screenshot")
  • When a second brain consumer exists beyond Claude Code: validate the federation sharing model

Alternatives considered

  1. Vector-only memory (Chroma, Pinecone): Rejected — no relational queries, no provenance, no temporal reasoning, no federation. Good for similarity search, bad for structured knowledge.
  2. Conversation log persistence: Rejected — chat history is noisy, fills up fast, and can't be queried relationally. The brain should store conclusions, not conversations.
  3. File-based memory (current MEMORY.md approach): Rejected as sole solution — works for small projects but doesn't scale, can't be shared across agents, and has no query capability beyond grep.
  4. SQLite/PostgreSQL tables: Rejected — loses the RDF/SPARQL advantage (federation, ontology alignment, PROV-O integration). Would require reimplementing everything Trails already provides.

Open questions

  • Q: Should memory.recall support natural-language queries in Phase 1? Recommendation: Yes, via simple keyword extraction + Q(content__icontains=...). Full semantic search deferred to Phase 2.
  • Q: Should the brain auto-compact on startup? Recommendation: Configurable, default off. Users who run many short sessions may want it; long-running servers should compact on a schedule.
  • Q: How does the brain interact with trails.agent.Session (ADR-0018)? Recommendation: Session gets an optional brain parameter. When set, the session auto-persists key conclusions (hypotheses above confidence threshold, user corrections) to the brain on session.end().
  • Q: Should facts have a verified_by field for human confirmation? Recommendation: Yes, add in Phase 1. A boolean brain:humanVerified property lets agents distinguish between self-learned and human-confirmed knowledge.
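
For the first open question, the keyword path could look roughly like the sketch below. It uses plain Python over a list of strings in place of the ORM's `Q(content__icontains=...)` filters; the stopword list, the minimum word length, the OR-combination of keywords, and both function names are assumptions made for illustration.

```python
import re
from typing import List

# Deliberately small, hypothetical stopword list.
STOPWORDS = frozenset({
    "the", "and", "for", "how", "what", "this", "that", "with", "are", "was",
    "does", "from", "which", "when", "where", "why", "who",
})


def extract_keywords(query: str, max_keywords: int = 5) -> List[str]:
    """Lowercase the query and keep distinct words longer than 2 chars that are not stopwords."""
    keywords: List[str] = []
    for word in re.findall(r"[a-z0-9_]+", query.lower()):
        if len(word) > 2 and word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords[:max_keywords]


def recall_by_keywords(query: str, contents: List[str]) -> List[str]:
    """Return contents containing any keyword, case-insensitively (OR semantics)."""
    keywords = extract_keywords(query)
    return [c for c in contents if any(k in c.lower() for k in keywords)]


matches = recall_by_keywords(
    "How do we run the tests?",
    ["Tests run with pytest -q", "The deploy target is staging"],
)
```

With an ORM, the same idea would presumably become one case-insensitive containment filter per keyword, OR-combined; substring matching is crude, which is part of why full semantic search is deferred to Phase 2.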