ADR-0053: Memory Trust Boundaries — Read Isolation, Data Classification, and Federation Trust
- Status: Accepted (2026-04-19)
- Date: 2026-04-18
- Depends on: ADR-0006 (Cedar Policy), ADR-0011 (DID Identity), ADR-0023 (Federation), ADR-0051 (Agent Memory), ADR-0052 (Memory Security)
- Target milestone: M11
Context
ADR-0052 established application-layer separation for the memory write path: agents cannot forge identity, inflate confidence, or tamper with provenance. But the write path is only half the trust model. Five additional separation concerns remain unaddressed:
- Read path isolation — which agents can see which facts?
- Computation vs. storage — process-level separation of gateway and graph store
- Content vs. metadata — PII in fact content vs. non-PII provenance (GDPR)
- Inference vs. ground truth — agent-derived conclusions vs. externally sourced facts
- Federation trust — cross-instance fact exchange under adversarial conditions
Each of these is a trust boundary where confusion leads to data leakage, compliance violations, or trust chain corruption.
Decision
1. Read Path Isolation
Problem
Any agent connected to the memory can recall any fact matching its scope filter. A code assistant agent recalling security vulnerability details could leak them into generated code, PR descriptions, or LLM context that gets logged by third-party services.
Security scanner learns:
"CVE-2026-1234 affects auth module — RCE via crafted JWT"
topic: "security", scope: "shared"
Code assistant recalls (unintentionally):
memory.recall(context="auth module JWT")
→ Gets the CVE detail in its LLM context
→ Includes it in a PR description on GitHub
Solution: Topic-Scoped Read Policies
Extend Cedar policies to cover memory.recall with topic-level read restrictions:
// Security scanner can read all topics
permit(
    principal == Agent::"did:key:z6MkSecScanner",
    action == Action::"memory.recall",
    resource
);

// Code assistant cannot read security-classified topics
forbid(
    principal == Agent::"did:key:z6MkCodeAssist",
    action == Action::"memory.recall",
    resource
) when {
    resource has classification &&
    resource.classification == "security"
};

// Default: agents can read topics they have written to.
// writable_topics is a set of strings, so membership uses .contains();
// Cedar's `in` operator is reserved for entity hierarchy checks.
permit(
    principal,
    action == Action::"memory.recall",
    resource
) when {
    resource has topic &&
    principal has writable_topics &&
    principal.writable_topics.contains(resource.topic)
};
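How these three policies combine depends on Cedar's evaluation rule: a request is allowed only when at least one permit matches and no forbid matches. The Python sketch below mirrors that rule for the policies above. It is an illustration rather than a Cedar engine, and the `Agent` and `Fact` dataclasses and the `can_recall` function are hypothetical stand-ins for the real entities.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

SEC_SCANNER = "did:key:z6MkSecScanner"
CODE_ASSIST = "did:key:z6MkCodeAssist"


@dataclass
class Agent:
    did: str
    writable_topics: Set[str] = field(default_factory=set)


@dataclass
class Fact:
    topic: str
    classification: Optional[str] = None


def can_recall(agent: Agent, fact: Fact) -> bool:
    """Hypothetical mirror of the three memory.recall policies."""
    # A matching forbid wins over every permit.
    if agent.did == CODE_ASSIST and fact.classification == "security":
        return False
    # Permit 1: the security scanner may read all topics.
    if agent.did == SEC_SCANNER:
        return True
    # Permit 2: agents may read topics they have written to.
    if fact.topic in agent.writable_topics:
        return True
    # No permit matched: deny by default.
    return False


cve = Fact(topic="security", classification="security")
assistant = Agent(CODE_ASSIST, {"auth", "security"})
print(can_recall(assistant, cve))           # False
print(can_recall(Agent(SEC_SCANNER), cve))  # True
```

The forbid applies even though this assistant has written to the security topic, which is the behavior the read-isolation design relies on.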
Fact Classification Levels
Facts carry an optional classification that controls read access independently of scope:
class FactClassification:
    PUBLIC = "public"              # any agent can read
    INTERNAL = "internal"          # authenticated agents only
    RESTRICTED = "restricted"      # agents with matching topic clearance
    CONFIDENTIAL = "confidential"  # named agents only (explicit grant)
    SECURITY = "security"          # security-cleared agents only
The gateway injects classification based on topic rules in trails.toml:
[memory.classification]
default = "internal"
[memory.classification.rules]
security = "security" # topic "security" → classification "security"
vulnerability = "security"
pii = "confidential"
clinical = "confidential"
config = "internal"
general = "public"
LLM Context Leakage Prevention
When a fact enters an LLM's context window, it can influence anything the model outputs for the rest of the session, including text that is later logged or published elsewhere. For classified facts, the gateway can:
- Redact: replace sensitive content with a reference, e.g. "[CLASSIFIED: security fact #42; use memory.explain for details with clearance]"
- Summarize: return a sanitized summary instead of the raw content
- Deny: return zero results with an explanation of why
[memory.classification.leakage_policy]
security = "deny" # never surface to non-cleared agents
confidential = "redact" # show existence but not content
restricted = "summarize" # LLM-generated summary without raw detail
internal = "allow" # full content
public = "allow"
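The two tables above (topic rules and leakage policy) can be read as a pair of lookups applied on every recall for an agent without clearance. The sketch below assumes exactly that. The function names `classify` and `filter_for_uncleared` are hypothetical, the summarizer is passed in as a callable because this ADR does not specify one, and treating an unknown classification as "deny" is an assumption, not something the ADR states.

```python
from typing import Callable, Optional

CLASSIFICATION_RULES = {
    "security": "security",
    "vulnerability": "security",
    "pii": "confidential",
    "clinical": "confidential",
    "config": "internal",
    "general": "public",
}

LEAKAGE_POLICY = {
    "security": "deny",
    "confidential": "redact",
    "restricted": "summarize",
    "internal": "allow",
    "public": "allow",
}


def classify(topic: str, default: str = "internal") -> str:
    """Map a topic to its classification, falling back to the default."""
    return CLASSIFICATION_RULES.get(topic, default)


def filter_for_uncleared(
    fact_id: int,
    content: str,
    classification: str,
    summarize: Callable[[str], str],
) -> Optional[str]:
    """Return what an agent without clearance receives, or None for deny."""
    # Assumption: unknown classifications fail closed (treated as "deny").
    action = LEAKAGE_POLICY.get(classification, "deny")
    if action == "allow":
        return content
    if action == "redact":
        return f"[CLASSIFIED: {classification} fact #{fact_id}]"
    if action == "summarize":
        return summarize(content)
    return None
```

Failing closed keeps a misconfigured or newly introduced classification from silently exposing raw content.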
2. Computation vs. Storage Separation
Problem
In the Phase 1 example, gateway logic and graph storage share a Python process. A vulnerability in the gateway (e.g., SPARQL injection that bypasses the proxy) gives direct access to the storage layer, bypassing all policy enforcement.
Solution: Process-Level Isolation
For production deployments, the memory system runs as three separate processes:
┌──────────────┐     ┌──────────────────┐     ┌──────────────┐
│  MCP/HTTP    │     │  Memory Gateway  │     │  Oxigraph    │
│  Transport   │────▶│                  │────▶│  (Storage)   │
│              │     │  - Auth          │     │              │
│  Agents      │◀────│  - Cedar eval    │◀────│  Provenance  │
│  connect     │     │  - Confidence    │     │  graph       │
│  here        │     │  - Hash chain    │     │  (append-    │
│              │     │  - Rate limit    │     │   only)      │
└──────────────┘     └──────────────────┘     └──────────────┘
  Container A          Container B              Container C
  Network: public      Network: internal        Network: isolated
Network segmentation:
- Transport layer: public-facing (MCP stdio or HTTPS)
- Gateway: internal network only, reachable from transport
- Storage: isolated network, reachable only from gateway
Storage connection:
- Gateway connects to Oxigraph via SPARQL HTTP protocol (not in-process)
- Gateway uses a dedicated service account with restricted permissions
- Storage exposes only the SPARQL endpoint, no admin interface
Progressive deployment:
- Development: all in one process (current, fine for local use)
- Staging: gateway + storage in separate containers, same host
- Production: full network isolation per the diagram above
[memory.deployment]
mode = "integrated" # integrated | separated | isolated
[memory.deployment.storage]
endpoint = "http://oxigraph:7878/store" # when mode != integrated
service_account = "did:key:z6MkGateway"
[memory.deployment.gateway]
bind = "127.0.0.1:8086" # internal only
3. Content vs. Metadata Separation
Problem
Fact content may contain PII or regulated data. Fact metadata (provenance, confidence, timestamps) generally doesn't. These have different lifecycle requirements:
| Requirement | Content | Metadata/Provenance |
|---|---|---|
| GDPR erasure | Must support deletion | Must survive (audit trail) |
| Data residency | May require geographic constraints | Usually unconstrained |
| Encryption at rest | Required for PII | Optional |
| Retention period | Bounded (GDPR) | Extended (audit) |
| Right to access | Subject can request their data | Operator access only |
Solution: Split Storage with Tombstones
Store fact content and fact provenance in separate named graphs with independent lifecycle policies:
# Content graph: erasable
GRAPH <memory:content> {
    ex:fact-42 brain:content "Patient Schmidt has allergy to penicillin" .
    ex:fact-42 brain:topic "clinical" .
    ex:fact-42 brain:tags "allergy", "penicillin" .
}

# Provenance graph: append-only, survives content erasure
GRAPH <memory:provenance> {
    ex:prov-42 a prov:Activity ;
        prov:wasAssociatedWith did:key:agent-a ;
        prov:generatedAtTime "2026-04-18T10:00:00Z" ;
        brain:action "memory.learn" ;
        brain:factIri ex:fact-42 ;
        brain:topic "clinical" ;              # topic is metadata, not content
        brain:classification "confidential" ;
        brain:selfHash "sha256:abc..." .
}
On GDPR erasure request:
from datetime import datetime, timezone

def erase_fact(fact_iri: str, erasure_request_id: str) -> None:
    # 1. Delete content from the content graph
    delete_from_graph("memory:content", fact_iri)
    # 2. Write a tombstone to the provenance graph (append, never delete)
    write_tombstone("memory:provenance", {
        "fact_iri": fact_iri,
        "erased_at": datetime.now(timezone.utc).isoformat(),
        "erasure_request": erasure_request_id,
        "reason": "GDPR Art. 17: right to erasure",
    })
    # Content is gone. The provenance record remains:
    # "a clinical fact existed, was learned by agent-a at time T,
    #  and was erased under request #123"
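To make the lifecycle split concrete, here is a minimal in-memory sketch of the same idea: a dictionary stands in for the erasable content graph and a list for the append-only provenance graph. The names `content_graph`, `provenance_log`, `learn`, and `erase` are hypothetical and only model the behavior; they are not the gateway's storage API.

```python
from datetime import datetime, timezone
from typing import Dict, List

content_graph: Dict[str, str] = {}   # erasable: fact IRI -> content
provenance_log: List[dict] = []      # append-only: never mutated or deleted


def learn(fact_iri: str, content: str, agent: str) -> None:
    content_graph[fact_iri] = content
    # The provenance record deliberately holds no fact content.
    provenance_log.append(
        {"action": "memory.learn", "fact_iri": fact_iri, "agent": agent}
    )


def erase(fact_iri: str, erasure_request_id: str) -> None:
    # Remove the content; provenance only ever grows.
    content_graph.pop(fact_iri, None)
    provenance_log.append({
        "action": "tombstone",
        "fact_iri": fact_iri,
        "erased_at": datetime.now(timezone.utc).isoformat(),
        "erasure_request": erasure_request_id,
    })


learn("ex:fact-42", "Patient Schmidt has allergy to penicillin", "did:key:agent-a")
erase("ex:fact-42", "req-123")
print("ex:fact-42" in content_graph)          # False
print([r["action"] for r in provenance_log])  # ['memory.learn', 'tombstone']
```

After erasure the audit trail still shows that a fact existed and when it was removed, while nothing in the log reproduces the erased text.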
Content encryption:
[memory.content]
encryption = "aes-256-gcm" # none | aes-256-gcm
key_source = "env:MEMORY_CONTENT_KEY" # or "vault:trails/memory-key"
[memory.content.residency]
allowed_regions = ["eu-west"] # for geo-fenced deployments
The provenance graph is never encrypted (it contains no PII) and never deleted (audit requirements).
4. Inference vs. Ground Truth Separation
Problem
An agent reads a CVE database and stores: "mbedTLS 3.4.0 has CVE-2023-43615". Then it reasons: "The BLE export module is vulnerable because it uses mbedTLS 3.4.0". Both facts get the same confidence (0.95) and the same source attestation. But they have fundamentally different epistemic status:
- The CVE fact is ground truth — externally verifiable, authoritative source
- The vulnerability inference is a derived conclusion — depends on the agent's reasoning quality and whether the BLE module actually uses the affected mbedTLS code path
Solution: Epistemic Status Tagging
Every fact carries an epistemic_status field set by the gateway:
class EpistemicStatus:
    GROUND_TRUTH = "ground-truth"
    # Directly sourced from an authoritative external system.
    # Source attestation must be CONTENT_HASHED or SCITT_ANCHORED.
    # Examples: CVE from NVD, test result from CI, config from file.

    OBSERVATION = "observation"
    # Agent directly observed this (file content, error message, behavior).
    # Source attestation is TOOL_OBSERVED (the agent used a capability).
    # Examples: "file X contains function Y", "test Z failed with error W".

    INFERENCE = "inference"
    # Agent derived this from other facts through reasoning.
    # Must link to supporting facts via brain:Link (relationship: "supports").
    # Examples: "module X is vulnerable because it uses library Y".

    HYPOTHESIS = "hypothesis"
    # Agent's working theory, not yet validated.
    # May be promoted to INFERENCE after supporting evidence is found.
    # Integrates with ADR-0037 (Hypothesis-Driven Agents).

    CONSENSUS = "consensus"
    # Multiple agents or a human have confirmed this.
    # Requires N independent confirmations (configurable).
    # Highest trust level for internally derived facts.

    HEARSAY = "hearsay"
    # Received from a federated peer without independent verification.
    # Default status for all inbound federated facts.
Gateway rules for epistemic status assignment:
def _assign_epistemic_status(
    source_attestation: SourceAttestation,
    has_supporting_links: bool,
    federation_source: bool,
) -> EpistemicStatus:
    if federation_source:
        return EpistemicStatus.HEARSAY
    if source_attestation in (SourceAttestation.SCITT_ANCHORED,
                              SourceAttestation.CONTENT_HASHED):
        return EpistemicStatus.GROUND_TRUTH
    if source_attestation == SourceAttestation.TOOL_OBSERVED:
        return EpistemicStatus.OBSERVATION
    if has_supporting_links:
        return EpistemicStatus.INFERENCE
    # Self-reported reasoning without links also starts as an inference;
    # _validate_inference decides whether it is downgraded to HYPOTHESIS.
    return EpistemicStatus.INFERENCE
Recall ranking incorporates epistemic status:
EPISTEMIC_WEIGHT = {
    "ground-truth": 1.0,
    "consensus": 0.95,
    "observation": 0.85,
    "inference": 0.7,
    "hypothesis": 0.5,
    "hearsay": 0.3,
}

def recall_score(fact):
    return (
        fact.confidence
        * EPISTEMIC_WEIGHT[fact.epistemic_status]
        * recency_factor(fact.learned_at)
    )
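The ranking formula leaves `recency_factor` undefined. As a worked example, the sketch below assumes an exponential decay with a 30-day half-life; both the decay shape and the half-life are assumptions made for illustration, and the function signatures take explicit arguments instead of a fact object so the block runs on its own.

```python
from datetime import datetime, timedelta, timezone

EPISTEMIC_WEIGHT = {
    "ground-truth": 1.0,
    "consensus": 0.95,
    "observation": 0.85,
    "inference": 0.7,
    "hypothesis": 0.5,
    "hearsay": 0.3,
}


def recency_factor(learned_at: datetime, now: datetime,
                   half_life_days: float = 30.0) -> float:
    """Assumed decay: 1.0 for a brand-new fact, 0.5 after one half-life."""
    age_days = max((now - learned_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def recall_score(confidence: float, epistemic_status: str,
                 learned_at: datetime, now: datetime) -> float:
    return (
        confidence
        * EPISTEMIC_WEIGHT[epistemic_status]
        * recency_factor(learned_at, now)
    )


now = datetime(2026, 4, 18, tzinfo=timezone.utc)
month_ago = now - timedelta(days=30)

cve_score = recall_score(0.95, "ground-truth", now, now)     # about 0.95
inference_score = recall_score(0.95, "inference", now, now)  # about 0.665
old_cve_score = recall_score(0.95, "ground-truth", month_ago, now)  # about 0.475
```

With identical confidence, the CVE fact outranks the derived vulnerability claim purely because of its epistemic status, which is the separation this section is after.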
Inference provenance chain: When an agent stores an inference, it must link to the supporting facts:
# Agent learns an inference
inf = trails.invoke("memory.learn", {
    "content": "BLE export module is vulnerable via mbedTLS CVE-2023-43615",
    "topic": "security",
    "confidence_hint": 0.85,
})

# Agent links it to the supporting ground truth
trails.invoke("memory.link", {
    "from_iri": cve_fact_iri,            # ground truth
    "to_iri": inf["payload"]["iri"],     # inference
    "relationship": "supports",
})
If an inference has no supporting links, the gateway downgrades its epistemic status:
def _validate_inference(fact_iri, links):
    supporting = [link for link in links if link.relationship == "supports"]
    if not supporting:
        # No evidence → downgrade to hypothesis
        return EpistemicStatus.HYPOTHESIS
    return EpistemicStatus.INFERENCE
5. Federation Trust Boundaries
Problem
When two Trails instances exchange facts via federation, the receiving instance must decide how much to trust the remote facts. The current model (ADR-0023) gates access via Cedar policies and cost attribution, but doesn't address:
- What happens to a remote fact's confidence and attestation?
- Should trust be transitive? (A trusts B, B trusts C → does A trust C's facts?)
- How to handle conflicting facts from different instances?
- What if a remote instance's hash chain is invalid?
Solution: Inbound Federation Trust Model
5a. Inbound Quarantine
All facts received from a federated peer enter quarantine by default:
Remote Instance                    Local Instance
┌──────────────────┐               ┌──────────────────────────────┐
│ Shared facts     │──SERVICE──▶   │ Quarantine graph             │
│ (peer's memory)  │   query       │ (scope: "federation:pending")│
└──────────────────┘               │                              │
                                   │ After verification:          │
                                   │ ├── promote to "shared"      │
                                   │ ├── keep in quarantine       │
                                   │ └── reject (mark untrusted)  │
                                   └──────────────────────────────┘
[memory.federation.inbound]
default_action = "quarantine" # quarantine | accept | reject
auto_promote_after_days = 0 # 0 = manual only
auto_promote_min_attestation = "content-hashed" # minimum to auto-promote
[memory.federation.inbound.peer_overrides]
"did:web:tuvsud.com" = { default_action = "accept", trust_level = "established" }
5b. Trust Downgrade on Reception
Remote facts are received with degraded trust, regardless of the remote instance's internal ratings:
from copy import copy

def _receive_federated_fact(remote_fact, peer_did):
    local_fact = copy(remote_fact)
    # Epistemic status → hearsay (always)
    local_fact.epistemic_status = EpistemicStatus.HEARSAY
    # Confidence → scaled down by peer trust level
    peer_trust = resolve_peer_trust(peer_did)  # 0.0 - 1.0
    local_fact.confidence = remote_fact.confidence * peer_trust
    # Source attestation → downgraded
    # Remote "tool-observed" becomes local "self-reported"
    # (we didn't observe the tool invocation)
    local_fact.source_attestation = SourceAttestation.SELF_REPORTED
    # Attribution → preserved but tagged as remote
    local_fact.source = f"federation:{peer_did} (originally: {remote_fact.source})"
    return local_fact
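A self-contained version with concrete numbers may help show how strongly reception degrades a fact. The sketch below uses a hypothetical frozen `Fact` dataclass and passes the peer trust in directly instead of resolving it, so the remote record is never mutated; the field names follow the function above, everything else is an assumption for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fact:
    content: str
    confidence: float
    epistemic_status: str
    source_attestation: str
    source: str


def receive_federated_fact(remote: Fact, peer_did: str, peer_trust: float) -> Fact:
    """Return a downgraded local copy; the remote fact is left untouched."""
    if not 0.0 <= peer_trust <= 1.0:
        raise ValueError("peer_trust must be within [0.0, 1.0]")
    return replace(
        remote,
        epistemic_status="hearsay",
        confidence=remote.confidence * peer_trust,
        source_attestation="self-reported",
        source=f"federation:{peer_did} (originally: {remote.source})",
    )


remote = Fact(
    content="mbedTLS 3.4.0 has CVE-2023-43615",
    confidence=0.95,
    epistemic_status="ground-truth",
    source_attestation="content-hashed",
    source="nvd",
)
local = receive_federated_fact(remote, "did:web:peer.example", peer_trust=0.8)
```

Here a remote ground-truth fact at confidence 0.95 arrives at roughly 0.76, and once the hearsay weight of 0.3 from the recall ranking is applied it contributes about 0.228 before recency, so it cannot outrank comparable local knowledge without being promoted.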
5c. Trust Non-Transitivity
Trust is explicitly non-transitive by default:
Instance A trusts Instance B (peer_trust = 0.8)
Instance B trusts Instance C (peer_trust = 0.9)
Instance A does NOT automatically trust Instance C.
If A queries B and B's response includes facts originally from C, those facts are treated as B's facts (with B's peer trust applied). A doesn't know or care about C — B is the trust boundary.
To enable controlled transitivity:
[memory.federation.trust]
transitive = false # default: trust is NOT transitive
max_hop_depth = 1 # even if transitive, max 1 hop
transitive_decay = 0.5 # each hop halves the trust score
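One way to read these three settings is as a formula for the trust an instance derives in a peer it has no direct relationship with. The sketch below is an interpretation, not the specified algorithm: it assumes a direct peer counts as zero hops, each intermediary adds one hop, trust multiplies along the path, and `transitive_decay` is applied once per hop. The function name `derived_trust` is hypothetical.

```python
from typing import Sequence


def derived_trust(
    path_trust: Sequence[float],
    transitive: bool = False,
    max_hop_depth: int = 1,
    transitive_decay: float = 0.5,
) -> float:
    """Trust derived along a path, e.g. [A→B, B→C] for A's view of C.

    Returns 0.0 when no trust can be derived.
    """
    if not path_trust:
        return 0.0
    hops = len(path_trust) - 1  # assumption: a direct peer is zero hops away
    if hops == 0:
        return path_trust[0]
    if not transitive or hops > max_hop_depth:
        return 0.0
    trust = 1.0
    for step in path_trust:
        trust *= step
    return trust * transitive_decay ** hops


direct = derived_trust([0.8])                            # 0.8
default_indirect = derived_trust([0.8, 0.9])             # 0.0 (non-transitive)
opted_in = derived_trust([0.8, 0.9], transitive=True)    # about 0.36
```

This covers only derived trust in C as a peer. Facts that B relays from C are still handled as B's facts with B's peer trust, as described above.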
5d. Hash Chain Verification on Peer Handshake
Before accepting facts from a federated peer, the gateway verifies the remote provenance hash chain:
def _verify_peer_chain(peer_did, facts):
    # Request the peer's hash chain segment covering these facts
    chain = federation_client.get_provenance_chain(
        peer_did,
        fact_iris=[f.iri for f in facts],
    )
    # Verify chain integrity: each record hashes over its predecessor
    for i, record in enumerate(chain):
        prev_hash = chain[i - 1].self_hash if i > 0 else GENESIS
        expected_hash = compute_hash(record, prev_hash)
        if record.self_hash != expected_hash:
            flag_chain_break(peer_did, record)
            quarantine_all_facts_from(peer_did, after=record.timestamp)
            return VerificationResult.CHAIN_BROKEN
    return VerificationResult.VERIFIED
On chain break detection:
1. All facts from the peer after the break point are quarantined
2. An audit event is emitted
3. The peer's trust level is reduced
4. The operator is alerted
5. No automatic recovery; human investigation is required
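The verification loop depends on `compute_hash` and `GENESIS`, neither of which is defined in this ADR. The sketch below shows one plausible construction using SHA-256 over canonical JSON plus the previous hash. The canonicalization, the all-zero genesis value, and the record layout are assumptions for illustration; the hash format actually used is defined by the write-path ADR, not here.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "sha256:" + "0" * 64  # assumed sentinel preceding the first record


def compute_hash(payload: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, no whitespace) keeps the hash reproducible.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + "\n" + body).encode("utf-8")).hexdigest()
    return "sha256:" + digest


def build_chain(payloads: List[dict]) -> List[dict]:
    chain, prev = [], GENESIS
    for payload in payloads:
        self_hash = compute_hash(payload, prev)
        chain.append({"payload": payload, "self_hash": self_hash})
        prev = self_hash
    return chain


def first_break(chain: List[dict]) -> Optional[int]:
    """Return the index of the first record that fails verification, if any."""
    prev = GENESIS
    for i, record in enumerate(chain):
        if record["self_hash"] != compute_hash(record["payload"], prev):
            return i
        prev = record["self_hash"]
    return None


chain = build_chain([{"fact": "ex:fact-1"}, {"fact": "ex:fact-2"}, {"fact": "ex:fact-3"}])
print(first_break(chain))                  # None
chain[1]["payload"]["fact"] = "ex:forged"  # tamper with the middle record
print(first_break(chain))                  # 1
```

Because every record commits to its predecessor's hash, a peer cannot rewrite an earlier record without also recomputing every later one, which is what makes the break point a usable quarantine boundary.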
5e. Conflict Resolution for Cross-Instance Facts
When local and remote facts contradict:
class ConflictStrategy:
    LOCAL_WINS = "local-wins"        # keep local, quarantine remote
    HIGHEST_CONFIDENCE = "highest"   # keep the one with higher effective confidence
    BOTH_VISIBLE = "both"            # keep both, link with "contradicts"
    HUMAN_REVIEW = "human-review"    # quarantine both, flag for review
[memory.federation.conflicts]
default_strategy = "both" # preserve both perspectives
security_topic_strategy = "human-review" # security conflicts need human eyes
When the "both" strategy is used, the gateway automatically creates a "contradicts" link:
def _handle_conflict(local_fact, remote_fact):
    trails.invoke("memory.link", {
        "from_iri": local_fact.iri,
        "to_iri": remote_fact.iri,
        "relationship": "contradicts",
    })
    # Both facts remain visible; recall surfaces the conflict
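The four strategies and the per-topic override can be sketched as a small dispatch function. This is a hypothetical outline: facts are plain dictionaries with "iri" and "confidence" keys, the losing fact under "highest" is assumed to be quarantined, and ties are assumed to favor the local fact. None of those three details is fixed by this ADR.

```python
from typing import Dict, List


def pick_strategy(
    topic: str,
    default_strategy: str = "both",
    security_topic_strategy: str = "human-review",
) -> str:
    """Security-topic conflicts use their own strategy; others use the default."""
    return security_topic_strategy if topic == "security" else default_strategy


def resolve_conflict(local: dict, remote: dict, strategy: str) -> Dict[str, List[str]]:
    """Return which fact IRIs stay visible and which are quarantined."""
    if strategy == "local-wins":
        return {"visible": [local["iri"]], "quarantined": [remote["iri"]]}
    if strategy == "highest":
        # Assumption: ties go to the local fact; the loser is quarantined.
        if local["confidence"] >= remote["confidence"]:
            winner, loser = local, remote
        else:
            winner, loser = remote, local
        return {"visible": [winner["iri"]], "quarantined": [loser["iri"]]}
    if strategy == "both":
        return {"visible": [local["iri"], remote["iri"]], "quarantined": []}
    if strategy == "human-review":
        return {"visible": [], "quarantined": [local["iri"], remote["iri"]]}
    raise ValueError(f"unknown conflict strategy: {strategy}")
```

Raising on an unknown strategy keeps a typo in the configuration from silently falling through to a permissive default.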
6. Separation Summary
┌────────────────────────────────────────────────────────────────┐
│                       Trust Boundary Map                       │
│                                                                │
│  ┌─────────┐ ┌──────────────┐ ┌────────────┐  ┌──────────┐     │
│  │ Agent   │→│ Gateway      │→│ Content    │  │Provenance│     │
│  │ (write) │ │ (enforce)    │ │ Graph      │  │Graph     │     │
│  │         │ │              │ │ (erasable) │  │(append-  │     │
│  │ ADR-0052│ │ ADR-0052     │ │            │  │ only)    │     │
│  └─────────┘ └──────────────┘ │ §3 this    │  │§3 this   │     │
│                               │ ADR        │  │ADR       │     │
│  ┌─────────┐ ┌──────────────┐ └────────────┘  └──────────┘     │
│  │ Agent   │←│ Gateway      │                                  │
│  │ (read)  │ │ (filter)     │  §1 this ADR: read policies,     │
│  │         │ │              │  classification, LLM leakage     │
│  └─────────┘ └──────────────┘                                  │
│                                                                │
│  ┌─────────┐ ┌──────────────┐ ┌────────────────────────┐       │
│  │ Remote  │→│ Federation   │→│ Quarantine Graph       │       │
│  │ Peer    │ │ Gateway      │ │ (pending verification) │       │
│  │         │ │ (verify +    │ │                        │       │
│  │         │ │  downgrade)  │ │ §5 this ADR            │       │
│  └─────────┘ └──────────────┘ └────────────────────────┘       │
│                                                                │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ Epistemic Layer: ground-truth > consensus >            │    │
│  │ observation > inference > hypothesis > hearsay         │    │
│  │                                            §4 this ADR │    │
│  └────────────────────────────────────────────────────────┘    │
└────────────────────────────────────────────────────────────────┘
7. Configuration Overview
# §1 Read isolation
[memory.classification]
default = "internal"
[memory.classification.rules]
security = "security"
vulnerability = "security"
pii = "confidential"
clinical = "confidential"
[memory.classification.leakage_policy]
security = "deny"
confidential = "redact"
restricted = "summarize"
# §2 Process separation
[memory.deployment]
mode = "integrated" # integrated | separated | isolated
# §3 Content vs. metadata
[memory.content]
encryption = "none" # none | aes-256-gcm
separate_graphs = false # true enables split content/provenance graphs
[memory.content.erasure]
support_gdpr_erasure = false
tombstone_retention_years = 10
# §4 Epistemic status
[memory.epistemic]
require_supporting_links_for_inference = false # true = inference without links → hypothesis
consensus_min_confirmations = 2
# §5 Federation trust
[memory.federation.inbound]
default_action = "quarantine"
auto_promote_after_days = 0
[memory.federation.trust]
transitive = false
max_hop_depth = 1
[memory.federation.conflicts]
default_strategy = "both"
Non-goals
- Content moderation — determining whether fact content is offensive, misleading, or inappropriate. The memory system tracks provenance and trust, not content quality.
- Formal information flow control (IFC) — taint tracking through LLM inference chains. This is an open research problem; we provide classification and read policies as pragmatic approximations.
- Distributed consensus — no Raft/PBFT between memory instances. Federation is eventually consistent with explicit trust boundaries.
- Key management — content encryption keys are sourced from environment variables or external vaults. Key rotation, HSM integration, and key escrow are out of scope.
Consequences
Positive
- Defense in depth — five independent trust boundaries, each enforceable regardless of the others
- GDPR-ready — content/metadata split enables right-to-erasure while preserving audit trail
- Epistemic clarity — consumers know whether a fact is ground truth, inference, or hearsay
- Federation safety — remote facts cannot silently override local knowledge
- Progressive — every feature defaults to OFF; production enables what it needs
Negative
- Complexity — five separation concerns with independent configuration. Mitigation: sensible defaults mean most users never touch these settings.
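- Performance: classification checks on every recall add overhead. Mitigation: Cedar policy evaluation is cheap next to the graph query, which remains the real bottleneck. Its cost grows with the number of policies that apply to a request rather than being strictly constant, so policy sets should stay small.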
- Cold start for federation — new peers start in quarantine with degraded trust. Mitigation: peer trust overrides in config for known partners.
Neutral
- The Phase 1 example app (ADR-0051) continues to work unchanged — all features default to OFF.
- These boundaries are enforced by the gateway, not by the ORM. Agents using trails.invoke() directly see the same API; the gateway intercepts transparently.
Dependencies
| ADR | Relationship |
|---|---|
| ADR-0006 | Cedar policies for read isolation and correction policies |
| ADR-0009 | PROV-O provenance graph as the append-only audit layer |
| ADR-0011 | DIDs for agent identity and federation peer identity |
| ADR-0023 | Federation protocol for cross-instance fact exchange |
| ADR-0037 | Hypothesis agent lifecycle integrates with epistemic status |
| ADR-0051 | Agent Memory — the capability surface these boundaries protect |
| ADR-0052 | Write-path security — this ADR extends to read path, storage, federation |
Open questions
- Q: Should classification be set by the agent or derived from topic rules? Recommendation: Derived from topic rules (agent cannot self-classify). Override via Cedar policy for edge cases.
- Q: Should content encryption be per-fact or per-graph? Recommendation: Per-graph (all facts in the content graph share the same key). Per-fact encryption is too expensive for the common case.
- Q: How should epistemic status interact with compaction? Recommendation: Ground truth facts should have staleness: "stable" by default (never auto-pruned). Inferences and hypotheses use the normal staleness policy.
- Q: Should the quarantine graph be visible to memory.status? Recommendation: Yes. memory.status should report quarantine counts so operators know when facts need review.