ADR-0038: Explainable Provenance - Citation Graphs and Confidence Propagation
- Status: Accepted (2026-04-19)
- Date: 2026-04-18
Context
Trails records comprehensive PROV-O provenance for every capability
invocation (ADR-0009): activities, agents, derivation chains, and
timestamps all land in the trails:prov named graph. But the raw
provenance triples are machine-readable, not human-readable. When a user
asks "why did the system produce this answer?", the framework has no way
to surface a coherent explanation without manual SPARQL spelunking.
Concrete gaps:
- No citation trail. An agent answer references KG nodes internally, but there is no structured link from the answer back to the specific nodes that informed it. Users cannot audit which data drove a conclusion.
- No confidence propagation. If a source node carries a confidence score of 0.8, conclusions derived from it should inherit bounded confidence — but today provenance records derivation without quantifying trust attenuation.
- No counterfactual reasoning. "What would change if we removed this source?" is unanswerable without traversing the full derivation graph by hand.
- No human-readable explanations. PROV-O triples are precise but opaque. Developers and end-users need natural-language or structured summaries: "This answer was produced by capability X, which read nodes Y and Z."
These gaps are blockers for regulated industries (GDPR right to explanation, EU AI Act transparency requirements) and for any deployment where agent outputs require audit-grade justification.
Decision
1. trails.explain module
A new top-level module providing explainability primitives that operate on the existing PROV-O graph. All operations are pure SPARQL queries over the kernel store — no AI/LLM dependency.
Core data structures:
from dataclasses import dataclass, field


@dataclass
class Citation:
    node_iri: str
    node_type: str
    field: str
    value: str
    relevance: float  # 0.0–1.0
    accessed_by: str  # capability or SPARQL that read it


@dataclass
class ReasoningStep:
    step_number: int
    action: str
    observation: str
    cited_nodes: list[str]
    confidence_contribution: float


@dataclass
class Explanation:
    answer: str
    citations: list[Citation]
    confidence: float
    reasoning_chain: list[ReasoningStep]
    provenance_iris: list[str]
Core functions:
- explain_result(store, activity_iri): trace back through the PROV-O graph from an activity and produce an Explanation. Walks prov:used, prov:wasInformedBy, and prov:wasDerivedFrom to find all source nodes.
- citation_graph(store, activity_iri): return a mapping of {node_iri: [activities that used it]}. This is the "which data informed this conclusion?" graph.
- propagate_confidence(store, node_iri, base_confidence=1.0): walk the derivation chain from a node, computing confidence as the product of source confidences. Returns {node_iri: confidence} for the whole chain. Confidence is always clamped to [0.0, 1.0].
- counterfactual(store, remove_iri, activity_iri): estimate what would change if a source node were removed. Returns a list of affected activities and their dependency paths.
- format_explanation(explanation, format="text"|"markdown"|"json"): render an Explanation in different output formats.
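As a rough illustration of the citation-graph mapping described above, here is a minimal sketch that works on an in-memory list of triples rather than the kernel store. The tuple representation, the prefixed names, and the example IRIs are assumptions made for this example only; the ADR specifies SPARQL queries over the store, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict

# Assumption: provenance is given as (subject, predicate, object) tuples with
# prefixed names. This is a stand-in for the trails:prov named graph.
Triple = tuple[str, str, str]


def citation_graph(triples: list[Triple], activity_iri: str) -> dict[str, list[str]]:
    """Map each node read by the activity, or by any activity that informed
    it, to the list of activities that used that node."""
    used = defaultdict(list)         # activity -> nodes it read
    informed_by = defaultdict(list)  # activity -> upstream activities
    for s, p, o in triples:
        if p == "prov:used":
            used[s].append(o)
        elif p == "prov:wasInformedBy":
            informed_by[s].append(o)

    graph = defaultdict(list)
    seen = set()
    stack = [activity_iri]
    while stack:
        activity = stack.pop()
        if activity in seen:  # guard against cycles and repeated visits
            continue
        seen.add(activity)
        for node in used[activity]:
            graph[node].append(activity)
        stack.extend(informed_by[activity])
    return dict(graph)


# Hypothetical example data: an answer activity informed by a lookup step.
example_triples = [
    ("act:answer", "prov:wasInformedBy", "act:lookup"),
    ("act:answer", "prov:used", "kg:node/Y"),
    ("act:lookup", "prov:used", "kg:node/Y"),
    ("act:lookup", "prov:used", "kg:node/Z"),
]
example_graph = citation_graph(example_triples, "act:answer")
# → {"kg:node/Y": ["act:answer", "act:lookup"], "kg:node/Z": ["act:lookup"]}
```

The result reads directly as an audit answer: node Y informed both activities, node Z only the lookup step.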
2. Provenance graph traversal strategy
Explanations are derived entirely from existing PROV-O triples:
- prov:used: direct data dependencies of an activity
- prov:wasInformedBy: inter-activity information flow (agent step chains, planner sub-activities)
- prov:wasDerivedFrom: entity-level derivation (output X was derived from input Y)
- prov:wasAssociatedWith: which principal/agent performed the activity
- prov:wasGeneratedBy: which activity produced an entity
The traversal is bounded: a configurable max_depth (default 10)
prevents runaway walks on deeply nested provenance chains. Cycles are
detected and broken.
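The bounded walk with cycle detection might look roughly like the following breadth-first sketch. It again assumes an in-memory list of (subject, predicate, object) tuples; the function name walk_provenance and the choice of which predicates to follow are illustrative, not part of the ADR.

```python
from collections import deque

# Assumption: these are the edges followed when walking toward sources.
TRAVERSAL_PREDICATES = {"prov:used", "prov:wasInformedBy", "prov:wasDerivedFrom"}


def walk_provenance(triples, start_iri, max_depth=10):
    """Breadth-first walk from start_iri.

    Returns {iri: depth} for every IRI reachable within max_depth hops.
    """
    edges = {}
    for s, p, o in triples:
        if p in TRAVERSAL_PREDICATES:
            edges.setdefault(s, []).append(o)

    depths = {start_iri: 0}
    queue = deque([start_iri])
    while queue:
        current = queue.popleft()
        depth = depths[current]
        if depth >= max_depth:
            continue  # depth bound reached: do not expand this node
        for nxt in edges.get(current, []):
            if nxt not in depths:  # already visited: breaks cycles
                depths[nxt] = depth + 1
                queue.append(nxt)
    return depths


# Hypothetical cyclic chain: a -> b -> c -> a
cyclic = [
    ("ex:a", "prov:wasDerivedFrom", "ex:b"),
    ("ex:b", "prov:wasDerivedFrom", "ex:c"),
    ("ex:c", "prov:wasDerivedFrom", "ex:a"),
]
full_walk = walk_provenance(cyclic, "ex:a")               # → {"ex:a": 0, "ex:b": 1, "ex:c": 2}
short_walk = walk_provenance(cyclic, "ex:a", max_depth=1)  # → {"ex:a": 0, "ex:b": 1}
```

Recording the depth per node doubles as the visited set, so the cycle check and the depth bound share one data structure.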
3. Confidence propagation model
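Confidence propagates multiplicatively along derivation chains: the
confidence of a derived node is the product of the confidences of the
sources it was derived from.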
When a node has no explicit confidence annotation, 1.0 is assumed
(full trust). The product is clamped to [0.0, 1.0]. This is a
conservative model: derived conclusions are never more confident than
their least-confident source.
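One possible reading of this model is sketched below. It is an illustration under stated assumptions, not the module's implementation: derivations and annotations are passed as plain dicts instead of being read from the store, the function name is hypothetical, and a node's propagated confidence is taken to be its own annotation (1.0 if absent) times the propagated confidence of each of its sources.

```python
import math


def propagate_confidence_sketch(derived_from, annotations, node_iri):
    """Return {iri: propagated confidence} for node_iri and everything upstream.

    derived_from: {iri: [source iris]}   (assumed stand-in for prov:wasDerivedFrom)
    annotations:  {iri: explicit confidence}; missing entries default to 1.0
    """
    result = {}

    def visit(node, path):
        if node in result:
            return result[node]
        if node in path:
            return 1.0  # cycle: treat the back-edge as neutral to break it
        own = annotations.get(node, 1.0)
        sources = [visit(src, path | {node}) for src in derived_from.get(node, [])]
        value = min(1.0, max(0.0, own * math.prod(sources)))  # clamp to [0.0, 1.0]
        result[node] = value
        return value

    visit(node_iri, frozenset())
    return result


# Hypothetical chain: answer <- summary <- (src1 at 0.8, src2 at 0.5)
derivations = {"kg:answer": ["kg:summary"], "kg:summary": ["kg:src1", "kg:src2"]}
explicit = {"kg:src1": 0.8, "kg:src2": 0.5}
scores = propagate_confidence_sketch(derivations, explicit, "kg:answer")
# scores["kg:summary"] and scores["kg:answer"] are both 0.8 * 0.5 = 0.4
```

In this example the answer ends at 0.4, below its least-confident source (0.5), which matches the conservative property stated above. Note that under this reading a source reachable through two derivation paths is multiplied in twice; whether that is desired is left open here.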
Confidence annotations are stored as triples:
4. CLI integration
trails explain <activity_iri> # human-readable explanation
trails explain --citations <iri> # citation graph
trails explain --confidence <node_iri> # confidence propagation
trails explain --counterfactual <remove_iri> <activity_iri>
trails explain --format json <iri> # output as JSON
The explain subcommand is registered in the lazy CLI loader alongside
existing commands like prov and cred.
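To make the flag layout above concrete, here is a sketch of how the subcommand's arguments could be parsed. It uses the standard-library argparse purely for illustration; the ADR does not say which CLI library the lazy loader uses, and the function name, help strings, and the choice to model --citations and --confidence as boolean switches are assumptions.

```python
import argparse


def build_explain_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `trails explain` invocations listed above."""
    parser = argparse.ArgumentParser(prog="trails explain")
    parser.add_argument("iri", help="activity or node IRI to explain")

    # The three alternative modes are mutually exclusive; with none given,
    # the default is the human-readable explanation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--citations", action="store_true",
                      help="print the citation graph for the activity")
    mode.add_argument("--confidence", action="store_true",
                      help="print confidence propagation for the node")
    mode.add_argument("--counterfactual", metavar="REMOVE_IRI",
                      help="source node to hypothetically remove")

    parser.add_argument("--format", choices=["text", "markdown", "json"],
                        default="text", help="output format")
    return parser


# Example: trails explain --counterfactual <remove_iri> <activity_iri>
cf_args = build_explain_parser().parse_args(
    ["--counterfactual", "kg:src1", "act:answer"]
)
# cf_args.counterfactual == "kg:src1", cf_args.iri == "act:answer"
```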
5. Integration with the agent runtime
Planners (ReAct, Plan-and-Execute, Reflexion) already record
prov:wasInformedBy links between step activities. The explain module
leverages these existing links — no changes to planner code are needed.
Future enhancement (not in scope for this ADR): planners could
automatically collect citations during tool calls by annotating
prov:used edges with the specific KG node IRIs read during each step.
Non-goals
- Not an AI explainability tool. This module explains provenance chains, not neural network internals. It does not generate natural-language explanations via LLM — it formats structured provenance data.
- No causal inference. Counterfactual queries identify dependency paths, not causal mechanisms. "What would change" means "which activities would lose a dependency", not "what would the output have been."
- No real-time explanation streaming. Explanations are computed post-hoc from the provenance graph, not during capability execution.
- No UI component. The explain module provides data structures and CLI output. Dashboard integration (ADR-0019) is a separate concern.
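The dependency-path reading of counterfactuals described in the non-goals can be sketched as a path enumeration: list every path from the activity to the node being removed, and treat whatever lies on those paths as affected. This is an illustrative sketch over in-memory triples; the function name, the tuple representation, and the example IRIs are assumptions, and it makes no claim about what the output would have been without the source.

```python
# Assumption: the same three predicates are followed as in the traversal strategy.
DEPENDENCY_PREDICATES = {"prov:used", "prov:wasInformedBy", "prov:wasDerivedFrom"}


def dependency_paths(triples, remove_iri, activity_iri, max_depth=10):
    """Return every simple path from activity_iri to remove_iri, as lists of IRIs."""
    edges = {}
    for s, p, o in triples:
        if p in DEPENDENCY_PREDICATES:
            edges.setdefault(s, []).append(o)

    paths = []

    def dfs(node, path):
        if node == remove_iri:
            paths.append(path)
            return
        if len(path) - 1 >= max_depth:  # path holds depth + 1 nodes
            return
        for nxt in edges.get(node, []):
            if nxt not in path:  # never revisit a node on the current path
                dfs(nxt, path + [nxt])

    dfs(activity_iri, [activity_iri])
    return paths


# Hypothetical data: the answer reads Y directly and also via a lookup step.
prov_triples = [
    ("act:answer", "prov:wasInformedBy", "act:lookup"),
    ("act:answer", "prov:used", "kg:node/Y"),
    ("act:lookup", "prov:used", "kg:node/Y"),
    ("act:lookup", "prov:used", "kg:node/Z"),
]
paths_to_y = dependency_paths(prov_triples, "kg:node/Y", "act:answer")
# → [["act:answer", "act:lookup", "kg:node/Y"], ["act:answer", "kg:node/Y"]]

# Everything on a path except the removed node would lose a dependency.
affected = {iri for path in paths_to_y for iri in path[:-1]}
# → {"act:answer", "act:lookup"}
```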
Dependencies
| ADR | Relationship |
|---|---|
| ADR-0009 (PROV-O) | Source of all provenance triples; explain reads from the prov graph |
| ADR-0017 (Context) | ctx.kg provides store access for provenance queries |
| ADR-0021 (Progressive Enhancement) | Explain works with any provenance level — from bare capabilities to full agent plans |
| ADR-0030 (Verifiable Credentials) | Future: AuditTrailCredential could package an Explanation as a VC |
Consequences
Positive
- Audit-grade transparency. Every agent output can be traced to its source data with a single function call — critical for GDPR, EU AI Act, and regulated industries.
- Zero LLM dependency. Pure SPARQL traversal over existing PROV-O triples. No model inference cost, latency, or risk.
- Composable with existing primitives. Works with any capability that records provenance — no changes to existing handlers required.
- Bounded confidence model. Simple multiplicative propagation is easy to reason about and audit. More sophisticated models can be layered on top.
- Counterfactual analysis. "What if we removed this source?" is a unique capability for a KG framework — enables data quality impact assessment.
Negative
- SPARQL query cost. Deep provenance chains require recursive graph traversal. Mitigated by: the max_depth bound and result caching for repeated queries on the same activity.
- Confidence model simplicity. Multiplicative propagation is lossy: it cannot express "this source contradicts that source" or "confidence increases with corroboration." Mitigated by: the model is a floor, not a ceiling; richer models can override propagate_confidence.
- Explanation completeness depends on provenance quality. If a capability does not record prov:used for its data reads, the explanation will be incomplete. Mitigated by: clear docs on provenance best practices; future automatic citation collection in planners.
Revisit conditions
- If a standard explanation ontology emerges (beyond PROV-O), consider alignment or adoption.
- If LLM-generated natural-language explanations are needed, add an optional explain_natural(ctx, activity_iri) that calls ctx.llm with the structured explanation as context.
- If the multiplicative confidence model proves insufficient, design a pluggable confidence propagation strategy.