ADR-0036: Semantic Diff — Git-style Change Tracking for Knowledge Graphs
- Status: Accepted (2026-04-19)
- Date: 2026-04-18
Context
Knowledge graphs evolve over time — schemas are refined, data is ingested, agents modify triples. Today, the only way to inspect what changed in a Trails KG is to dump the entire store before and after a mutation and manually compare. This is:
- Unscalable. A store with millions of triples cannot be diffed by eyeballing two N-Triples files.
- Non-replayable. There is no way to generate a SPARQL UPDATE that replays a set of changes, making migration rollbacks and auditing painful.
- CI-unfriendly. Automated pipelines that validate "this migration added exactly N triples and removed none" have no structured output to assert against.
Mainstream SQL migration tools (Alembic, Flyway, Liquibase) ship a diff surface. No KG framework we are aware of ships an equivalent. The closest is RDF Delta (built on Apache Jena), which centers on a patch format (RDF Patch) and change replication rather than a user-facing diff command.
Use cases:
- Schema migration validation (M17). Before/after a migration: what triples changed? Did the migration accidentally drop data?
- CI assertions. Assert that a capability invocation added exactly the expected triples and nothing else.
- Audit trail enhancement. PROV-O records who did what; semantic diff records the exact delta in a replayable format.
- Debugging. "Something broke after the last ingest" — diff the store against yesterday's snapshot to see exactly what changed.
- Named-graph isolation. Diff only the prov:graph, or only the application data graph, or only the trails:credentials graph.
Decision
1. trails.kg_diff module
A new module providing a KGDiff dataclass and functions to compute and
format differences between two KG states. No AI/LLM dependency.
2. KGDiff dataclass
    import dataclasses

    @dataclasses.dataclass
    class KGDiff:
        added: list[tuple[str, str, str]]    # (s, p, o) triples present in B but not A
        removed: list[tuple[str, str, str]]  # (s, p, o) triples present in A but not B
        modified: list[dict[str, str]]       # {iri, field, old, new}: same s+p, different o
        summary: dict[str, int]              # {added: N, removed: N, modified: N, unchanged: N}
modified is a convenience view: when store A holds (s, p, old_o) and
store B holds (s, p, new_o), with the same subject and predicate but a
different object, the change appears both in added/removed (as raw
deltas) and in modified (as a semantic "this field changed" view). This
mirrors how git diff shows both the raw hunk and the file-level
summary.
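The relationship between the raw deltas and the modified view can be sketched as follows. This is an illustrative sketch, not the module's implementation: the helper name modified_view is hypothetical, and the choice to pair only unambiguous single-valued changes (one old object, one new object per subject and predicate) is an assumption the ADR does not spell out.

```python
from collections import defaultdict

Triple = tuple[str, str, str]


def modified_view(added: list[Triple], removed: list[Triple]) -> list[dict[str, str]]:
    """Pair removed/added triples that share subject and predicate (hypothetical helper)."""
    old_by_key: dict[tuple[str, str], list[str]] = defaultdict(list)
    new_by_key: dict[tuple[str, str], list[str]] = defaultdict(list)
    for s, p, o in removed:
        old_by_key[(s, p)].append(o)
    for s, p, o in added:
        new_by_key[(s, p)].append(o)

    modified = []
    for (s, p), old_objects in old_by_key.items():
        new_objects = new_by_key.get((s, p), [])
        # Assumption: only pair the unambiguous case. Multi-valued predicates
        # stay visible as plain added/removed entries.
        if len(old_objects) == 1 and len(new_objects) == 1:
            modified.append(
                {"iri": s, "field": p, "old": old_objects[0], "new": new_objects[0]}
            )
    return modified
```

The raw added and removed lists are left untouched, so the view only adds information and never hides a delta.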
3. Core functions
- diff_stores(store_a, store_b, *, graph: str | None = None) -> KGDiff
  Compare two store instances. Uses SPARQL to enumerate triples. When graph is set, restricts the comparison to that named graph.
- diff_snapshots(snapshot_a: str, snapshot_b: str) -> KGDiff
  Compare two N-Triples serializations (strings). Parses line by line; no store needed.
- format_diff(diff: KGDiff, format: str = "table") -> str
  Render the diff in one of three formats:
  - "table": human-readable with +/- markers (like git diff).
  - "sparql": DELETE DATA { ... } ; INSERT DATA { ... } statements that replay the changes.
  - "json": machine-readable {"added": [...], "removed": [...], ...} for CI.
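As an illustration of the snapshot comparison, here is a minimal sketch of a line-by-line N-Triples diff built on set difference. It is not the module's code: the names parse_ntriples and diff_snapshots_sketch are hypothetical, the result type is a trimmed stand-in for KGDiff (the modified view is omitted), and the parser assumes one statement per line with no trailing comments and stable blank node labels.

```python
import dataclasses

Triple = tuple[str, str, str]


@dataclasses.dataclass
class SnapshotDelta:
    """Trimmed stand-in for KGDiff; the modified view is omitted here."""
    added: list[Triple]
    removed: list[Triple]
    summary: dict[str, int]


def parse_ntriples(text: str) -> set[Triple]:
    """Parse N-Triples line by line into (s, p, o) term strings."""
    triples: set[Triple] = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if not line.endswith("."):
            raise ValueError(f"not an N-Triples statement: {raw!r}")
        # Subject and predicate never contain whitespace, so two splits
        # isolate them and leave the object (which may contain spaces) intact.
        parts = line[:-1].rstrip().split(None, 2)
        if len(parts) != 3:
            raise ValueError(f"expected subject, predicate, object: {raw!r}")
        triples.add((parts[0], parts[1], parts[2]))
    return triples


def diff_snapshots_sketch(snapshot_a: str, snapshot_b: str) -> SnapshotDelta:
    a, b = parse_ntriples(snapshot_a), parse_ntriples(snapshot_b)
    added, removed = sorted(b - a), sorted(a - b)
    summary = {"added": len(added), "removed": len(removed), "unchanged": len(a & b)}
    return SnapshotDelta(added=added, removed=removed, summary=summary)
```

Sorting the two deltas keeps the output deterministic, which matters when the result is committed or asserted against in CI.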
4. CLI integration
New subcommands on trails kg:
    trails kg diff --snapshot <file>            # compare current store against a saved .nt file
    trails kg diff --format table|sparql|json   # output format (default: table)
    trails kg diff --graph <iri>                # restrict to a named graph
    trails kg snapshot [--output <file>]        # save current state as .nt for later comparison
trails kg snapshot serializes the current store (or a named graph) as
N-Triples to stdout or --output. This file can be committed to version
control or stored alongside migration scripts for later diffing.
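A CI job could assert on the JSON output of trails kg diff along the lines below. This is a sketch under assumptions: the helper assert_delta is hypothetical, and it assumes each entry in "added" and "removed" is a three-element [s, p, o] array, which the ADR does not specify.

```python
from __future__ import annotations

import json


def assert_delta(diff_json: str, *, expected_added: int, expected_removed: int = 0) -> dict:
    """Fail if the JSON diff does not contain the expected number of changes."""
    delta = json.loads(diff_json)
    added = delta.get("added", [])
    removed = delta.get("removed", [])
    if len(added) != expected_added:
        raise AssertionError(f"expected {expected_added} added triples, got {len(added)}")
    if len(removed) != expected_removed:
        raise AssertionError(f"expected {expected_removed} removed triples, got {len(removed)}")
    return delta


# Stand-in for output captured from the CLI in JSON format.
captured = json.dumps(
    {"added": [["<http://ex/a>", "<http://ex/p>", '"v"']], "removed": []}
)
assert_delta(captured, expected_added=1)  # passes: one triple added, none removed
```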
5. Named-graph awareness
Diff operations can be scoped to a single named graph:
- diff_stores accepts an optional graph parameter. When it is set, the SPARQL enumeration is restricted to GRAPH <iri> { ?s ?p ?o }.
- diff_snapshots takes no graph parameter and assumes both snapshots represent the same graph scope (the caller is responsible for exporting a single graph).
This allows per-graph diffs: diff only the provenance graph, only the credentials graph, or only the application data graph.
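The graph-scoped enumeration might be built like this. The function name enumeration_query and the IRI validation are illustrative assumptions; whether the unscoped query sees only the default graph or the union of all graphs depends on the store.

```python
from __future__ import annotations


def enumeration_query(graph: str | None = None) -> str:
    """Build the SELECT that lists triples, optionally scoped to one named graph."""
    if graph is None:
        return "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
    # Reject characters that cannot appear inside a SPARQL IRI reference,
    # so the graph IRI cannot break out of the <...> delimiters.
    if any(ch in '<>"{}|^`\\' or ch.isspace() for ch in graph):
        raise ValueError(f"invalid graph IRI: {graph!r}")
    return f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }}"
```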
6. Integration with schema migrations (M17)
The migration runner (future ADR) will call diff_stores before and
after applying a migration, storing the KGDiff as part of the
migration record. This enables:
- Dry-run validation: "this migration would add 42 triples, remove 3".
- Rollback generation: the sparql format output, computed with the two states swapped (after vs. before), is a replayable inverse of the migration.
- Audit: every migration has a machine-readable delta attached.
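Replay and rollback scripts can be sketched as the same rendering step applied in opposite directions. The function names below are hypothetical, and the sketch assumes terms are already in SPARQL-compatible syntax (for example <iri> or "literal") and contain no blank nodes, which SPARQL 1.1 Update does not allow in DELETE DATA.

```python
Triple = tuple[str, str, str]


def _block(triples: list[Triple]) -> str:
    return "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)


def to_sparql_update(added: list[Triple], removed: list[Triple]) -> str:
    """Render a delta as DELETE DATA / INSERT DATA statements that replay it."""
    statements = []
    if removed:
        statements.append("DELETE DATA {\n" + _block(removed) + "\n}")
    if added:
        statements.append("INSERT DATA {\n" + _block(added) + "\n}")
    return " ;\n".join(statements)


def inverse_sparql_update(added: list[Triple], removed: list[Triple]) -> str:
    """Render the rollback: whatever was added is deleted, and vice versa."""
    return to_sparql_update(added=removed, removed=added)
```

Applying the forward script and then the inverse script should return the store to its starting state, provided nothing else wrote to it in between.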
Non-goals
- Not a streaming change-data-capture system. Diff is point-in-time, not a continuous event stream. CDC is a separate concern (future ADR).
- No PROV-O-based temporal diff in v1. diff_since(ctx, datetime) requires indexing PROV-O activities by time range, which depends on the provenance writer emitting prov:startedAtTime consistently. Deferred until the provenance surface stabilizes (M17+).
- No visual diff UI. The CLI and JSON output serve developers and CI; a web UI for graph diffs is out of scope.
- No patch application. The sparql format output is human-runnable via trails kg query, but there is no trails kg apply-patch command yet.
Dependencies
| ADR | Relationship |
|---|---|
| ADR-0004 (Kernel store) | Store API for triple enumeration |
| ADR-0009 (PROV-O) | Future diff_since integration |
| ADR-0021 (Progressive enhancement) | Diff works with label-first and typed nodes alike |
Consequences
Positive
- Migration safety. Schema migrations gain a structured diff surface — developers see exactly what changed and can assert on it in CI.
- Debugging velocity. "What changed?" is answered in one command, not a manual N-Triples eyeball session.
- CI integration. JSON output enables programmatic assertions on graph deltas in test suites and pipelines.
- Replayability. SPARQL format output is a runnable migration script: copy-paste it into trails kg query to replay or reverse changes.
- No new dependencies. Pure Python plus the existing Store SPARQL API.
Negative
- Memory pressure on large diffs. Enumerating all triples from two stores into Python sets is O(n) memory. Mitigated by: named-graph scoping (diff only what you need), and streaming comparison for snapshot mode (line-by-line parsing).
- No incremental diff. Each diff recomputes from scratch. Acceptable for the CLI/CI use case; a future CDC surface would handle incremental tracking.
Revisit conditions
- If PROV-O provenance timestamps become reliable and indexed, add diff_since as a time-based diff surface.
- If graph sizes routinely exceed 10M triples, consider a streaming diff algorithm that avoids materializing both sides in memory.
- If the migration runner (M17) needs transactional diff (diff inside a write transaction), extend diff_stores to accept a transaction handle.
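One possible shape for the streaming diff mentioned above is a merge over two pre-sorted, de-duplicated N-Triples line streams, which holds only one line per side in memory. This is a sketch of a standard sorted-merge technique, not a committed design; the name stream_diff is hypothetical, and both inputs are assumed to be sorted in the same code-point order before the call.

```python
from typing import Iterable, Iterator


def stream_diff(sorted_a: Iterable[str], sorted_b: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield ("-", line) for lines only in A and ("+", line) for lines only in B."""
    it_a, it_b = iter(sorted_a), iter(sorted_b)
    a, b = next(it_a, None), next(it_b, None)
    while a is not None or b is not None:
        if b is None or (a is not None and a < b):
            yield ("-", a)  # A is behind, so this line never appears in B
            a = next(it_a, None)
        elif a is None or b < a:
            yield ("+", b)  # B is behind, so this line never appears in A
            b = next(it_b, None)
        else:
            # Same line on both sides: unchanged, advance both streams.
            a, b = next(it_a, None), next(it_b, None)


changes = list(stream_diff(["a", "b", "d"], ["b", "c", "d", "e"]))
# changes == [("-", "a"), ("+", "c"), ("+", "e")]
```

The cost moves from memory to the sort step, which can be done externally on disk before the comparison.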