ADR-0036: Semantic Diff — Git-style Change Tracking for Knowledge Graphs

Status: Accepted (2026-04-19)
Date: 2026-04-18

Context

Knowledge graphs evolve over time: schemas are refined, data is ingested, agents modify triples. Today, the only way to inspect what changed in a Trails KG is to dump the entire store before and after a mutation and compare the two dumps by hand. This approach is:

Unscalable. A store with millions of triples cannot be diffed by eyeballing two N-Triples files.
Non-replayable. There is no way to generate a SPARQL UPDATE that replays a set of changes, making migration rollbacks and auditing painful.
CI-unfriendly. Automated pipelines that validate "this migration added exactly N triples and removed none" have no structured output to assert against.

Widely used SQL migration tools (Alembic, Flyway, Liquibase) give developers some way to see what a migration changes. We know of no KG framework that ships an equivalent. The closest is RDF Delta (Jena), which is a patch format, not a user-facing CLI.

Use cases:

Schema migration validation (M17). Before/after a migration: what triples changed? Did the migration accidentally drop data?
CI assertions. Assert that a capability invocation added exactly the expected triples and nothing else.
Audit trail enhancement. PROV-O records who did what; semantic diff records the exact delta in a replayable format.
Debugging. "Something broke after the last ingest" — diff the store against yesterday's snapshot to see exactly what changed.
Named-graph isolation. Diff only the prov: graph, or only the application data graph, or only the trails:credentials graph.

Decision

1. `trails.kg_diff` module

A new module providing a KGDiff dataclass and functions to compute and format differences between two KG states. No AI/LLM dependency.

from trails.kg_diff import (
    KGDiff,
    diff_stores,
    diff_snapshots,
    format_diff,
)

2. `KGDiff` dataclass

import dataclasses


@dataclasses.dataclass
class KGDiff:
    added: list[tuple[str, str, str]]       # (s, p, o) triples present in B but not A
    removed: list[tuple[str, str, str]]     # (s, p, o) triples present in A but not B
    modified: list[dict[str, str]]          # {iri, field, old, new}: same s+p, different o
    summary: dict[str, int]                 # {added: N, removed: N, modified: N, unchanged: N}

modified is a convenience view. When store A contains (s, p, old_o) and store B contains (s, p, new_o), with the same subject and predicate but a different object, the change appears twice: as raw deltas in removed and added, and as a single entry in modified (iri is the subject, field is the predicate) that reads as "this field changed". This mirrors how git diff can show both the raw hunks and a file-level summary.
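
The pairing rule for modified can be sketched as follows. This is a minimal illustration, not the module's implementation: compute_diff is a hypothetical helper, and pairing only when exactly one object was removed and one added for a given (s, p) is an assumption made here so that multi-valued predicates stay unambiguous.

```python
import dataclasses

Triple = tuple[str, str, str]


@dataclasses.dataclass
class KGDiff:
    added: list[Triple]
    removed: list[Triple]
    modified: list[dict[str, str]]
    summary: dict[str, int]


def compute_diff(triples_a: set[Triple], triples_b: set[Triple]) -> KGDiff:
    """Hypothetical helper: set-difference diff plus the `modified` view."""
    added = sorted(triples_b - triples_a)
    removed = sorted(triples_a - triples_b)

    # Group the objects of removed and added triples by (subject, predicate).
    removed_by_sp: dict[tuple[str, str], list[str]] = {}
    for s, p, o in removed:
        removed_by_sp.setdefault((s, p), []).append(o)
    added_by_sp: dict[tuple[str, str], list[str]] = {}
    for s, p, o in added:
        added_by_sp.setdefault((s, p), []).append(o)

    modified = []
    for (s, p), new_objs in added_by_sp.items():
        old_objs = removed_by_sp.get((s, p), [])
        # Assumption: report a modification only for a one-to-one replacement.
        if len(old_objs) == 1 and len(new_objs) == 1:
            modified.append(
                {"iri": s, "field": p, "old": old_objs[0], "new": new_objs[0]}
            )

    summary = {
        "added": len(added),
        "removed": len(removed),
        "modified": len(modified),
        "unchanged": len(triples_a & triples_b),
    }
    return KGDiff(added, removed, modified, summary)
```

Note that a modified entry never replaces its raw deltas: the same change is still counted in both added and removed.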

3. Core functions

diff_stores(store_a, store_b, *, graph: str | None = None) -> KGDiff — Compare two store instances. Uses SPARQL to enumerate triples. When graph is set, restricts comparison to that named graph.
diff_snapshots(snapshot_a: str, snapshot_b: str) -> KGDiff — Compare two N-Triples serializations (strings). Parses line by line; no store needed.
format_diff(diff: KGDiff, format: str = "table") -> str — Render the diff.
"table": human-readable with +/- markers (like git diff).
"sparql": DELETE DATA { ... } ; INSERT DATA { ... } statements that replay the changes.
"json": machine-readable {"added": [...], "removed": [...], ...} for CI.

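The three output formats listed above could be rendered roughly as shown here. This is a sketch under assumptions, not the shipped formatter: it assumes every term in a triple is already serialized in N-Triples/SPARQL syntax (for example `<urn:a>` or `"text"`), the exact table layout and JSON shape (via dataclasses.asdict) are guesses, and _data_block is a hypothetical helper.

```python
import dataclasses
import json


@dataclasses.dataclass
class KGDiff:
    added: list[tuple[str, str, str]]
    removed: list[tuple[str, str, str]]
    modified: list[dict[str, str]]
    summary: dict[str, int]


def _data_block(keyword: str, triples: list[tuple[str, str, str]]) -> str:
    """Hypothetical helper: build a DELETE DATA or INSERT DATA block."""
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    return f"{keyword} DATA {{\n{body}\n}}"


def format_diff(diff: KGDiff, format: str = "table") -> str:
    if format == "table":
        # Removed triples first, then added ones, with git-style markers.
        lines = [f"- {s} {p} {o}" for s, p, o in diff.removed]
        lines += [f"+ {s} {p} {o}" for s, p, o in diff.added]
        return "\n".join(lines)
    if format == "sparql":
        # Skip empty blocks so the output stays a valid update request.
        parts = []
        if diff.removed:
            parts.append(_data_block("DELETE", diff.removed))
        if diff.added:
            parts.append(_data_block("INSERT", diff.added))
        return " ;\n".join(parts)
    if format == "json":
        return json.dumps(dataclasses.asdict(diff), indent=2, sort_keys=True)
    raise ValueError(f"unknown format: {format!r}")
```

Emitting the DELETE block before the INSERT block matches the order given in the "sparql" description above.
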
4. CLI integration

New subcommands on trails kg:

trails kg diff --snapshot <file>          # compare current store against a saved .nt file
trails kg diff --format table|sparql|json # output format (default: table)
trails kg diff --graph <iri>              # restrict to a named graph
trails kg snapshot [--output <file>]      # save current state as .nt for later comparison

trails kg snapshot serializes the current store (or a named graph) as N-Triples to stdout or --output. This file can be committed to version control or stored alongside migration scripts for later diffing.
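
Comparing two such snapshot files can be illustrated with a deliberately simplified line-based parser. This is a sketch, not the diff_snapshots implementation: parse_ntriples and diff_snapshot_text are hypothetical names, and the parser assumes one triple per line, compares terms as raw strings (no escape normalization), and matches blank nodes by label only.

```python
Triple = tuple[str, str, str]


def parse_ntriples(text: str) -> set[Triple]:
    """Simplified N-Triples reader: one triple per line, terms kept verbatim."""
    triples: set[Triple] = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace in N-Triples,
        # so everything after the second gap is the object plus terminator.
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[2].endswith("."):
            raise ValueError(f"not an N-Triples statement: {raw!r}")
        s, p, rest = parts
        triples.add((s, p, rest[:-1].rstrip()))
    return triples


def diff_snapshot_text(before: str, after: str) -> tuple[list[Triple], list[Triple]]:
    """Return (added, removed) between two serialized snapshots."""
    a, b = parse_ntriples(before), parse_ntriples(after)
    return sorted(b - a), sorted(a - b)
```

Because the comparison is set-based, the line order of the two files does not affect the result.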

5. Named-graph awareness

Graph scoping is optional. diff_stores and trails kg diff take a graph argument; diff_snapshots has no such parameter. The two paths therefore behave differently:

diff_stores restricts its SPARQL enumeration to GRAPH <iri> { ?s ?p ?o }.
diff_snapshots assumes both snapshots represent the same graph scope (the caller is responsible for exporting a single graph).

This allows per-graph diffs: diff only the provenance graph, only the credentials graph, or only the application data graph.
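
The enumeration query that diff_stores issues might be built along these lines. The function name enumeration_query is hypothetical, and what the unscoped query matches (default graph only, or a union of graphs) depends on the store, so treat this as an illustration of the GRAPH clause rather than the actual query text.

```python
from __future__ import annotations

# Characters the SPARQL grammar does not allow inside <...> (IRIREF).
_FORBIDDEN = set('<>"{}|^`\\')


def enumeration_query(graph: str | None = None) -> str:
    """Build a SELECT that lists triples, optionally scoped to one named graph."""
    if graph is None:
        return "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
    # Reject IRIs that would break out of the <...> delimiters.
    if not graph or any(ch in _FORBIDDEN or ord(ch) <= 0x20 for ch in graph):
        raise ValueError(f"not a usable graph IRI: {graph!r}")
    return f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }}"
```

Validating the IRI before interpolation matters because the graph name typically arrives from the command line.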

6. Integration with schema migrations (M17)

The migration runner (future ADR) will capture the store state before applying a migration and compare it with the state afterwards using diff_stores, storing the resulting KGDiff as part of the migration record. This enables:

Dry-run validation: "this migration would add 42 triples, remove 3".
Rollback generation: diffing in the reverse direction (after state as A, before state as B) and rendering the result in the sparql format yields a replayable inverse of the migration.
Audit: every migration has a machine-readable delta attached.
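
Since the sparql output replays the forward delta (section 3), a rollback script needs the diff in the opposite direction. One way to get it without re-reading either store is to invert a stored KGDiff. The invert helper below is hypothetical and not part of the proposed API.

```python
import dataclasses


@dataclasses.dataclass
class KGDiff:
    added: list[tuple[str, str, str]]
    removed: list[tuple[str, str, str]]
    modified: list[dict[str, str]]
    summary: dict[str, int]


def invert(diff: KGDiff) -> KGDiff:
    """Hypothetical helper: the diff that undoes `diff` (B back to A)."""
    return KGDiff(
        # What was added must be removed again, and vice versa.
        added=list(diff.removed),
        removed=list(diff.added),
        # Each field change flips its old and new values.
        modified=[{**m, "old": m["new"], "new": m["old"]} for m in diff.modified],
        summary={
            **diff.summary,
            "added": diff.summary["removed"],
            "removed": diff.summary["added"],
        },
    )
```

Inverting twice returns the original diff, which makes a cheap sanity check for a migration record.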

Non-goals

Not a streaming change-data-capture system. Diff is point-in-time, not a continuous event stream. CDC is a separate concern (future ADR).
No PROV-O-based temporal diff in v1. diff_since(ctx, datetime) requires indexing PROV-O activities by time range, which depends on the provenance writer emitting prov:startedAtTime consistently. Deferred until the provenance surface stabilizes (M17+).
No visual diff UI. The CLI and JSON output serve developers and CI; a web UI for graph diffs is out of scope.
No patch application. The sparql format output is human-runnable via trails kg query, but there is no trails kg apply-patch command yet.

Dependencies

| ADR | Relationship |
|---|---|
| ADR-0004 (Kernel store) | Store API for triple enumeration |
| ADR-0009 (PROV-O) | Future `diff_since` integration |
| ADR-0021 (Progressive enhancement) | Diff works with label-first and typed nodes alike |

Consequences

Positive

Migration safety. Schema migrations gain a structured diff surface — developers see exactly what changed and can assert on it in CI.
Debugging velocity. "What changed?" is answered in one command, not a manual N-Triples eyeball session.
CI integration. JSON output enables programmatic assertions on graph deltas in test suites and pipelines.
Replayability. SPARQL format output is a runnable migration script: paste it into trails kg query to replay the changes, or generate it from the reversed diff to undo them.
No new dependencies. Pure Python + existing Store SPARQL API.
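
A CI assertion on the JSON output could look like the sketch below. The helper check_delta is hypothetical, and it relies only on the "added" and "removed" arrays named in the format description; any other key in the payload is ignored.

```python
import json


def check_delta(diff_json: str, *, expect_added: int, expect_removed: int = 0) -> list[str]:
    """Hypothetical CI helper: return a list of mismatches (empty means pass)."""
    payload = json.loads(diff_json)
    n_added = len(payload.get("added", []))
    n_removed = len(payload.get("removed", []))

    problems = []
    if n_added != expect_added:
        problems.append(f"expected {expect_added} added triples, found {n_added}")
    if n_removed != expect_removed:
        problems.append(f"expected {expect_removed} removed triples, found {n_removed}")
    return problems
```

Returning the mismatches instead of raising lets a pipeline print every problem before it fails the job.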

Negative

Memory pressure on large diffs. Enumerating all triples from two stores into Python sets takes memory proportional to the combined size of both stores. Mitigated by: named-graph scoping (diff only what you need), and streaming comparison for snapshot mode (line-by-line parsing).
No incremental diff. Each diff recomputes from scratch. Acceptable for the CLI/CI use case; a future CDC surface would handle incremental tracking.

Revisit conditions

If PROV-O provenance timestamps become reliable and indexed, add diff_since as a time-based diff surface.
If graph sizes routinely exceed 10M triples, consider a streaming diff algorithm that avoids materializing both sides in memory.
If the migration runner (M17) needs transactional diff (diff inside a write transaction), extend diff_stores to accept a transaction handle.
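
For the 10M-triple condition, one candidate is a merge-style diff over pre-sorted inputs, which keeps only one line from each side in memory at a time. This is a sketch of that general technique, not a committed design: sorted_merge_diff is a hypothetical name, and it assumes both inputs are de-duplicated and sorted in the same order Python uses to compare strings.

```python
from typing import Iterable, Iterator


def sorted_merge_diff(
    lines_a: Iterable[str], lines_b: Iterable[str]
) -> Iterator[tuple[str, str]]:
    """Yield ("-", line) for lines only in A and ("+", line) for lines only in B."""
    it_a, it_b = iter(lines_a), iter(lines_b)
    a = next(it_a, None)
    b = next(it_b, None)
    while a is not None and b is not None:
        if a == b:
            # Present on both sides: unchanged, advance both cursors.
            a = next(it_a, None)
            b = next(it_b, None)
        elif a < b:
            yield ("-", a)
            a = next(it_a, None)
        else:
            yield ("+", b)
            b = next(it_b, None)
    # Whatever remains on one side has no counterpart on the other.
    while a is not None:
        yield ("-", a)
        a = next(it_a, None)
    while b is not None:
        yield ("+", b)
        b = next(it_b, None)
```

The inputs can be open file objects, so two sorted .nt snapshots could be compared without loading either file in full.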