ADR-0041: CRDT Graph Editing — Conflict-Free Concurrent Knowledge Graph Mutations¶
- Status: Accepted
- Date: 2026-04-18
- Supersedes: —
- Superseded by: —
Context¶
Trails targets multi-agent workflows, offline-first apps, and federated sync scenarios. All three require concurrent writes to the same knowledge graph from multiple replicas (agents, users, edge devices) that may be disconnected or racing. Classic locking (pessimistic concurrency) does not scale to these use-cases: network partitions, variable latency, and agent autonomy mean writers cannot coordinate through a central lock manager.
CRDTs (Conflict-free Replicated Data Types) solve this by making merge associative, commutative, and idempotent — any two replicas that have seen the same set of operations converge to the same state, regardless of message ordering or duplication.
RDF triple stores are a natural fit: an RDF dataset is fundamentally a
set of (subject, predicate, object, graph) quads. A set with add and
remove is the textbook OR-Set / LWW-Element-Set CRDT.
Options considered¶
-
Pessimistic locking (SPARQL UPDATE + transactions). Already available via
ctx.kg.transaction(). Works for single-process, single-store scenarios. Breaks under federation, offline agents, or multi-replica topologies. -
Operational Transformation (OT). Requires a central server to linearise operations. Incompatible with peer-to-peer federation (ADR-0023) and offline-first goals.
-
State-based CRDT (this ADR). Each replica maintains local state (add-set + remove-set with Lamport timestamps). Merge is a pure function of two states. Add-wins semantics: when a triple appears in both the add-set and remove-set, the entry with the highest timestamp wins. Ties broken by replica ID (lexicographic). No central coordinator needed.
-
Operation-based CRDT (op-based / commutative). Lighter on the wire (only deltas) but requires reliable causal broadcast — a stronger transport guarantee than Trails can assume in federated meshes. State-based is more robust and easier to reason about.
Decision¶
Trails adopts a state-based LWW-Element-Set CRDT for concurrent
graph mutations. The implementation lives in trails.crdt and is
opt-in: existing ctx.kg writes are unaffected. The CRDT layer wraps
a regular Store and intercepts add/remove operations.
Core semantics¶
-
Triple identity: A triple is identified by the tuple
(subject, predicate, object, graph)wheregraphdefaults to the default graph when omitted. -
Lamport clock: Each replica maintains a Lamport timestamp that increments on every local operation and updates on merge (
max(local, remote) + 1). This provides a causal ordering without requiring synchronized wall clocks. -
Add-wins resolution: When merging, if a triple exists in both the add-set and remove-set across replicas, the entry with the higher timestamp wins. This means a concurrent add always beats a concurrent remove — a deliberate choice for knowledge graphs where data loss is worse than data duplication.
-
State export/import:
CRDTStateis a serializable snapshot of a replica's add-set, remove-set, and clock. Replicas exchange states to synchronize. The merge function is idempotent: applying the same state twice is a no-op.
Integration points¶
CRDTStore(replica_id)wraps any TrailsStoreinstance.crdt_context(ctx, replica_id)is a context manager that transparently captures allctx.kgwrites as CRDT operations within a block.merge_replicas(a, b)produces a new merged replica — useful for federation sync (ADR-0023) and multi-agent merge points.resolve_conflicts(ops_a, ops_b)is a lower-level utility for operation-list merging.
What this does NOT change¶
- Existing
ctx.kg.add()/ctx.kg.save()/ctx.kg.update()remain unchanged. CRDT is additive, not a replacement. - Provenance (ADR-0009) is unaffected: provenance triples attach at the capability boundary, orthogonal to CRDT merge.
- SHACL validation (ADR-0002) runs on the merged result, not on individual CRDT operations.
Consequences¶
- Multi-agent convergence: Two agents editing the same graph independently will converge to the same state after exchanging states, without manual conflict resolution.
- Offline-first: Agents can work disconnected and sync later. Merge is automatic and deterministic.
- Federation: State exchange over the federation mesh (ADR-0023) gets a principled merge strategy instead of last-write-wins at the HTTP level.
- Storage overhead: Each triple carries a timestamp and exists in either the add-set or remove-set (tombstones). For large graphs this is non-trivial; garbage collection of old tombstones is a future optimization.
- No real-time co-editing: This is eventual consistency, not real-time collaboration. Latency between state exchanges determines how "eventually" replicas converge.
References¶
- Shapiro et al., "A comprehensive study of Convergent and Commutative Replicated Data Types" (2011) — OR-Set, LWW-Element-Set
- ADR-0023 (SPARQL Federation / Instance Mesh) — transport layer for state exchange
- ADR-0009 (Provenance Always-On) — orthogonal, unaffected
- ADR-0017 (ActiveGraph ORM) — the ORM surface this wraps