
ADR-0003: Hybrid IRI minting strategy

  • Status: Accepted
  • Date: 2026-04-12

Context

Every entity in Trails has an IRI. The IRI-minting strategy determines whether identifiers are stable, reproducible, and cacheable, and whether they leak information. The options considered:

  1. Auto-increment integers (myapp:patient/1, myapp:patient/2). Rails-style. Leaks entity counts and invites enumeration. Not content-addressed. Needs a central counter, so a poor fit for distributed systems.
  2. UUIDv4: fully random, no ordering. Good privacy, poor for indexes (inserts land at random positions, causing random disk access).
  3. UUIDv7: time-ordered prefix plus random bits, so sortable and index-friendly. Carries no PII, though it does embed a millisecond creation timestamp. No content addressing.
  4. Content-addressed (hash of canonical form). Natural fit for immutable data; perfect dedup; perfect cache keys. Useless for mutable entities (the IRI changes on every edit).
  5. User-specified: application code mints its own IRIs. Flexible, but no framework-level guarantees.
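To make the ordering claim for option 3 concrete, here is a minimal Python sketch of the UUIDv7 bit layout from RFC 9562 (48-bit millisecond timestamp, version nibble, variant bits, random fill). It is built by hand so that it runs on Python versions without a native UUIDv7 constructor; `uuid7_sketch` is a hypothetical helper for illustration, not Trails code.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-shaped value: timestamp first, then version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits to draw from
    rand_a = rand >> 68                            # 12 bits placed after the version nibble
    rand_b = rand & ((1 << 62) - 1)                # 62 bits placed after the variant bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # bits 127..80: Unix time in milliseconds
    value |= 0x7 << 76                             # bits 79..76: version 7
    value |= rand_a << 64                          # bits 75..64: random
    value |= 0b10 << 62                            # bits 63..62: RFC variant
    value |= rand_b                                # bits 61..0: random
    return uuid.UUID(int=value)


# Two identifiers minted one millisecond apart sort in creation order,
# which is what keeps B-tree index inserts local.
earlier = uuid7_sketch(unix_ms=1_700_000_000_000)
later = uuid7_sketch(unix_ms=1_700_000_000_001)
assert str(earlier) < str(later)
print(f"myapp:patient/{later}")
```

Because the timestamp occupies the most significant bits, the string form sorts by creation time; a UUIDv4 has no such prefix, so consecutive inserts scatter across the index.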

Decision

Hybrid: the strategy is chosen by convention from the resource trait and can be overridden per shape in trails.toml:

Resource trait | Strategy | Example
Mutable, identity-stable (users, patients, orgs) | UUIDv7 | myapp:patient/018f1cbf-a7e2-…
Immutable, content-defined (observations, measurements, events, documents) | Content-addressed (hash(canonical_form)) | myapp:obs/sha256-abc123…
Externally-sourced (imported entities) | Preserve source IRI | pubmed:12345, orcid:0000-0001-…
User-specified | Respect it | whatever the app passes
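The content-addressed row can be illustrated with a short Python sketch. The ADR does not say how canonical_form is computed, so the JSON-with-sorted-keys canonicalisation below is an assumption made only for this example, as are the helper names and the sample subject IRI.

```python
import hashlib
import json


def canonical_form(record):
    """Assumed canonicalisation: JSON with sorted keys and no insignificant whitespace."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_iri(prefix, record):
    """Mint an IRI of the form <prefix>/sha256-<hex digest of the canonical form>."""
    digest = hashlib.sha256(canonical_form(record)).hexdigest()
    return f"{prefix}/sha256-{digest}"


obs = {"subject": "myapp:patient/p1", "timestamp": "2026-04-12T09:30:00Z", "value": 37.2}
same_obs = {"value": 37.2, "timestamp": "2026-04-12T09:30:00Z", "subject": "myapp:patient/p1"}
edited = dict(obs, value=37.5)

# Identical content always maps to the same IRI, regardless of key order (dedup).
assert content_iri("myapp:obs", obs) == content_iri("myapp:obs", same_obs)
# Any edit produces a different IRI, which is why this strategy suits immutable data only.
assert content_iri("myapp:obs", obs) != content_iri("myapp:obs", edited)
```

The same digest also works as an integrity check: re-hashing a stored record and comparing against its IRI detects silent modification.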

Per-shape configuration in trails.toml:

[iri.Patient]
strategy = "uuidv7"

[iri.Observation]
strategy = "content"
fields = ["subject", "timestamp", "value"]

[iri.Publication]
strategy = "preserve"

Default for an undeclared shape: UUIDv7.

Consequences

Positive

  • Content-addressed immutable data gives free dedup, cache keys, and integrity checks.
  • UUIDv7 for mutable entities avoids count leakage while keeping index locality.
  • Preserving source IRIs prevents identity duplication across systems.
  • Per-shape override lets apps deviate without framework surgery.

Negative

  • Complexity: three configurable strategies (uuidv7, content, preserve) instead of one. Mitigated by a sensible default (UUIDv7) and per-shape opt-in.
  • Content-addressed IRIs are opaque: the IRI string carries no semantic information. Not a problem when IRI dereferencing is the access pattern (which it is in the semantic web).
  • Changing the fields list of a content-addressed shape changes every IRI minted afterwards, so identical content minted before and after the change no longer deduplicates. Developers must treat the field list as a commitment.
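The last point is easy to underestimate, so here is a small Python sketch of it. It assumes that only the configured fields are hashed and that the canonical form is sorted-key JSON; both are assumptions for illustration, and the `unit` field and sample values are hypothetical.

```python
import hashlib
import json


def content_iri(prefix, record, fields):
    """Hash only the configured fields; the JSON canonical form is an assumption."""
    selected = {name: record[name] for name in fields}
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"{prefix}/sha256-{hashlib.sha256(canonical).hexdigest()}"


record = {
    "subject": "myapp:patient/p1",
    "timestamp": "2026-04-12T09:30:00Z",
    "value": 37.2,
    "unit": "Cel",
}

# Same record, minted under the original field list and under an extended one.
before = content_iri("myapp:obs", record, ["subject", "timestamp", "value"])
after = content_iri("myapp:obs", record, ["subject", "timestamp", "value", "unit"])

# The IRIs differ, so the two mints are treated as distinct entities.
assert before != after
```

Under these assumptions, a change to the field list effectively starts a new identifier space for that shape; existing IRIs stay valid but stop matching newly minted ones.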

Non-consequences

  • SPARQL queries work identically regardless of strategy (IRIs are opaque at query time).
  • Federation works identically — external systems see IRIs; they don't care how those were minted.

Revisit conditions

  • If UUIDv7 library support in the Rust ecosystem proves unstable, fall back to UUIDv4 or ULID.
  • If content-addressing causes write-amplification issues at scale, introduce a content-address sidecar table with stable UUID primary keys.