Working with Data¶
Overview¶
This chapter covers everything you need to create, query, update, and delete data in a Trails application. You will learn the ORM surface (the high-level, Django-style API), the label-first KG surface (for flexible, untyped data), and the SPARQL escape hatch (for when you need full query power).
Learning Objectives¶
After this chapter you will be able to:
- Create, read, update, and delete nodes using @app.model (GaC) or @node_type and the ORM
- Build queries with filters, ordering, limits, and aggregates
- Use property paths to traverse relationships
- Compose complex filters with Q objects
- Project results with .values() and .values_list()
- Drop to raw SPARQL when the ORM is not enough
- Use label-first ctx.kg methods for untyped data
Creating and Querying Nodes¶
Creating nodes¶
Define a node type and create instances:
from typing import Annotated
from trails import App, capability
from trails.gac import required, optional, min_length, min_value
app = App("myapp")
@app.model
class Article:
    title: Annotated[str, required(), min_length(1)]
    body: Annotated[str, required()]
    priority: Annotated[int, optional(), min_value(0)]
    tags: list[str]
@capability
def create_article(ctx, title: str, body: str, priority: int = 0) -> dict:
    # Construct the instance (validated here), then persist it.
    article = Article(title=title, body=body, priority=priority, tags=["draft"])
    ctx.kg.add(article)
    return {"id": article.id, "title": article.title}
The Annotated[] constraints are validated on every write. The explicit
@node_type("Name", fields={...}) form still works and is documented in
the ORM guide.
Article(...) validates field types at construction time. Unknown
kwargs raise an error. Missing scalar fields raise an error. List
fields default to []. The id attribute is a UUIDv7 IRI, available
immediately after construction.
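The construction rules above can be illustrated without Trails at all. The following is a minimal standalone sketch, not the library's implementation: `build_instance` and `ARTICLE_FIELDS` are hypothetical names, the instance is a plain dict, and the sketch ignores `Annotated` constraints such as `optional()` and does not generate an id.

```python
from typing import Any


def build_instance(fields: dict[str, type], **kwargs: Any) -> dict[str, Any]:
    """Apply the three construction rules to a plain dict (illustration only)."""
    # Rule 1: unknown kwargs raise an error.
    unknown = set(kwargs) - set(fields)
    if unknown:
        raise TypeError(f"unknown fields: {sorted(unknown)}")
    instance: dict[str, Any] = {}
    for name, expected in fields.items():
        if name in kwargs:
            value = kwargs[name]
            if not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
            instance[name] = value
        elif expected is list:
            # Rule 2: list fields default to an empty list.
            instance[name] = []
        else:
            # Rule 3: missing scalar fields raise an error.
            raise TypeError(f"missing required field: {name}")
    return instance


# Hypothetical field table mirroring the Article model above.
ARTICLE_FIELDS = {"title": str, "body": str, "priority": int, "tags": list}
article = build_instance(ARTICLE_FIELDS, title="Hello", body="World", priority=1)
```

Here `tags` is omitted and comes back as an empty list, while omitting `title` or passing an undeclared keyword raises `TypeError`.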
ctx.kg.add(article) persists the instance. The write goes through
the kernel store, which emits provenance and applies any SHACL
constraints.
Querying nodes¶
Fetch all instances:
@capability
def list_articles(ctx) -> list:
    articles = Article.where().fetch(ctx)
    return [{"id": a.id, "title": a.title} for a in articles]
Fetch a single instance by IRI:
@capability
def get_article(ctx, iri: str) -> dict:
    article = Article.find(ctx, iri)
    return {"id": article.id, "title": article.title, "body": article.body}
Updating nodes¶
Modify an instance and save:
@capability
def update_title(ctx, iri: str, new_title: str) -> dict:
    article = Article.find(ctx, iri)
    article.title = new_title
    article.save(ctx)
    return {"id": article.id, "title": article.title}
save(ctx) is an upsert with dirty tracking -- only changed fields are
written. If the node does not exist, it is created.
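To make "dirty tracking" concrete, here is a minimal standalone sketch of the general technique. It is not how Trails implements `save(ctx)`: `TrackedRecord` and `pending_writes` are hypothetical names, and the sketch only records which attributes changed since the last flush.

```python
class TrackedRecord:
    """Records which attributes changed since the last flush (illustration only)."""

    def __init__(self, **fields):
        # Bypass __setattr__ so initial values do not count as changes.
        object.__setattr__(self, "_data", dict(fields))
        object.__setattr__(self, "_dirty", set())

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for field names.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Mark a field dirty only when its value actually changes.
        if name not in self._data or self._data[name] != value:
            self._data[name] = value
            self._dirty.add(name)

    def pending_writes(self):
        """Return only the changed fields, then reset the dirty set."""
        changes = {name: self._data[name] for name in sorted(self._dirty)}
        self._dirty.clear()
        return changes


record = TrackedRecord(title="Old", body="Text")
record.title = "New"
record.body = "Text"  # same value as before, so not marked dirty
first = record.pending_writes()
second = record.pending_writes()
```

The first call returns only the `title` change; the second returns an empty dict because nothing changed in between. A save with dirty tracking would write just that first set of fields.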
Deleting nodes¶
@capability
def delete_article(ctx, iri: str) -> dict:
    Article.delete(ctx, iri)
    return {"deleted": iri}
Bulk operations¶
Create many nodes at once (much faster than individual ctx.kg.add
calls):
@capability
def import_articles(ctx, items: list) -> dict:
    articles = [Article(title=i["title"], body=i["body"], priority=0) for i in items]
    Article.bulk_create(ctx, articles)
    return {"imported": len(articles)}
Count without fetching:
Delete all instances:
The Query Builder¶
Model.where(...) returns a QueryBuilder that compiles to SPARQL.
Chain methods to build up your query before calling .fetch(ctx) to
execute it.
Basic filters¶
# Exact match
Article.where(title="Hello World").fetch(ctx)
# Numeric comparisons
Article.where(priority__gte=3).fetch(ctx) # priority >= 3
Article.where(priority__lt=10).fetch(ctx) # priority < 10
# String matching
Article.where(title__contains="blog").fetch(ctx) # case-sensitive
Article.where(title__icontains="blog").fetch(ctx) # case-insensitive
Article.where(title__startswith="How to").fetch(ctx)
# Set membership
Article.where(priority__in=[1, 2, 3]).fetch(ctx)
Filter reference¶
| Suffix | Operator | Example |
|---|---|---|
| (none) | = | title="Hello" |
| __gte | >= | priority__gte=3 |
| __gt | > | priority__gt=3 |
| __lte | <= | priority__lte=10 |
| __lt | < | priority__lt=10 |
| __in | IN (...) | status__in=["draft", "published"] |
| __contains | CONTAINS | title__contains="blog" |
| __icontains | case-insensitive CONTAINS | title__icontains="blog" |
| __startswith | STRSTARTS | title__startswith="How" |
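The table above can be read as a mapping from keyword suffix to SPARQL expression. The sketch below shows one way such a mapping could be written. It is an assumption-laden illustration rather than the Trails compiler: `compile_filter`, `literal`, and `COMPARISONS` are hypothetical names, only string and integer values are handled, and the use of `LCASE` for the case-insensitive variant is a guess at one reasonable encoding.

```python
import json

# Suffix -> SPARQL comparison operator, following the filter reference table.
COMPARISONS = {"": "=", "gte": ">=", "gt": ">", "lte": "<=", "lt": "<"}


def literal(value):
    """Render a Python value as a SPARQL literal (strings and ints only)."""
    return json.dumps(value) if isinstance(value, str) else str(value)


def compile_filter(key, value):
    """Turn one keyword filter such as priority__gte=3 into a FILTER clause."""
    # Handles a single field plus optional suffix, not multi-hop paths.
    field, _, suffix = key.partition("__")
    var = f"?{field}"
    if suffix in COMPARISONS:
        expr = f"{var} {COMPARISONS[suffix]} {literal(value)}"
    elif suffix == "in":
        expr = f"{var} IN ({', '.join(literal(v) for v in value)})"
    elif suffix == "contains":
        expr = f"CONTAINS({var}, {literal(value)})"
    elif suffix == "icontains":
        # Assumed encoding: lower-case both sides before comparing.
        expr = f"CONTAINS(LCASE({var}), {literal(value.lower())})"
    elif suffix == "startswith":
        expr = f"STRSTARTS({var}, {literal(value)})"
    else:
        raise ValueError(f"unsupported filter suffix: {suffix}")
    return f"FILTER({expr})"


gte_clause = compile_filter("priority__gte", 3)
in_clause = compile_filter("priority__in", [1, 2, 3])
```

For example, `gte_clause` is `FILTER(?priority >= 3)` and `in_clause` is `FILTER(?priority IN (1, 2, 3))`.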
Property Paths¶
Double-underscore chains traverse reference fields in one SPARQL query. No N+1 problem, no manual joins.
@node_type("Author", fields={"name": str})
class Author: ...
@node_type("Article", fields={"title": str, "author": Author})
class Article: ...
# 1-hop: articles by author name
Article.where(author__name="Alice").fetch(ctx)
# 2-hop: articles by authors in a specific org
@node_type("Org", fields={"name": str})
class Org: ...
@node_type("Author", fields={"name": str, "org": Org})
class Author: ...
Article.where(author__org__name="Acme").fetch(ctx)
Each __ hop adds one triple pattern to the SPARQL BGP. The query
resolves entirely server-side -- no round trips.
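To see why a multi-hop path stays a single query, it helps to expand a path by hand. The following standalone sketch turns a double-underscore path into triple patterns. `path_to_patterns` is a hypothetical helper, and the predicate namespace is borrowed from the raw SPARQL examples later in this chapter; whether the ORM emits exactly these IRIs and variable names is an assumption.

```python
# Predicate namespace as it appears in the raw SPARQL examples (assumed here).
PROP = "trails://local/prop/"


def path_to_patterns(path, root="?article"):
    """Expand a double-underscore path into one triple pattern per segment."""
    patterns = []
    subject = root
    segments = path.split("__")
    for depth, segment in enumerate(segments):
        # Name each intermediate variable after the path walked so far.
        obj = "?" + "_".join(segments[: depth + 1])
        patterns.append(f"{subject} <{PROP}{segment}> {obj} .")
        subject = obj
    return patterns


one_hop = path_to_patterns("author__name")
two_hop = path_to_patterns("author__org__name")
```

`one_hop` holds two patterns and `two_hop` holds three, so each extra hop adds one pattern to the same basic graph pattern instead of triggering another query.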
Q Combinators¶
For queries that go beyond simple AND filters, use Q objects.
They compose with | (OR), & (AND), and ~ (NOT):
from trails.orm import Q
# OR: title or body contains "graph"
Article.where(
    Q(title__icontains="graph") | Q(body__icontains="graph")
).fetch(ctx)
# NOT + AND: not archived, priority >= 3
Article.where(
    ~Q(status="archived") & Q(priority__gte=3)
).fetch(ctx)
# Complex composition
Article.where(
    (Q(priority__gte=5) | Q(tags__contains="urgent"))
    & ~Q(status="archived")
).fetch(ctx)
Q objects generate SPARQL FILTER expressions with correct
parenthesization. Nest them as deeply as you need.
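The parenthesization point is easiest to see in a toy version. The sketch below is not the `trails.orm.Q` class: `MiniQ` is a hypothetical stand-in that wraps already-compiled expression strings and overloads the same three operators, so you can inspect the FILTER text that a composition would produce.

```python
class MiniQ:
    """A tiny boolean expression tree that renders to a SPARQL FILTER (illustration only)."""

    def __init__(self, expr):
        self.expr = expr

    def __or__(self, other):
        # Wrap every combination in parentheses so nesting stays unambiguous.
        return MiniQ(f"({self.expr} || {other.expr})")

    def __and__(self, other):
        return MiniQ(f"({self.expr} && {other.expr})")

    def __invert__(self):
        return MiniQ(f"!({self.expr})")

    def to_filter(self):
        return f"FILTER({self.expr})"


high = MiniQ("?priority >= 5")
urgent = MiniQ('CONTAINS(?tags, "urgent")')
archived = MiniQ('?status = "archived"')

combined = (high | urgent) & ~archived
filter_text = combined.to_filter()
```

Because each operator wraps its result, the OR group stays intact when it is AND-ed with the negation, regardless of how SPARQL would otherwise order `&&` and `||`.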
Projections and Aggregations¶
.values() -- dict projection¶
Return plain dicts instead of hydrated model instances. Useful when you only need a few fields:
@capability
def article_titles(ctx) -> list:
    return Article.where().values("title", "priority").fetch(ctx)
    # [{"id": "trails://...", "title": "Hello", "priority": 3}, ...]
.values_list() -- tuple projection¶
Return flat tuples:
.exists() -- existence check¶
has_articles = Article.where(priority__gte=5).exists(ctx)
# True or False -- stops at the first match
.distinct() -- unique results¶
Annotations (aggregates)¶
from trails.orm import Count, Sum, Avg, Min, Max
# Count articles per priority level
Article.where().annotate(total=Count("id")).fetch(ctx)
# Aggregate statistics
Article.where().annotate(
    avg_priority=Avg("priority"),
    max_priority=Max("priority"),
    total=Count("id"),
).fetch(ctx)
The SPARQL Escape Hatch¶
The ORM covers the common cases. For everything else, drop to raw
SPARQL through ctx.kg:
Read queries¶
@capability
def custom_query(ctx) -> list:
    rows = ctx.kg.query("""
        SELECT ?title (COUNT(?tag) AS ?tag_count)
        WHERE {
            ?article a <trails://local/Article> ;
                     <trails://local/prop/title> ?title ;
                     <trails://local/prop/tags> ?tag .
        }
        GROUP BY ?title
        ORDER BY DESC(?tag_count)
        LIMIT 10
    """)
    return [{"title": r["title"], "tags": r["tag_count"]} for r in rows]
Write queries¶
@capability
def custom_update(ctx) -> dict:
    ctx.kg.update("""
        DELETE { ?article <trails://local/prop/status> ?old }
        INSERT { ?article <trails://local/prop/status> "published" }
        WHERE {
            ?article a <trails://local/Article> ;
                     <trails://local/prop/status> ?old .
            FILTER(?old = "draft")
        }
    """)
    return {"updated": True}
ASK queries¶
exists = ctx.kg.query("ASK { ?s a <trails://local/Article> }")
# [{"_boolean": True}] or [{"_boolean": False}]
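Since an ASK query comes back as a one-row list rather than a bare boolean, a small unwrapping helper can keep call sites tidy. This is a sketch under the result shape shown above; `ask_result` is a hypothetical name, not part of Trails, and treating an empty result as False is a choice made here.

```python
def ask_result(rows):
    """Collapse the one-row ASK result into a plain bool."""
    # An empty or malformed result is treated as False.
    return bool(rows) and bool(rows[0].get("_boolean", False))


found = ask_result([{"_boolean": True}])
missing = ask_result([{"_boolean": False}])
```

In a capability you would pass the return value of `ctx.kg.query("ASK { ... }")` to the helper instead of a literal list.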
Label-first KG¶
Not everything needs a @node_type. For exploratory data, event logs,
or data from external sources with unpredictable schemas, use the
label-first surface directly:
Creating nodes¶
@capability
def bookmark(ctx, url: str, tags: list) -> dict:
    iri = ctx.kg.node(
        labels=["Bookmark"],
        properties={"url": url, "tags": tags, "saved_at": "2025-01-15"},
    )
    return {"id": iri}
ctx.kg.node() creates a node with labels and properties, no type
declaration needed. Labels become rdf:type triples; properties become
predicate-object triples.
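The label and property mapping described above can be sketched as a pure function. This is an illustration, not the Trails implementation: `node_to_triples` is a hypothetical name, the `trails://local/` namespace is taken from the raw SPARQL examples in this chapter, the node IRI is made up, and emitting one triple per list item is an assumption.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "trails://local/"  # namespace as seen in the raw SPARQL examples (assumed)


def node_to_triples(iri, labels, properties):
    """Expand labels and properties into (subject, predicate, object) tuples."""
    # Each label becomes an rdf:type triple.
    triples = [(iri, RDF_TYPE, f"{BASE}{label}") for label in labels]
    for name, value in properties.items():
        # Assumption: a list value yields one triple per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((iri, f"{BASE}prop/{name}", item))
    return triples


triples = node_to_triples(
    "trails://local/node/1",  # hypothetical IRI for illustration
    labels=["Bookmark"],
    properties={"url": "https://example.org", "tags": ["rdf", "kg"]},
)
```

The result contains one type triple, one `url` triple, and two `tags` triples, all sharing the same subject.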
Creating edges¶
@capability
def tag_bookmark(ctx, bookmark_iri: str, tag_iri: str) -> dict:
    ctx.kg.edge(subject=bookmark_iri, label="tagged_with", object=tag_iri)
    return {"linked": True}
Querying¶
@capability
def find_bookmarks(ctx, url: str) -> list:
    return ctx.kg.match(
        labels=["Bookmark"],
        where={"url": url},
    )
Traversing¶
@capability
def bookmark_tags(ctx, bookmark_iri: str) -> list:
    return ctx.kg.traverse(subject=bookmark_iri, label="tagged_with")
When to use which surface¶
| Use @node_type + ORM when... | Use ctx.kg label-first when... |
|---|---|
| Data has a stable shape | Schema is unknown or evolving |
| You want validation on writes | You want maximum flexibility |
| You need filter suffixes (__gte, __in) | You query by label/property match |
| You want hydrated Python objects | Plain dicts are fine |
| Domain entities (Patient, Invoice) | Event logs, scrapes, observations |
Both surfaces coexist on the same ctx.kg handle. You can use
@node_type for your domain entities and ctx.kg.node for everything
else, in the same capability.
What's Next¶
You can now create, query, and manage data in Trails. The next step is securing that data:
Trust, Policy, and Identity -- Add Cedar policies for access control, DID identity for principals, cost envelopes for budget enforcement, and provenance for auditability.
For deep dives:
- ORM guide -- inheritance, async, compound constraints, dirty tracking
- KG guide -- IRI namespacing, multi-label, edge queries
- Ingestion guide -- PDF/HTML/Markdown extractors, chunking pipeline
- Vector guide -- embeddings, hybrid SPARQL+vector retrieval