Working with Data

Overview

This chapter covers everything you need to create, query, update, and delete data in a Trails application. You will learn the ORM surface (the high-level, Django-style API), the label-first KG surface (for flexible, untyped data), and the SPARQL escape hatch (for when you need full query power).

Learning Objectives

After this chapter you will be able to:

  • Create, read, update, and delete nodes using @app.model (GaC) or @node_type and the ORM
  • Build queries with filters, ordering, limits, and aggregates
  • Use property paths to traverse relationships
  • Compose complex filters with Q objects
  • Project results with .values() and .values_list()
  • Drop to raw SPARQL when the ORM is not enough
  • Use label-first ctx.kg methods for untyped data

Creating and Querying Nodes

Creating nodes

Define a node type and create instances:

from typing import Annotated
from trails import App, capability
from trails.gac import required, optional, min_length, min_value

app = App("myapp")

@app.model
class Article:
    title:    Annotated[str, required(), min_length(1)]
    body:     Annotated[str, required()]
    priority: Annotated[int, optional(), min_value(0)]
    tags:     list[str]

@capability
def create_article(ctx, title: str, body: str, priority: int = 0) -> dict:
    article = Article(title=title, body=body, priority=priority, tags=["draft"])
    ctx.kg.add(article)
    return {"id": article.id, "title": article.title}

The Annotated[] constraints are validated on every write. The explicit @node_type("Name", fields={...}) form still works and is documented in the ORM guide.

Article(...) validates field types at construction time. Unknown kwargs raise an error. Missing scalar fields raise an error. List fields default to []. The id attribute is a UUIDv7 IRI, available immediately after construction.
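To make the construction-time checks concrete, here is a minimal, library-free sketch of how this kind of validation could work. It is not how Trails implements it: the `validated` decorator, the `MinLength` marker, and the `Note` class are hypothetical stand-ins built on the standard `typing` module.

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class MinLength:
    """Hypothetical constraint marker, standing in for min_length()."""

    def __init__(self, n: int) -> None:
        self.n = n


def validated(cls):
    """Class decorator: generate an __init__ that checks kwargs against annotations."""
    hints = get_type_hints(cls, include_extras=True)

    def __init__(self, **kwargs):
        # Unknown kwargs raise an error.
        unknown = set(kwargs) - set(hints)
        if unknown:
            raise TypeError(f"unknown fields: {sorted(unknown)}")
        for name, hint in hints.items():
            constraints = []
            if get_origin(hint) is Annotated:
                hint, *constraints = get_args(hint)
            base = get_origin(hint) or hint  # list[str] -> list
            if name not in kwargs:
                if base is list:
                    kwargs[name] = []  # list fields default to []
                else:
                    raise TypeError(f"missing field: {name}")
            value = kwargs[name]
            if not isinstance(value, base):
                raise TypeError(f"{name} must be {base.__name__}")
            for constraint in constraints:
                if isinstance(constraint, MinLength) and len(value) < constraint.n:
                    raise ValueError(f"{name} is shorter than {constraint.n}")
            setattr(self, name, value)

    cls.__init__ = __init__
    return cls


@validated
class Note:
    title: Annotated[str, MinLength(1)]
    tags: list[str]


note = Note(title="Hello")  # tags falls back to []
```

In this sketch, `Note(title="x", colour="red")` and `Note(tags=[])` raise `TypeError`, and `Note(title="")` raises `ValueError`.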

ctx.kg.add(article) persists the instance. The write goes through the kernel store, which emits provenance and applies any SHACL constraints.

Querying nodes

Fetch all instances:

@capability
def list_articles(ctx) -> list:
    articles = Article.where().fetch(ctx)
    return [{"id": a.id, "title": a.title} for a in articles]

Fetch a single instance by IRI:

@capability
def get_article(ctx, iri: str) -> dict:
    article = Article.find(ctx, iri)
    return {"id": article.id, "title": article.title, "body": article.body}

Updating nodes

Modify an instance and save:

@capability
def update_title(ctx, iri: str, new_title: str) -> dict:
    article = Article.find(ctx, iri)
    article.title = new_title
    article.save(ctx)
    return {"id": article.id, "title": article.title}

save(ctx) is an upsert with dirty tracking -- only changed fields are written. If the node does not exist, it is created.
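Dirty tracking is easier to picture with a small stand-alone model. The sketch below is an assumption about the general technique, not the Trails implementation: `TrackedNode` and `pending_writes` are hypothetical names, and the create-if-missing half of the upsert is left out.

```python
class TrackedNode:
    """Hypothetical base class that records which fields changed since the last save."""

    def __init__(self, **fields):
        # Bypass __setattr__ while setting up the initial state.
        object.__setattr__(self, "_dirty", set())
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Mark the field dirty only when the value actually changes.
        if getattr(self, name, object()) != value:
            self._dirty.add(name)
        object.__setattr__(self, name, value)

    def pending_writes(self) -> dict:
        """Return the fields a save would write, then reset the dirty set."""
        changes = {name: getattr(self, name) for name in sorted(self._dirty)}
        self._dirty.clear()
        return changes


node = TrackedNode(title="Draft", body="Text")
node.title = "Final"
node.body = "Text"  # same value as before, so not marked dirty
changes = node.pending_writes()
```

Here `changes` contains only the `title` field, which is the set of writes a save with dirty tracking would need to issue.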

Deleting nodes

@capability
def delete_article(ctx, iri: str) -> dict:
    Article.delete(ctx, iri)
    return {"deleted": iri}

Bulk operations

Create many nodes at once (much faster than individual ctx.kg.add calls):

@capability
def import_articles(ctx, items: list) -> dict:
    articles = [Article(title=i["title"], body=i["body"], priority=0) for i in items]
    Article.bulk_create(ctx, articles)
    return {"imported": len(articles)}

Count without fetching:

@capability
def count_articles(ctx) -> dict:
    return {"count": Article.count(ctx)}

Delete all instances:

Article.delete_all(ctx)

The Query Builder

Model.where(...) returns a QueryBuilder that compiles to SPARQL. Chain methods to build up your query before calling .fetch(ctx) to execute it.
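The chain-then-execute pattern can be sketched without Trails at all. The class below is a hypothetical illustration, not the real QueryBuilder: `SketchQuery`, its `limit` method, and the generated SPARQL layout are assumptions. The property IRI prefix mirrors the one used in the raw SPARQL examples later in this chapter, and values are rendered with `json.dumps`, which yields valid SPARQL literals for plain strings and numbers.

```python
import json
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchQuery:
    """Hypothetical builder: each method returns a new query; nothing runs until compiled."""

    type_iri: str
    filters: tuple = ()
    max_rows: Optional[int] = None

    def where(self, **equals):
        # Return a copy with the extra exact-match filters appended.
        return replace(self, filters=self.filters + tuple(equals.items()))

    def limit(self, n: int):
        return replace(self, max_rows=n)

    def to_sparql(self) -> str:
        lines = ["SELECT ?s WHERE {", f"  ?s a <{self.type_iri}> ."]
        for field, value in self.filters:
            lines.append(f"  ?s <trails://local/prop/{field}> {json.dumps(value)} .")
        lines.append("}")
        if self.max_rows is not None:
            lines.append(f"LIMIT {self.max_rows}")
        return "\n".join(lines)


base = SketchQuery("trails://local/Article")
query = base.where(title="Hello").limit(5)
sparql = query.to_sparql()
```

Because every method returns a new object, `base` stays reusable as a starting point for other queries.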

Basic filters

# Exact match
Article.where(title="Hello World").fetch(ctx)

# Numeric comparisons
Article.where(priority__gte=3).fetch(ctx)       # priority >= 3
Article.where(priority__lt=10).fetch(ctx)        # priority < 10

# String matching
Article.where(title__contains="blog").fetch(ctx)         # case-sensitive
Article.where(title__icontains="blog").fetch(ctx)        # case-insensitive
Article.where(title__startswith="How to").fetch(ctx)

# Set membership
Article.where(priority__in=[1, 2, 3]).fetch(ctx)

Filter reference

Suffix          Operator                      Example
(none)          =                             title="Hello"
__gte           >=                            priority__gte=3
__gt            >                             priority__gt=3
__lte           <=                            priority__lte=10
__lt            <                             priority__lt=10
__in            IN (...)                      status__in=["draft", "published"]
__contains      CONTAINS                      title__contains="blog"
__icontains     case-insensitive CONTAINS     title__icontains="blog"
__startswith    STRSTARTS                     title__startswith="How"
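One way to read the reference table is as a lookup from suffix to SPARQL expression. The sketch below illustrates that idea only. `compile_filter` and `TEMPLATES` are hypothetical names, and rendering `__icontains` through `LCASE` is an assumption about how a case-insensitive CONTAINS could be expressed.

```python
import json

# Suffix -> SPARQL expression template, mirroring the reference table.
TEMPLATES = {
    "": "{var} = {val}",
    "gte": "{var} >= {val}",
    "gt": "{var} > {val}",
    "lte": "{var} <= {val}",
    "lt": "{var} < {val}",
    "in": "{var} IN ({val})",
    "contains": "CONTAINS({var}, {val})",
    "icontains": "CONTAINS(LCASE({var}), LCASE({val}))",
    "startswith": "STRSTARTS({var}, {val})",
}


def compile_filter(key: str, value) -> str:
    """Turn one keyword filter such as priority__gte=3 into a FILTER clause."""
    field, sep, suffix = key.rpartition("__")
    if not sep or suffix not in TEMPLATES:
        # No recognised suffix: treat the whole key as an exact match.
        field, suffix = key, ""
    if suffix == "in":
        val = ", ".join(json.dumps(item) for item in value)
    else:
        val = json.dumps(value)
    return "FILTER(" + TEMPLATES[suffix].format(var=f"?{field}", val=val) + ")"


clause = compile_filter("priority__gte", 3)
```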

Property Paths

Double-underscore chains traverse reference fields in one SPARQL query. No N+1 problem, no manual joins.

# Referenced types are defined before the types that point to them.
@node_type("Org", fields={"name": str})
class Org: ...

@node_type("Author", fields={"name": str, "org": Org})
class Author: ...

@node_type("Article", fields={"title": str, "author": Author})
class Article: ...

# 1-hop: articles by author name
Article.where(author__name="Alice").fetch(ctx)

# 2-hop: articles by authors in a specific org
Article.where(author__org__name="Acme").fetch(ctx)

Each __ hop adds one triple pattern to the SPARQL basic graph pattern (BGP). The whole path resolves server-side in a single query, with no extra round trips.
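To see how a double-underscore chain could map onto triple patterns, consider this small sketch. It is an illustration under assumptions, not the ORM's compiler: `path_to_patterns` is a hypothetical helper, the variable names are arbitrary, and the property IRI prefix is borrowed from the raw SPARQL examples in this chapter.

```python
def path_to_patterns(path: str, prop_base: str = "trails://local/prop/") -> list[str]:
    """Turn 'author__org__name' into a chain of triple patterns, one per segment."""
    patterns = []
    subject = "?s"
    hops = path.split("__")
    for depth, hop in enumerate(hops):
        # The last segment binds the value being filtered; earlier ones bind linking nodes.
        obj = "?value" if depth == len(hops) - 1 else f"?hop{depth}"
        patterns.append(f"{subject} <{prop_base}{hop}> {obj} .")
        subject = obj
    return patterns


patterns = path_to_patterns("author__org__name")
```

The three patterns share variables (`?hop0`, `?hop1`), so the store can join them inside one query instead of the application issuing a query per hop.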


Q Combinators

For queries that go beyond simple AND filters, use Q objects. They compose with | (OR), & (AND), and ~ (NOT):

from trails.orm import Q

# OR: title or body contains "graph"
Article.where(
    Q(title__icontains="graph") | Q(body__icontains="graph")
).fetch(ctx)

# NOT + AND: not archived, priority >= 3
Article.where(
    ~Q(status="archived") & Q(priority__gte=3)
).fetch(ctx)

# Complex composition
Article.where(
    (Q(priority__gte=5) | Q(tags__contains="urgent"))
    & ~Q(status="archived")
).fetch(ctx)

Q objects generate SPARQL FILTER expressions with correct parenthesization. Nest them as deeply as you need.
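The operator overloading behind this style of composition is compact enough to sketch. `MiniQ` below is a hypothetical stand-in, not the Trails Q class. It handles only equality and assumes each field maps to a SPARQL variable of the same name.

```python
import json


class MiniQ:
    """Hypothetical stand-in for Q: builds a parenthesized SPARQL boolean expression."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    @classmethod
    def eq(cls, field: str, value) -> "MiniQ":
        return cls(f"?{field} = {json.dumps(value)}")

    def __or__(self, other: "MiniQ") -> "MiniQ":
        return MiniQ(f"({self.expr} || {other.expr})")

    def __and__(self, other: "MiniQ") -> "MiniQ":
        return MiniQ(f"({self.expr} && {other.expr})")

    def __invert__(self) -> "MiniQ":
        return MiniQ(f"!({self.expr})")

    def to_filter(self) -> str:
        return f"FILTER({self.expr})"


combined = (MiniQ.eq("priority", 5) | MiniQ.eq("status", "review")) & ~MiniQ.eq("status", "archived")
clause = combined.to_filter()
```

Every binary operator wraps its result in parentheses, so the generated expression keeps the grouping of the Python expression no matter how deeply it is nested.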


Projections and Aggregations

.values() -- dict projection

Return plain dicts instead of hydrated model instances. Useful when you only need a few fields:

@capability
def article_titles(ctx) -> list:
    return Article.where().values("title", "priority").fetch(ctx)
    # [{"id": "trails://...", "title": "Hello", "priority": 3}, ...]

.values_list() -- tuple projection

Return tuples of the requested fields. With a single field and flat=True, you get a flat list of values instead:

titles = Article.where().values_list("title", flat=True).fetch(ctx)
# ["Hello", "World", ...]
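The difference between the projection shapes is easiest to see on plain data. The helpers below are a sketch on in-memory dicts, not Trails code. They assume Django-like semantics, where the dict projection keeps the id and the tuple projection contains only the requested fields. The sample rows and their ids are made up for illustration.

```python
def as_values(rows: list, *fields: str) -> list:
    """Dict projection: keep the id plus the requested fields."""
    return [{"id": row["id"], **{f: row[f] for f in fields}} for row in rows]


def as_values_list(rows: list, *fields: str, flat: bool = False) -> list:
    """Tuple projection, or a flat list of values when flat=True with one field."""
    if flat:
        if len(fields) != 1:
            raise ValueError("flat=True needs exactly one field")
        return [row[fields[0]] for row in rows]
    return [tuple(row[f] for f in fields) for row in rows]


# Hypothetical sample rows standing in for fetched articles.
rows = [
    {"id": "node-1", "title": "Hello", "priority": 3},
    {"id": "node-2", "title": "World", "priority": 1},
]
pairs = as_values_list(rows, "title", "priority")
titles = as_values_list(rows, "title", flat=True)
```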

.exists() -- existence check

has_articles = Article.where(priority__gte=5).exists(ctx)
# True or False -- stops at the first match

.distinct() -- unique results

unique_authors = Article.where().values("author").distinct().fetch(ctx)

Annotations (aggregates)

from trails.orm import Count, Sum, Avg, Min, Max

# Count matching articles
Article.where().annotate(total=Count("id")).fetch(ctx)

# Aggregate statistics
Article.where().annotate(
    avg_priority=Avg("priority"),
    max_priority=Max("priority"),
    total=Count("id"),
).fetch(ctx)

The SPARQL Escape Hatch

The ORM covers the common cases. For everything else, drop to raw SPARQL through ctx.kg.

Read queries

@capability
def custom_query(ctx) -> list:
    rows = ctx.kg.query("""
        SELECT ?title (COUNT(?tag) AS ?tag_count)
        WHERE {
            ?article a <trails://local/Article> ;
                     <trails://local/prop/title> ?title ;
                     <trails://local/prop/tags> ?tag .
        }
        GROUP BY ?title
        ORDER BY DESC(?tag_count)
        LIMIT 10
    """)
    return [{"title": r["title"], "tags": r["tag_count"]} for r in rows]

Write queries

@capability
def custom_update(ctx) -> dict:
    ctx.kg.update("""
        DELETE { ?article <trails://local/prop/status> ?old }
        INSERT { ?article <trails://local/prop/status> "published" }
        WHERE {
            ?article a <trails://local/Article> ;
                     <trails://local/prop/status> ?old .
            FILTER(?old = "draft")
        }
    """)
    return {"updated": True}

ASK queries

exists = ctx.kg.query("ASK { ?s a <trails://local/Article> }")
# [{"_boolean": True}] or [{"_boolean": False}]
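Since an ASK query comes back as a one-row list rather than a bare boolean, a small unwrapping helper can keep call sites tidy. `ask_result` is a hypothetical convenience function, not part of ctx.kg. It relies only on the result shape shown above.

```python
def ask_result(rows: list) -> bool:
    """Unwrap the [{"_boolean": ...}] shape of an ASK result into a plain bool."""
    if len(rows) != 1 or "_boolean" not in rows[0]:
        raise ValueError("not an ASK result")
    return bool(rows[0]["_boolean"])


found = ask_result([{"_boolean": True}])
```

In a capability, this would wrap the query call, for example `ask_result(ctx.kg.query("ASK { ... }"))`.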

Label-first KG

Not everything needs a @node_type. For exploratory data, event logs, or data from external sources with unpredictable schemas, use the label-first surface directly.

Creating nodes

@capability
def bookmark(ctx, url: str, tags: list) -> dict:
    iri = ctx.kg.node(
        labels=["Bookmark"],
        properties={"url": url, "tags": tags, "saved_at": "2025-01-15"},
    )
    return {"id": iri}

ctx.kg.node() creates a node with labels and properties, no type declaration needed. Labels become rdf:type triples; properties become predicate-object triples.
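The label-and-property expansion can be sketched as a pure function. This is an illustration, not the ctx.kg implementation: `node_to_triples` is a hypothetical helper, the IRI layout is assumed to match the one in the raw SPARQL examples, and emitting one triple per list item is also an assumption.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def node_to_triples(iri: str, labels: list, properties: dict,
                    base: str = "trails://local/") -> list:
    """Expand a labelled node into (subject, predicate, object) triples."""
    # Each label becomes an rdf:type triple.
    triples = [(iri, RDF_TYPE, f"{base}{label}") for label in labels]
    # Each property becomes a predicate-object triple; list values fan out.
    for key, value in properties.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append((iri, f"{base}prop/{key}", item))
    return triples


triples = node_to_triples(
    "trails://local/example-node",  # hypothetical IRI for illustration
    labels=["Bookmark"],
    properties={"url": "https://example.com", "tags": ["python", "rdf"]},
)
```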

Creating edges

@capability
def tag_bookmark(ctx, bookmark_iri: str, tag_iri: str) -> dict:
    ctx.kg.edge(subject=bookmark_iri, label="tagged_with", object=tag_iri)
    return {"linked": True}

Querying

@capability
def find_bookmarks(ctx, url: str) -> list:
    return ctx.kg.match(
        labels=["Bookmark"],
        where={"url": url},
    )

Traversing

@capability
def bookmark_tags(ctx, bookmark_iri: str) -> list:
    return ctx.kg.traverse(subject=bookmark_iri, label="tagged_with")

When to use which surface

Use @node_type + ORM when...               Use ctx.kg label-first when...
Data has a stable shape                    Schema is unknown or evolving
You want validation on writes              You want maximum flexibility
You need filter suffixes (__gte, __in)     You query by label/property match
You want hydrated Python objects           Plain dicts are fine
Domain entities (Patient, Invoice)         Event logs, scrapes, observations

Both surfaces coexist on the same ctx.kg handle. You can use @node_type for your domain entities and ctx.kg.node for everything else, in the same capability.


What's Next

You can now create, query, and manage data in Trails. The next step is securing that data:

Trust, Policy, and Identity -- Add Cedar policies for access control, DID identity for principals, cost envelopes for budget enforcement, and provenance for auditability.

For deep dives:

  • ORM guide -- inheritance, async, compound constraints, dirty tracking
  • KG guide -- IRI namespacing, multi-label, edge queries
  • Ingestion guide -- PDF/HTML/Markdown extractors, chunking pipeline
  • Vector guide -- embeddings, hybrid SPARQL+vector retrieval