Observability & Auditability
We make every AI action followable end-to-end and provable after the fact: one correlation id threading the whole chain, a database that is itself a queryable trace, and a tamper-evident audit log you can defend to a regulator.
Most AI systems are black boxes: an agent acts, something happens, and nobody can reconstruct why. We build the opposite. nuvio designs AI and LLM observability into the data model from day one: one correlation id threads the HTTP request, the job queue, every model call and every outbound provider call. The result is an AI audit trail you can query in plain SQL, per-action receipts showing model, tokens and cost, and a tamper-evident log that stands up to AI governance and AI compliance review.
One trace id threads the whole causal chain
The hard part of AI monitoring is correlation: a single user click can fan out into a queued job, several model calls and multiple third-party API calls, often on different threads minutes apart. We mint one trace id at the entry edge and propagate it to every downstream unit of work — the request, the job row, each agent run, each model span, each provider call. Child executions inherit the trace id but get their own request and span ids, so you keep both the whole chain and its tree structure. We honor incoming W3C trace context only from trusted infrastructure and generate fresh ids otherwise, so the lineage is real and not spoofable.
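As an illustration only, here is a minimal Python sketch of minting and propagating ids in the W3C `traceparent` shape (`00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`). The names `TraceContext` and `start_trace` and the `trusted` flag are hypothetical, and request ids are left out for brevity. How a caller is judged trusted is assumed to be decided elsewhere.

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# W3C traceparent, version 00: lowercase hex fields separated by dashes.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class TraceContext:
    trace_id: str                         # shared by the whole causal chain
    span_id: str                          # unique per unit of work
    parent_span_id: Optional[str] = None  # gives the chain its tree structure

    def child(self) -> "TraceContext":
        """Same trace id, fresh span id, parent pointing at this span."""
        return TraceContext(self.trace_id, secrets.token_hex(8), self.span_id)

    def to_traceparent(self) -> str:
        """Header value to send on outbound calls (sampled flag set)."""
        return f"00-{self.trace_id}-{self.span_id}-01"


def start_trace(traceparent: Optional[str], trusted: bool) -> TraceContext:
    """Honor an incoming header only from a trusted caller; otherwise mint new ids."""
    if trusted and traceparent:
        match = _TRACEPARENT.match(traceparent)
        # All-zero trace or span ids are invalid in the W3C format.
        if match and set(match.group(1)) != {"0"} and set(match.group(2)) != {"0"}:
            return TraceContext(match.group(1), secrets.token_hex(8), match.group(2))
    return TraceContext(secrets.token_hex(16), secrets.token_hex(8))
```

A job, agent run or model call would then call `child()` on the context it received, so every row it writes carries the same `trace_id` with its own `span_id`.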
The database is a queryable trace
Live tracing UIs are useful but ephemeral. We make the database the durable source of truth for LLM observability: every chain table (jobs, agent runs, model calls, provider calls, request logs, error events and audit events) carries the trace id. Reconstructing any action is a single query that selects the rows whose trace id equals the one you are investigating. No sampling, no dropped spans, no waiting for an external collector. Because the ids are deliberately shaped to the W3C trace-context standard, the same data can flow into OpenTelemetry spans later without remodeling: the tables stay the source of truth, and a live tracing view becomes a layer on top rather than a rewrite.
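To make the "single query" idea concrete, here is a small sketch using Python's built-in `sqlite3` module. The table and column names (`jobs`, `model_calls`, `provider_calls`, `created_at`, and so on) are assumptions for illustration, not the actual schema, and only three of the chain tables are shown.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical chain tables; each one carries the trace id."""
    conn.executescript(
        """
        CREATE TABLE jobs (trace_id TEXT, created_at TEXT, kind TEXT);
        CREATE TABLE model_calls (trace_id TEXT, created_at TEXT, model TEXT);
        CREATE TABLE provider_calls (trace_id TEXT, created_at TEXT, provider TEXT);
        CREATE INDEX jobs_trace ON jobs (trace_id);
        CREATE INDEX model_calls_trace ON model_calls (trace_id);
        CREATE INDEX provider_calls_trace ON provider_calls (trace_id);
        """
    )


def reconstruct(conn: sqlite3.Connection, trace_id: str) -> list:
    """Return every recorded step for one trace id, oldest first."""
    sql = """
        SELECT created_at, 'job' AS step, kind AS detail
          FROM jobs WHERE trace_id = :t
        UNION ALL
        SELECT created_at, 'model_call', model
          FROM model_calls WHERE trace_id = :t
        UNION ALL
        SELECT created_at, 'provider_call', provider
          FROM provider_calls WHERE trace_id = :t
        ORDER BY created_at
    """
    return conn.execute(sql, {"t": trace_id}).fetchall()
```

Because every table is filtered on the same indexed column, the lookup stays cheap even when the tables are large, and adding another chain table means adding one more `UNION ALL` branch.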
Receipts: every agent action shows its work
An agent that acts without explaining itself is impossible to trust or to govern. We attach a receipt to every agent action — the model used, input and output tokens, computed cost, and a human-readable reasoning summary — derived from the agent run and its child model calls. Each run also records whether it cleared a self-critique gate; runs that fail route to retry or human review rather than shipping silently. This turns AI monitoring from log-scraping into a first-class product surface: an operator, an auditor or you can open any decision and see exactly what the model did, what it cost, and why it chose what it chose.
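A receipt of this kind can be derived by folding an agent run's child model calls into one record. The sketch below is a minimal illustration under stated assumptions: the `ModelCall` and `Receipt` types, the `next_step` labels and the price table are all hypothetical, and the prices are placeholder numbers rather than real rates.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Sequence, Tuple

# Hypothetical price table: USD per one million (input, output) tokens.
PRICES_PER_MILLION = {"example-model": (Decimal("3.00"), Decimal("15.00"))}


@dataclass(frozen=True)
class ModelCall:
    model: str
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class Receipt:
    models: Tuple[str, ...]
    input_tokens: int
    output_tokens: int
    cost_usd: Decimal
    reasoning_summary: str
    passed_self_critique: bool

    @property
    def next_step(self) -> str:
        # Runs that fail the self-critique gate never ship silently.
        return "ship" if self.passed_self_critique else "retry_or_review"


def build_receipt(calls: Sequence[ModelCall], reasoning_summary: str,
                  passed_self_critique: bool) -> Receipt:
    """Aggregate the child model calls of one agent run into a receipt."""
    cost = Decimal("0")
    for call in calls:
        price_in, price_out = PRICES_PER_MILLION[call.model]
        cost += (price_in * call.input_tokens
                 + price_out * call.output_tokens) / Decimal(1_000_000)
    return Receipt(
        models=tuple(sorted({call.model for call in calls})),
        input_tokens=sum(call.input_tokens for call in calls),
        output_tokens=sum(call.output_tokens for call in calls),
        cost_usd=cost,
        reasoning_summary=reasoning_summary,
        passed_self_critique=passed_self_critique,
    )
```

Using `Decimal` instead of floats keeps per-call costs exact when they are later summed into daily totals.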
A tamper-evident audit trail for AI governance
For AI governance and AI compliance, knowing what happened is not enough — you have to prove the record was not altered. We implement the audit log as an append-only, hash-chained table: each row stores a hash of its own contents plus the previous row's hash, so any insertion, deletion or edit anywhere in the history breaks the chain and is detectable. It is the one table with no soft delete by design. Verification walks the chain end-to-end. Combined with the trace id on every audit event, you get an AI audit trail that ties each change back to the acting user or agent, the originating session, and the full causal chain that produced it.
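The hash-chaining idea can be sketched in a few lines of Python. This is an illustration, not the production design: SHA-256, the all-zero genesis value, the JSON canonicalization and the in-memory list standing in for the table are all assumptions.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed previous-hash value for the first row


def row_hash(prev_hash: str, payload: dict) -> str:
    """Hash the row's contents together with the previous row's hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Append-only write: link the new row to the current tail."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev_hash": prev,
                "hash": row_hash(prev, payload)})


def verify(log: list) -> Optional[int]:
    """Walk the chain; return the index of the first broken row, or None."""
    prev = GENESIS
    for index, row in enumerate(log):
        if row["prev_hash"] != prev or row["hash"] != row_hash(prev, row["payload"]):
            return index
        prev = row["hash"]
    return None
```

Editing, inserting or removing a row in the middle changes what the next row's `prev_hash` should be, so `verify` reports the first mismatch. In this simplified sketch, truncating the newest rows is only detectable if the latest hash is also recorded somewhere outside the table, which the sketch does not cover.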
Cost, tokens and retention you can actually run
Observability that bankrupts you on storage gets turned off, so we design for the write volume. High-write telemetry — model calls, provider calls, request logs, audit events — is time-partitioned with explicit retention windows tuned per table: short for request logs, longer for model and provider calls, longest for audit. Raw prompt and response bodies live in object storage behind a TTL; the database keeps references and truncated previews. A nightly rollup turns per-call rows into daily cost and token aggregates that power dashboards without scanning the raw tables. You get full-fidelity AI monitoring where it matters and a bounded, predictable bill.
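A nightly rollup of this kind can be sketched with Python's built-in `sqlite3` module. The schema below is hypothetical (`model_calls`, `daily_usage`, and cost stored as integer micro-dollars in `cost_micros`); partitioning and retention are left to the real database and are not shown.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical raw and rollup tables."""
    conn.executescript(
        """
        CREATE TABLE model_calls (
            created_at TEXT, agent TEXT, model TEXT,
            input_tokens INTEGER, output_tokens INTEGER, cost_micros INTEGER
        );
        CREATE TABLE daily_usage (
            day TEXT, agent TEXT, model TEXT, calls INTEGER,
            input_tokens INTEGER, output_tokens INTEGER, cost_micros INTEGER
        );
        """
    )


def rollup_day(conn: sqlite3.Connection, day: str) -> None:
    """Recompute one day's aggregates; safe to rerun for the same day."""
    with conn:  # one transaction: delete and insert commit together
        conn.execute("DELETE FROM daily_usage WHERE day = ?", (day,))
        conn.execute(
            """
            INSERT INTO daily_usage
            SELECT date(created_at), agent, model, COUNT(*),
                   SUM(input_tokens), SUM(output_tokens), SUM(cost_micros)
              FROM model_calls
             WHERE date(created_at) = ?
             GROUP BY date(created_at), agent, model
            """,
            (day,),
        )
```

Deleting the day's rows before reinserting makes the job idempotent, so a retried or late run does not double-count, and dashboards read only `daily_usage` rather than the raw per-call table.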
- One W3C-shaped trace id minted at the entry edge and propagated across requests, jobs, agent runs, model calls and provider calls.
- Correlation carried through thread pools and virtual threads with a context-propagating executor — the easiest thing to get wrong, built as a first-class utility.
- Structured JSON logs where every line carries trace id, request id, span id, user, workspace and session, so logs and database rows join cleanly.
- Per-action receipts: model, input/output tokens, computed cost, reasoning summary, and a self-critique verdict.
- Append-only, hash-chained audit log with previous-hash linkage and end-to-end chain verification.
- Time-partitioned telemetry with per-table retention, object-storage offload for raw bodies, and nightly cost/token rollups.
- Any AI action — operator-initiated or autonomous — is reconstructable end-to-end from a single id, in plain SQL.
- A tamper-evident AI audit trail that ties every change to an actor, a session and a causal chain, ready for governance and compliance review.
- Real cost visibility: per-call, per-agent and daily token and spend, so AI usage is monitored and budgeted, not guessed.
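The context-propagating executor mentioned in the list above can be illustrated in Python with `contextvars` and `ThreadPoolExecutor`. This is a sketch of the general technique, not the utility itself: the page does not name a runtime, and `ContextPropagatingExecutor`, `trace_id_var` and `log_line` are hypothetical names.

```python
import contextvars
import json
from concurrent.futures import ThreadPoolExecutor

# Holds the current trace id for whatever unit of work is running.
trace_id_var = contextvars.ContextVar("trace_id", default=None)


class ContextPropagatingExecutor(ThreadPoolExecutor):
    """Thread pool that runs each task in a copy of the submitter's context."""

    def submit(self, fn, /, *args, **kwargs):
        # Snapshot the caller's context at submit time, not at run time,
        # so a reused worker thread never leaks another task's trace id.
        ctx = contextvars.copy_context()
        return super().submit(ctx.run, fn, *args, **kwargs)


def log_line(message: str) -> str:
    """Structured JSON log line that always carries the current trace id."""
    return json.dumps({"message": message, "trace_id": trace_id_var.get()})
```

A task submitted after `trace_id_var.set(...)` then logs with that trace id even though it runs on a pooled worker thread, which is what lets log lines and database rows join on the same id.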
Use cases
A customer disputes an action an agent took on their account. Support quotes the reference id from the error toast; you run one query on the trace id and see the full chain: request, job, model calls, provider calls, and the receipt that explains the agent's decision.
A regulator asks who changed what and whether the record can be trusted. You export the hash-chained audit log, walk the chain to prove no row was altered, and tie each event back to an actor and the model run that triggered it.
Spend is climbing and nobody knows why. Daily token and cost rollups, sliced by agent and model, surface the expensive path; per-call telemetry shows which prompts and which retries are driving it, so you optimize the right thing.