Observability & Auditability

We make every AI action followable end-to-end and provable after the fact: one correlation id threading the whole chain, a database that is itself a queryable trace, and a tamper-evident audit log you can defend to a regulator.

Most AI systems are black boxes: an agent acts, something happens, and nobody can reconstruct why. We build the opposite. nuvio designs AI observability and LLM observability into the data model from day one — one correlation id threading the HTTP request, the job queue, every model call and every outbound provider call. The result is an AI audit trail you can query in plain SQL, per-action receipts showing model, tokens and cost, and a tamper-evident log that stands up to AI governance and AI compliance review.

One trace id threads the whole causal chain

The hard part of AI monitoring is correlation: a single user click can fan out into a queued job, several model calls and multiple third-party API calls, often on different threads minutes apart. We mint one trace id at the entry edge and propagate it to every downstream unit of work — the request, the job row, each agent run, each model span, each provider call. Child executions inherit the trace id but get their own request and span ids, so you keep both the whole chain and its tree structure. We honor incoming W3C trace context only from trusted infrastructure and generate fresh ids otherwise, so the lineage is real and not spoofable.
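A minimal sketch of minting and inheriting ids as described above, in Python. The names `TraceContext` and `start_trace` and the `trusted` flag are hypothetical, chosen only for illustration; how a caller is judged trusted is left out, and request ids are omitted for brevity. The header layout follows the W3C `traceparent` format (version, 32-hex trace id, 16-hex parent id, flags).

```python
import re
import secrets
from dataclasses import dataclass
from typing import Optional

# W3C traceparent: version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


@dataclass(frozen=True)
class TraceContext:
    trace_id: str                          # shared by the whole causal chain
    span_id: str                           # unique to this unit of work
    parent_span_id: Optional[str] = None   # link that preserves the tree shape

    def child(self) -> "TraceContext":
        """Same trace id, fresh span id, parent set to the current span."""
        return TraceContext(self.trace_id, secrets.token_hex(8), self.span_id)


def start_trace(traceparent: Optional[str], trusted: bool) -> TraceContext:
    """Honor an incoming traceparent only when the caller is trusted (assumption)."""
    if trusted and traceparent:
        m = _TRACEPARENT.fullmatch(traceparent.strip())
        if (
            m
            and m.group(1) != "ff"           # version ff is invalid per the spec
            and set(m.group(2)) != {"0"}     # an all-zero trace id is invalid
            and set(m.group(3)) != {"0"}     # an all-zero parent id is invalid
        ):
            return TraceContext(m.group(2), secrets.token_hex(8), m.group(3))
    # Untrusted, missing or malformed header: mint a fresh trace id.
    return TraceContext(secrets.token_hex(16), secrets.token_hex(8))
```

A job or agent run started from a request would call `ctx.child()`, so every descendant shares the trace id while its `parent_span_id` records where it sits in the tree.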

The database is a queryable trace

Live tracing UIs are useful but ephemeral. We make the database the durable source of truth for LLM observability: every chain table (jobs, agent runs, model calls, provider calls, request logs, error events and audit events) carries the trace id. Reconstructing any action is a single query: select the rows whose trace id equals the one you are investigating. No sampling, no dropped spans, no waiting for an external collector. Because the ids are deliberately shaped to the W3C trace-context standard, the same data can flow into OpenTelemetry spans later without remodeling: the tables stay the source of truth and a live tracing view becomes a layer on top rather than a rewrite.
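To make the "single query" concrete, here is a small sketch using Python's built-in `sqlite3` with an in-memory database. The table and column names (`jobs`, `model_calls`, `provider_calls`, `created_at`, `detail`) and the sample rows are assumptions for illustration, not the actual schema; only three of the chain tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs           (trace_id TEXT, created_at TEXT, kind TEXT);
CREATE TABLE model_calls    (trace_id TEXT, created_at TEXT, model TEXT);
CREATE TABLE provider_calls (trace_id TEXT, created_at TEXT, provider TEXT);
""")

T1 = "4bf92f3577b34da6a3ce929d0e0e4736"   # the trace under investigation
T2 = "0af7651916cd43dd8448eb211c80319c"   # unrelated traffic

conn.execute("INSERT INTO jobs VALUES (?, ?, ?)", (T1, "2024-05-01 10:00:00", "summarize"))
conn.execute("INSERT INTO model_calls VALUES (?, ?, ?)", (T1, "2024-05-01 10:00:02", "example-large"))
conn.execute("INSERT INTO provider_calls VALUES (?, ?, ?)", (T1, "2024-05-01 10:00:05", "example-crm"))
conn.execute("INSERT INTO jobs VALUES (?, ?, ?)", (T2, "2024-05-01 10:00:01", "classify"))

# One statement walks every chain table and returns a single ordered timeline.
TRACE_QUERY = """
SELECT created_at, 'job' AS source, kind AS detail FROM jobs WHERE trace_id = :t
UNION ALL
SELECT created_at, 'model_call', model FROM model_calls WHERE trace_id = :t
UNION ALL
SELECT created_at, 'provider_call', provider FROM provider_calls WHERE trace_id = :t
ORDER BY created_at
"""


def reconstruct(conn, trace_id):
    """Return (created_at, source, detail) rows for one trace, oldest first."""
    return conn.execute(TRACE_QUERY, {"t": trace_id}).fetchall()
```

Calling `reconstruct(conn, T1)` returns the job, the model call and the provider call in time order, and nothing from the unrelated trace.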

Receipts: every agent action shows its work

An agent that acts without explaining itself is impossible to trust or to govern. We attach a receipt to every agent action — the model used, input and output tokens, computed cost, and a human-readable reasoning summary — derived from the agent run and its child model calls. Each run also records whether it cleared a self-critique gate; runs that fail route to retry or human review rather than shipping silently. This turns AI monitoring from log-scraping into a first-class product surface: an operator, an auditor or you can open any decision and see exactly what the model did, what it cost, and why it chose what it chose.
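A receipt of this kind could be derived roughly as follows. This is a sketch under stated assumptions: the class names, the model names, the price table (USD per million tokens) and the `next_step` labels are all hypothetical, and the reasoning summary is simply passed in rather than generated.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Sequence, Tuple

# Hypothetical prices in USD per 1M tokens: (input, output).
PRICES = {
    "example-small": (Decimal("0.50"), Decimal("1.50")),
    "example-large": (Decimal("5.00"), Decimal("15.00")),
}


@dataclass(frozen=True)
class ModelCall:
    model: str
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class Receipt:
    models: Tuple[str, ...]
    input_tokens: int
    output_tokens: int
    cost_usd: Decimal
    reasoning_summary: str
    passed_self_critique: bool
    next_step: str


def build_receipt(calls: Sequence[ModelCall], reasoning_summary: str,
                  passed_self_critique: bool) -> Receipt:
    """Aggregate an agent run's child model calls into one receipt."""
    cost = Decimal("0")
    for call in calls:
        in_price, out_price = PRICES[call.model]
        cost += (in_price * call.input_tokens
                 + out_price * call.output_tokens) / Decimal(1_000_000)
    return Receipt(
        models=tuple(sorted({c.model for c in calls})),
        input_tokens=sum(c.input_tokens for c in calls),
        output_tokens=sum(c.output_tokens for c in calls),
        cost_usd=cost,
        reasoning_summary=reasoning_summary,
        passed_self_critique=passed_self_critique,
        # Failed runs are routed away from shipping, never released silently.
        next_step="ship" if passed_self_critique else "retry_or_human_review",
    )
```

`Decimal` is used instead of floats so that per-call costs add up exactly when they are later summed into daily totals.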

A tamper-evident audit trail for AI governance

For AI governance and AI compliance, knowing what happened is not enough — you have to prove the record was not altered. We implement the audit log as an append-only, hash-chained table: each row stores a hash of its own contents plus the previous row's hash, so any insertion, deletion or edit anywhere in the history breaks the chain and is detectable. It is the one table with no soft delete by design. Verification walks the chain end-to-end. Combined with the trace id on every audit event, you get an AI audit trail that ties each change back to the acting user or agent, the originating session, and the full causal chain that produced it.

Cost, tokens and retention you can actually run

Observability that bankrupts you on storage gets turned off, so we design for the write volume. High-write telemetry — model calls, provider calls, request logs, audit events — is time-partitioned with explicit retention windows tuned per table: short for request logs, longer for model and provider calls, longest for audit. Raw prompt and response bodies live in object storage behind a TTL; the database keeps references and truncated previews. A nightly rollup turns per-call rows into daily cost and token aggregates that power dashboards without scanning the raw tables. You get full-fidelity AI monitoring where it matters and a bounded, predictable bill.
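The nightly rollup might look something like the sketch below, again using in-memory `sqlite3`. Table names, column names and the sample rows are assumptions; costs are stored as integer micro-dollars to avoid float drift; partitioning and retention are database-specific and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model_calls (
    called_at TEXT, agent TEXT, model TEXT,
    input_tokens INTEGER, output_tokens INTEGER, cost_micro_usd INTEGER
);
CREATE TABLE daily_usage (
    day TEXT, agent TEXT, model TEXT,
    calls INTEGER, input_tokens INTEGER, output_tokens INTEGER, cost_micro_usd INTEGER,
    PRIMARY KEY (day, agent, model)
);
""")
conn.executemany(
    "INSERT INTO model_calls VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2024-05-01 09:00:00", "triage", "example-small", 1000, 200, 800),
        ("2024-05-01 17:30:00", "triage", "example-small", 500, 100, 400),
        ("2024-05-01 18:00:00", "writer", "example-large", 2000, 400, 16000),
        ("2024-05-02 08:00:00", "triage", "example-small", 100, 10, 65),
    ],
)


def rollup_day(conn, day):
    """Aggregate one day's per-call rows into daily_usage.

    INSERT OR REPLACE on the (day, agent, model) key makes re-runs idempotent.
    """
    conn.execute(
        """
        INSERT OR REPLACE INTO daily_usage
            (day, agent, model, calls, input_tokens, output_tokens, cost_micro_usd)
        SELECT date(called_at), agent, model, COUNT(*),
               SUM(input_tokens), SUM(output_tokens), SUM(cost_micro_usd)
        FROM model_calls
        WHERE date(called_at) = ?
        GROUP BY date(called_at), agent, model
        """,
        (day,),
    )
    conn.commit()
```

Dashboards then read `daily_usage`, which holds one row per day, agent and model, instead of scanning the raw per-call table.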

What this includes
  • One W3C-shaped trace id minted at the entry edge and propagated across requests, jobs, agent runs, model calls and provider calls.
  • Correlation carried through thread pools and virtual threads with a context-propagating executor — the easiest thing to get wrong, built as a first-class utility.
  • Structured JSON logs where every line carries trace id, request id, span id, user, workspace and session, so logs and database rows join cleanly.
  • Per-action receipts: model, input/output tokens, computed cost, reasoning summary, and a self-critique verdict.
  • Append-only, hash-chained audit log with previous-hash linkage and end-to-end chain verification.
  • Time-partitioned telemetry with per-table retention, object-storage offload for raw bodies, and nightly cost/token rollups.
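Two of the items above, the context-propagating executor and the structured JSON log line, can be sketched with Python's standard library. This is an analogue for illustration only: the class and variable names are hypothetical, only trace id and span id are shown (request id, user, workspace and session would follow the same pattern), and a thread pool stands in for whatever executor the real runtime uses.

```python
import contextvars
import json
import logging
from concurrent.futures import ThreadPoolExecutor

# Hypothetical correlation fields held in context variables.
trace_id_var = contextvars.ContextVar("trace_id", default=None)
span_id_var = contextvars.ContextVar("span_id", default=None)


class ContextPropagatingExecutor(ThreadPoolExecutor):
    """Thread pool that runs each task inside a copy of the submitter's context."""

    def submit(self, fn, /, *args, **kwargs):
        ctx = contextvars.copy_context()  # snapshot taken on the submitting thread
        return super().submit(ctx.run, fn, *args, **kwargs)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that carries the correlation ids."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": trace_id_var.get(),
            "span_id": span_id_var.get(),
        })


def current_trace_id():
    """Read the trace id visible to whichever thread calls this."""
    return trace_id_var.get()
```

Because the snapshot is taken at submit time, a long-lived worker thread picks up the correlation ids of each task it runs rather than whatever was set when the thread was created.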
What you get
  • Any AI action — operator-initiated or autonomous — is reconstructable end-to-end from a single id, in plain SQL.
  • A tamper-evident AI audit trail that ties every change to an actor, a session and a causal chain, ready for governance and compliance review.
  • Real cost visibility: per-call, per-agent and daily token and spend, so AI usage is monitored and budgeted, not guessed.
Where it fits

Use cases

Reconstruct an autonomous decision

A customer disputes an action an agent took on their account. Support quotes the reference id from the error toast; you run one query on the trace id and see the full chain: request, job, model calls, provider calls, and the receipt that explains the decision.

Pass a compliance audit

A regulator asks who changed what and whether the record can be trusted. You export the hash-chained audit log, walk the chain to prove no row was altered, and tie each event back to an actor and the model run that triggered it.

Find where the AI budget goes

Spend is climbing and nobody knows why. Daily token and cost rollups, sliced by agent and model, surface the expensive path; per-call telemetry shows which prompts and which retries are driving it, so you optimize the right thing.

FAQ

Common questions

Standard monitoring tracks requests and errors. LLM observability adds the things that make AI hard to trust: which model ran, how many tokens it used, what it cost, why it decided what it did, and whether it passed its own self-critique. We capture all of that as structured telemetry correlated by one trace id, so an AI action is as followable as an HTTP request.

The AI audit trail is an append-only table where each row stores a hash of its contents plus the previous row's hash, forming a chain. Altering, inserting or deleting any row breaks every hash after it, which a verification pass detects immediately. There is no soft delete on this table by design — it is the durable, defensible record for AI governance and AI compliance.

You do not need an external tracing tool to get end-to-end traces. We make the database the source of truth, so traces are queryable without an external collector or sampling. Because the ids follow the W3C trace-context standard, you can add an OpenTelemetry export later as a live tracing view with no remodeling, and the durable record and your AI monitoring stack stay independent of any one vendor.

Storage costs only run away if you keep everything forever. We time-partition high-write tables with per-table retention windows, push raw prompt and response bodies to object storage behind a TTL while keeping references and previews, and roll per-call rows up into daily aggregates. You get full AI observability where it matters and a bounded, predictable storage cost.

Building something that needs this?

Tell us what you're working on. The first call is always free.
