Evidence · the audit trail

An audit trail for AI agents

An AI agent audit trail is a record of what an agent did that can be defended, not just read. Application logs tell you a request happened; an audit trail proves which agent invoked which capability, whether it was authorized, what the outcome was, and that the record has not been altered. CHP produces one automatically at the capability boundary — every governed action captured as a hash-chained, replayable evidence record.

By Capability Host Protocol · 2026-06-28

The distinction

A log records that something happened. An audit trail proves it.

When an AI agent does something consequential (moves money, denies a claim, dispatches a machine, changes a record) and someone later asks "what happened, and were you allowed to do it?", scattered application logs are not an answer. An audit trail is a different artifact, with different guarantees.

                 Application logs            AI agent audit trail
Built for        Debugging the system        Being defended to a third party
Identity         Often implicit / missing    Which agent + principal, explicit
Authorization    Not recorded                Allowed/denied + the policy applied
Integrity        Mutable, rotatable          Hash-chained, tamper-evident
Completeness     Best-effort                 Every governed action at the boundary

Logs and evidence are different jobs — keep both.

What goes in it

The fields that make a record defensible.

Identity

Which agent and which principal initiated the action.

Capability

The named, versioned capability that was invoked.

Authorization

The allow/deny decision and the policy it was evaluated against.

Outcome

What actually happened — including denial as a first-class, recorded result.

Correlation

An id that ties multi-step, multi-host work into one trace.

Integrity

A hash chain so any later tampering with the record is detectable.
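
To make the fields above concrete, here is a minimal sketch of a hash-chained evidence record. It is an illustration only, not CHP's actual record format: the names `EvidenceRecord`, `append`, `verify`, and `GENESIS`, the use of SHA-256 over canonical JSON, and the sample values are all assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


@dataclass(frozen=True)
class EvidenceRecord:  # hypothetical shape, not CHP's schema
    agent: str            # identity: which agent acted
    principal: str        # identity: on whose behalf
    capability: str       # named, versioned capability
    decision: str         # authorization: "allow" or "deny"
    policy: str           # policy the decision was evaluated against
    outcome: str          # what actually happened (denial included)
    correlation_id: str   # ties multi-step work into one trace
    prev_hash: str        # integrity: hash of the preceding record
    record_hash: str = "" # integrity: hash of this record's contents


def compute_hash(record):
    """SHA-256 over a canonical JSON form of every field except record_hash."""
    body = asdict(record)
    body.pop("record_hash")
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append(trail, **fields):
    """Link a new record to the current tail of the trail and seal it."""
    prev = trail[-1].record_hash if trail else GENESIS
    draft = EvidenceRecord(prev_hash=prev, **fields)
    sealed = replace(draft, record_hash=compute_hash(draft))
    trail.append(sealed)
    return sealed


def verify(trail):
    """Return True only if every hash and every link in the chain checks out."""
    prev = GENESIS
    for record in trail:
        if record.prev_hash != prev or record.record_hash != compute_hash(record):
            return False
        prev = record.record_hash
    return True


trail = []
append(trail, agent="billing-agent", principal="alice", capability="refund.issue@1",
       decision="allow", policy="refund-policy-v2", outcome="ok", correlation_id="case-42")
append(trail, agent="billing-agent", principal="alice", capability="refund.issue@1",
       decision="deny", policy="refund-policy-v2", outcome="denied", correlation_id="case-43")
print(verify(trail))  # True

# Editing an earlier record without recomputing the chain is detected.
trail[0] = replace(trail[0], outcome="tampered")
print(verify(trail))  # False
```

Because each record embeds the hash of its predecessor, changing any earlier record invalidates its own hash, and recomputing that hash breaks the link held by the next record.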

How CHP builds it

Captured at the capability boundary, not bolted on after.

CHP records evidence at the moment an action crosses from intent into effect — the capability boundary — so you get a complete trail from one integration instead of audit code scattered through your agent. The result is replayable: you can reconstruct exactly what the agent did and verify the record was not altered. This is, almost literally, chain of custody for agent actions.
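
One way to picture capture at the boundary is a wrapper around each capability that records the authorization decision and the outcome every time the capability is invoked. The sketch below is hypothetical and is not CHP's API: `governed`, `Denied`, the `is_allowed` callback, and the dictionary record shape are assumptions, and hash chaining is left out to keep the example short.

```python
import functools
import uuid


class Denied(Exception):
    """Raised when the policy check refuses an invocation."""


def governed(capability, policy_name, is_allowed, trail):
    """Wrap a function so every call is recorded at the boundary (illustrative)."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(agent, principal, *args, correlation_id=None, **kwargs):
            entry = {
                "agent": agent,
                "principal": principal,
                "capability": capability,
                "policy": policy_name,
                "correlation_id": correlation_id or str(uuid.uuid4()),
            }
            # A denial is recorded as a first-class result, then surfaced.
            if not is_allowed(agent, principal):
                trail.append({**entry, "decision": "deny", "outcome": "denied"})
                raise Denied(capability)
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                # A failed effect is still evidence of what was attempted.
                trail.append({**entry, "decision": "allow",
                              "outcome": f"error: {type(exc).__name__}"})
                raise
            trail.append({**entry, "decision": "allow", "outcome": "ok"})
            return result
        return wrapper
    return decorate


trail = []


@governed("refund.issue@1", "refund-policy-v2",
          lambda agent, principal: agent == "billing-agent", trail)
def issue_refund(order_id, amount):
    return f"refunded {amount} on {order_id}"


print(issue_refund("billing-agent", "alice", "order-9", 25))
try:
    issue_refund("rogue-agent", "alice", "order-9", 25)
except Denied:
    pass
print([(r["agent"], r["decision"]) for r in trail])
```

The agent code calls `issue_refund` as usual; the audit logic lives in one place, so allowed, denied, and failed invocations all leave a record without extra log statements in the agent itself.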

Questions

What teams ask about agent audit trails.

What is an AI agent audit trail?

A durable, ordered record of the consequential actions an AI agent took (which agent, which capability, under what authorization, with what inputs and outcome), captured in a form that can be independently verified later. The defining property is defensibility: it is built to answer "what happened?" and to prove the answer, not just to help you debug.

Why are application logs not an audit trail?

Logs are written by the application for the application: often unstructured, scattered across services, mutable, and easy to drop or rotate away. They can tell you an error occurred; they cannot prove that a specific agent was authorized to take a specific action, or that the record is complete and unaltered. Different job, different guarantees.

What has to be in the trail for it to hold up under scrutiny?

Identity (which agent/principal), the capability invoked, the authorization decision (allowed or denied, under which policy), the inputs and the outcome, a correlation id tying multi-step work together, and tamper-evidence (a hash chain) so any later alteration is detectable. CHP records exactly these as first-class fields.
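
As a small illustration of the correlation id, the sketch below groups records from different hosts into per-trace sequences. The dictionary fields, including the `seq` ordering key and the `host` name, are assumptions made for the example and not CHP's schema.

```python
from collections import defaultdict


def group_by_correlation(records):
    """Group evidence records into traces keyed by correlation id."""
    traces = defaultdict(list)
    for record in records:
        traces[record["correlation_id"]].append(record)
    # Order each trace by the (assumed) sequence number.
    return {cid: sorted(steps, key=lambda r: r["seq"])
            for cid, steps in traces.items()}


records = [
    {"seq": 3, "host": "payments", "capability": "refund.issue@1",
     "decision": "allow", "correlation_id": "case-42"},
    {"seq": 1, "host": "crm", "capability": "customer.lookup@2",
     "decision": "allow", "correlation_id": "case-42"},
    {"seq": 2, "host": "crm", "capability": "record.update@1",
     "decision": "deny", "correlation_id": "case-77"},
]

trace = group_by_correlation(records)["case-42"]
print([step["capability"] for step in trace])
# ['customer.lookup@2', 'refund.issue@1']
```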

Do I have to instrument every line of my agent code?

No. CHP captures evidence at the capability boundary — the moment an action crosses from intent into effect — so you record the actions that matter without scattering audit logic through your model or prompt code. One integration at the boundary, not a hundred log statements.

How is an audit trail different from observability/telemetry?

Telemetry (metrics, traces, and logs, for example collected through OpenTelemetry) is built to help you understand and operate a system. An audit trail is built to be defended to a third party: an auditor, a regulator, a counterparty. They compose: keep your telemetry for operations, and add an evidence layer for the actions you may have to prove.

See your agents' audit trail.

Discovery starts with the capabilities.txt standard; the audit trail is where CHP picks up. Capture every governed action as replayable evidence in one command.
