When an agent sends an email on your behalf, updates a CRM record, or moves money, three questions have to have answers: who authorized it, when, and on what basis. If they don't, you don't have an autonomous function — you have an unaccountable one. The audit trail is what makes agent actions reviewable after the fact, and in Harnyss it isn't a feature bolted onto the side. It's a property of how every action runs.
Auditability can't depend on diligence
The tempting way to build an audit trail is to have each part of the system log what it thinks is worth logging, into whatever store is convenient, in whatever shape fits. You end up with a trail that records the actions someone remembered to instrument and silently omits the rest. For a productivity tool, a gap in the log is an annoyance. For software that acts on your behalf, a gap is a decision nobody can account for — and you won't know it's missing until you go looking for it during the one incident that matters.
The trail has to be complete by construction, not by diligence. That means recording an action can't be a separate step a feature author opts into. It has to be inseparable from taking the action at all.
Every action is already a logged event
This is where building the platform end to end on the Model Context Protocol (MCP) pays off in a way that isn't obvious until you see the second-order effect. Every action an agent takes, whether calling a tool, writing to a connected system, or calling another agent, goes through the same protocol boundary. Operator actions go through that boundary too. Because every action has the same shape, the harness records every action the same way, once, at the boundary.
There is no code path that acts without being logged, because acting is making the call that gets logged. You can't forget to instrument a feature, because the feature doesn't have its own private way to touch the world.
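As a minimal sketch of what "recorded once, at the boundary" can look like, the Python below routes every call through one dispatch function that writes the event as part of making the call. The names `Boundary`, `register`, and `call`, and the `status` and `error` fields, are hypothetical illustrations; this is not the Harnyss API or an MCP SDK.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


class Boundary:
    """Hypothetical single entry point: tools are reachable only via call()."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self.events: List[Dict[str, Any]] = []  # stand-in for the audit store

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, agent: str, action: str, target: str,
             authorized_by: str, **params: Any) -> Any:
        event: Dict[str, Any] = {
            "agent": agent,
            "action": action,
            "target": target,
            "authorizedBy": authorized_by,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        try:
            result = self._tools[action](target, **params)
            event["status"] = "ok"  # assumed field, not in the source record
            return result
        except Exception as exc:
            event["status"] = "error"
            event["error"] = repr(exc)
            raise
        finally:
            # Runs on success and on failure, so no call escapes the log.
            self.events.append(event)


boundary = Boundary()
boundary.register("send_email",
                  lambda target, subject: f"sent {subject} to {target}")
boundary.call("customer-success-ops", "send_email", "renewals@acmecorp.com",
              authorized_by="approval:operator/bri", subject="Renewal")
```

The design point is that a feature author never writes a logging line. Registering a tool is the only way to make it callable, and calling it is what produces the event.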
What a record contains
Every event carries full provenance: the agent that acted, the action it took, the target, the timestamp, the cost in credits, the input it reasoned over, the output it produced, and the authority it acted under — a policy that pre-approved the action or a human who signed off live.
{
  "agent": "customer-success-ops",
  "action": "send_email",
  "target": "renewals@acmecorp.com",
  "authorizedBy": "approval:operator/bri",
  "timestamp": "2026-03-11T14:02:09Z",
  "cost": { "credits": 1.4 },
  "input": "ref:ctx-9f2a",
  "output": "ref:msg-77c1"
}
That single record answers who, when, and why for one action. Because inputs and outputs are referenced rather than discarded, you can also reconstruct the reasoning behind it — not just that an email went out, but the context the agent was working from when it decided to send one.
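To make the shape concrete, here is a sketch that parses the sample record into a typed structure and follows its `input` and `output` references. The `AuditEvent` class, the `resolve` helper, and the idea that a `ref:` value is a key into some payload store are assumptions for illustration; the source only shows the reference strings themselves.

```python
import json
from dataclasses import dataclass
from typing import Dict

RAW = """{
  "agent": "customer-success-ops",
  "action": "send_email",
  "target": "renewals@acmecorp.com",
  "authorizedBy": "approval:operator/bri",
  "timestamp": "2026-03-11T14:02:09Z",
  "cost": { "credits": 1.4 },
  "input": "ref:ctx-9f2a",
  "output": "ref:msg-77c1"
}"""


@dataclass(frozen=True)
class AuditEvent:
    agent: str
    action: str
    target: str
    authorized_by: str
    timestamp: str
    credits: float
    input_ref: str
    output_ref: str


def parse_event(raw: str) -> AuditEvent:
    d = json.loads(raw)
    return AuditEvent(
        agent=d["agent"],
        action=d["action"],
        target=d["target"],
        authorized_by=d["authorizedBy"],
        timestamp=d["timestamp"],
        credits=d["cost"]["credits"],
        input_ref=d["input"],
        output_ref=d["output"],
    )


def resolve(ref: str, store: Dict[str, str]) -> str:
    """Look up a 'ref:<key>' value in a hypothetical payload store."""
    prefix = "ref:"
    if not ref.startswith(prefix):
        raise ValueError(f"not a reference: {ref!r}")
    return store[ref[len(prefix):]]


event = parse_event(RAW)
# Placeholder payloads; real contents would come from wherever refs point.
store = {"ctx-9f2a": "<context placeholder>", "msg-77c1": "<message placeholder>"}
context = resolve(event.input_ref, store)
```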
Append-only, because history isn't editable
The log is append-only. No event is updated or deleted. If something was recorded wrong, or an action was reversed, the correction is a new event that references the original — the original stands. This is the difference between a record and a draft. You can replay exactly what the system knew and did at 2:14 last night, not a tidied-up version edited after you learned how it turned out. An audit trail that can be edited after the fact isn't an audit trail; it's a story.
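A small sketch of the append-only rule, assuming an in-memory list as the store: events are immutable, the log exposes no update or delete, and a correction is a new event that points back at the one it amends. `Event`, `AppendOnlyLog`, and the `corrects` field are hypothetical names, not taken from Harnyss.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    id: int
    action: str
    corrects: Optional[int] = None  # id of the event this one amends, if any


class AppendOnlyLog:
    """Deliberately has no update() or delete()."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, action: str, corrects: Optional[int] = None) -> Event:
        if corrects is not None and not 0 <= corrects < len(self._events):
            raise ValueError("a correction must reference an existing event")
        event = Event(id=len(self._events), action=action, corrects=corrects)
        self._events.append(event)
        return event

    def replay(self, up_to: int) -> Tuple[Event, ...]:
        """The history as it stood when event `up_to` was the latest one."""
        return tuple(self._events[: up_to + 1])


log = AppendOnlyLog()
sent = log.append("send_email")
undo = log.append("reverse:send_email", corrects=sent.id)
before_correction = log.replay(sent.id)  # still shows only the original send
```

Replaying up to the original event returns the same history whether or not a correction was appended later, which is the property the section describes.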
Reading the trail is part of operating
A complete record is only useful if you can interrogate it. Every event is queryable, exportable, and replayable. Who can read the audit log is itself governed — viewing the trail is a permission in the same role-based model that decides who can create workflows or approve agent actions, so the record of sensitive activity isn't visible to everyone with a login.
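One way to picture a governed read is a query function that checks a permission before it filters anything. The role names, the `audit:read` permission string, and `query_audit_log` are invented for this sketch; the source says only that viewing the trail is a permission in the same role-based model as creating workflows or approving actions.

```python
from typing import Any, Dict, List, Set

# Hypothetical roles and permission names, for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "auditor": {"audit:read"},
    "builder": {"workflow:create"},
    "approver": {"action:approve", "audit:read"},
}


def query_audit_log(role: str, events: List[Dict[str, Any]],
                    **filters: Any) -> List[Dict[str, Any]]:
    """Return events matching every filter, if the role may read the trail."""
    if "audit:read" not in ROLE_PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read the audit log")
    return [e for e in events
            if all(e.get(key) == value for key, value in filters.items())]


events = [
    {"agent": "customer-success-ops", "action": "send_email"},
    {"agent": "customer-success-ops", "action": "update_record"},
]
emails = query_audit_log("auditor", events, action="send_email")
```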
And because the audit trail is reachable through the same protocol as everything else, you don't have to read raw events to use it. An operator can ask the platform directly — "why did the renewal-followup workflow fail at 2:14 last night?" — and the operator-side agent walks the trail, finds the failed call, reads the error, and explains the cause in plain language. The log isn't a forensic artifact you exhume after an incident. It's a live surface you query while you operate.
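The "walk the trail, find the failed call" step can be sketched as a scan over events in time order. Everything here is assumed for illustration: the `workflow`, `status`, and `error` fields, the `find_failure` helper, and the sample events, which are made-up placeholders rather than real log data.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def parse_ts(ts: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_failure(events: List[Dict[str, Any]], workflow: str,
                 start: str, end: str) -> Optional[Dict[str, Any]]:
    """Earliest failed event for `workflow` between start and end, inclusive."""
    lo, hi = parse_ts(start), parse_ts(end)
    for event in sorted(events, key=lambda e: parse_ts(e["timestamp"])):
        when = parse_ts(event["timestamp"])
        if (event.get("workflow") == workflow
                and event.get("status") == "error"
                and lo <= when <= hi):
            return event
    return None


# Made-up sample events, not real log output.
sample = [
    {"workflow": "renewal-followup", "status": "ok",
     "timestamp": "2026-03-11T02:13:40Z"},
    {"workflow": "renewal-followup", "status": "error",
     "error": "<error placeholder>", "timestamp": "2026-03-11T02:14:05Z"},
]
failure = find_failure(sample, "renewal-followup",
                       "2026-03-11T02:14:00Z", "2026-03-11T02:15:00Z")
```

An operator-side agent would then read the returned event's error and explain it; that explanation step is outside what this sketch covers.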
What this is in service of
Audit logs are retained for 12 months on the Growth plan and 24 months on Scale and Enterprise, and the audit pipeline is part of what our in-progress SOC 2 Type II process examines (expected Q3 2026). But the retention window and the certification are downstream of the real point.
You cannot govern what you cannot review. The whole shift to a function that operates while a human directs it rests on the human being able to see, exactly and completely, what the function did: to confirm the autonomous decisions were the right ones and to catch the one that wasn't. The audit trail is what that governing job runs on. That's why it can't have gaps, and why we built it so it can't.
For where this sits in the runtime, see how it works. For the control surface that decides what needs a human in the first place, see approval flows.