nwhitehouse gives every agent turn a paper trail

Mike's agent now keeps a structured record of what it did on each turn - built so the log never becomes a second pile of sensitive data.

compliance, analytics

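Until now, the only way to see how the agent worked through a request was to dig through raw server logs, fed by more than 60 console.log calls. nwhitehouse replaces that with a proper record: each turn writes structured entries marking when it started, which tools it called, how long each tool batch took, and the total steps and latency when the turn finished. Event types for the model's first token and for failed tool calls are also defined, but the commit reserves failure events for a later change (feat-016); until then, every tool call is recorded as succeeded.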

The deliberate choice is what it leaves out. The record holds only metadata - tool names, argument keys, timings, result lengths, step counts - and never the user's prompts, the contents of documents, or the bodies of tool results, which already live with the chat itself. That keeps the trail cheap to query and avoids duplicating sensitive material into a new place it could leak from. Access to the trail follows the same rules as access to the chat it belongs to.
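
As a rough illustration of the metadata-only rule, the TypeScript sketch below builds tool-call payloads that carry argument keys and result sizes but no values. The `args_keys` and `result_length` fields are named in the commit body; the helper names, the `batch_latency_ms` field name, and the sample tool `search_docs` are assumptions made for this example and are not taken from the repository.

```typescript
// Hypothetical helper: describes a tool call without copying argument values.
function toolCallStartedPayload(
  name: string,
  args: Record<string, unknown>,
): { name: string; args_keys: string[] } {
  // Only the keys are kept; the values may contain user or document text.
  return { name, args_keys: Object.keys(args) };
}

// Hypothetical helper: records the size of a tool result, never its body.
// The field name batch_latency_ms is an assumption for illustration.
function toolCallSucceededPayload(
  name: string,
  result: string,
  batchLatencyMs: number,
): { name: string; batch_latency_ms: number; result_length: number } {
  return { name, batch_latency_ms: batchLatencyMs, result_length: result.length };
}
```

Because the payload is built from keys and lengths alone, a sensitive string passed as an argument cannot end up in the audit row even by accident.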

So what: Anyone who needs to answer "what did the AI actually do here" - for an audit, an incident review, or a client question - now gets a queryable answer instead of a log-file scavenge.
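
To make "a queryable answer" concrete, here is a minimal TypeScript sketch that folds the rows for one chat turn, already ordered by `created_at`, into a short summary. The event type names and the `name`, `total_steps`, and `total_latency_ms` payload fields come from the commit body; the `AgentEventRow` and `TurnSummary` shapes and the `summarizeTurn` function are hypothetical and do not appear in the repository.

```typescript
// Row shape loosely follows the agent_events columns listed in the commit body.
interface AgentEventRow {
  type: string;
  payload: Record<string, unknown>;
  created_at: string; // ISO-8601 timestamp
}

// Hypothetical summary shape, for illustration only.
interface TurnSummary {
  toolsCalled: string[];
  totalSteps: number | null;
  totalLatencyMs: number | null;
}

// Folds rows (assumed sorted by created_at) into a summary of one turn.
function summarizeTurn(rows: AgentEventRow[]): TurnSummary {
  const summary: TurnSummary = { toolsCalled: [], totalSteps: null, totalLatencyMs: null };
  for (const row of rows) {
    if (row.type === "tool.call_succeeded") {
      // The tool name is recorded on the succeeded event.
      const name = row.payload["name"];
      if (typeof name === "string") summary.toolsCalled.push(name);
    } else if (row.type === "turn.completed") {
      const steps = row.payload["total_steps"];
      const latency = row.payload["total_latency_ms"];
      if (typeof steps === "number") summary.totalSteps = steps;
      if (typeof latency === "number") summary.totalLatencyMs = latency;
    }
  }
  return summary;
}
```

A reviewer could feed this the result of the `select ... order by created_at` query quoted in the commit body and read off which tools ran and how long the turn took.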

Commits in this thread

1 commit from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
afeb5d8c | [feat-015] Structured agent_events audit log | Nick Whitehouse | 2026-05-07
commit body
Replaces the 60+ console.log calls in chatTools.ts + llm/olava.ts as the
only debug trail for agent activity per chat turn. Now you can run
`select * from agent_events where chat_id = ? order by created_at` and
reconstruct the agent's path: when each tool fired, how long the batch
took, total steps + total latency for the whole turn.

Schema (migration 004):
- agent_events(id, chat_id, type, payload jsonb, created_at)
- RLS via can_access_chat() - same predicate as chat_messages, so the
  audit log inherits chat permissions.
- Append-only.

Helper (agentEvents.ts):
- recordEvent({db, chatId, type, payload}) - fire-and-forget. Failures
  log + swallow; an audit hiccup must never break a chat stream.
- AgentEventType union: turn.started, model.first_token,
  tool.call_started, tool.call_succeeded, tool.call_failed,
  loop.escalated (feat-014), turn.completed.

Wired in runLLMStream:
- turn.started at top with model name
- tool.call_started before each runToolCalls dispatch (args_keys, no values)
- tool.call_succeeded after each call (name, batch latency, result_length)
- turn.completed at end (total_steps, total_latency_ms, text_length)

Hard PII rule: payload carries metadata only. NEVER user prompts, doc
text, or tool result bodies - chat_messages already holds those.

tool.call_failed is reserved for feat-016 (error envelope) - until then,
all tools dispatch as succeeded. loop.escalated is reserved for feat-014.

Stacked on feat-017-memory-hierarchy (uses the chatId param introduced
there). Will rebase cleanly onto main alongside feat-017.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
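
The fire-and-forget helper described in the commit body might look roughly like the TypeScript sketch below. The `recordEvent` name, its `{db, chatId, type, payload}` argument shape, and the `AgentEventType` members come from the commit message; the `EventStore` interface and its `insertAgentEvent` method are assumptions standing in for whatever database client the repository actually uses, which the source does not show.

```typescript
// Event names are taken from the commit body.
type AgentEventType =
  | "turn.started"
  | "model.first_token"
  | "tool.call_started"
  | "tool.call_succeeded"
  | "tool.call_failed"
  | "loop.escalated"
  | "turn.completed";

// Hypothetical minimal store interface; the real client is not shown in the source.
interface EventStore {
  insertAgentEvent(row: {
    chat_id: string;
    type: AgentEventType;
    payload: Record<string, unknown>;
  }): Promise<void>;
}

interface RecordEventArgs {
  db: EventStore;
  chatId: string;
  type: AgentEventType;
  payload: Record<string, unknown>;
}

// Fire-and-forget: returns immediately and never throws to the caller.
function recordEvent({ db, chatId, type, payload }: RecordEventArgs): void {
  try {
    db.insertAgentEvent({ chat_id: chatId, type, payload }).catch((err: unknown) => {
      // Log and swallow: an audit failure must not break the chat stream.
      console.error("agent_events insert failed", err);
    });
  } catch (err) {
    // Covers a store that throws synchronously instead of rejecting.
    console.error("agent_events insert failed", err);
  }
}
```

Returning `void` rather than a promise means callers cannot accidentally `await` the write and tie the chat stream's latency or error handling to the audit log.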
