nwhitehouse gives Mike a flight recorder

Every multi-step turn the AI takes now leaves a structured trail you can actually query after the fact.

compliance, analytics

Most legal-AI tools leave the record of what their agent did in developer console logs - fine during testing, useless the morning after a partner asks "what exactly did this thing do on that matter?" nwhitehouse has moved that trail into a proper database table. Every chat turn now writes structured records: when the agent started, which tools it called, how long each tool batch took, and the total steps and latency for the whole turn.
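To make "query after the fact" concrete, here is a minimal TypeScript sketch of folding one turn's event rows into a summary. The row shape mirrors the agent_events columns listed in the commit body further down, and `total_steps`, `total_latency_ms` and the tool `name` come from the same source. The `summarizeTurn` helper, the `TurnSummary` type and the `model` payload key are assumptions made for illustration, not code from the fork.

```typescript
// Row shape follows the agent_events columns named in the commit body.
type AgentEventRow = {
  id: string;
  chat_id: string;
  type: string;
  payload: Record<string, unknown>;
  created_at: string; // ISO 8601 timestamp
};

type TurnSummary = {
  model: string | null;
  toolsCalled: string[];
  totalSteps: number | null;
  totalLatencyMs: number | null;
};

// Hypothetical helper: folds one chat turn's events into a short summary.
function summarizeTurn(rows: AgentEventRow[]): TurnSummary {
  // Client-side equivalent of "order by created_at".
  const ordered = [...rows].sort(
    (a, b) => Date.parse(a.created_at) - Date.parse(b.created_at),
  );
  const summary: TurnSummary = {
    model: null,
    toolsCalled: [],
    totalSteps: null,
    totalLatencyMs: null,
  };
  for (const { type, payload } of ordered) {
    if (type === "turn.started") {
      const model = payload["model"]; // assumed key name
      if (typeof model === "string") summary.model = model;
    } else if (type === "tool.call_succeeded") {
      const name = payload["name"];
      if (typeof name === "string") summary.toolsCalled.push(name);
    } else if (type === "turn.completed") {
      const steps = payload["total_steps"];
      const latency = payload["total_latency_ms"];
      if (typeof steps === "number") summary.totalSteps = steps;
      if (typeof latency === "number") summary.totalLatencyMs = latency;
    }
  }
  return summary;
}
```

Because the summary is built only from metadata fields, it can be shown to a reviewer without exposing any prompt or document text.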

Two design choices stand out. The audit table inherits the same access rules as the conversation itself - anyone barred from the chat is barred from the trail, automatically. And the records hold metadata only: no prompts, no document text, no tool outputs. That keeps the log lean and stops sensitive client material from being duplicated into yet another place it has to be secured and retained.
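The metadata-only rule can be sketched as two small payload builders in TypeScript. The field names `args_keys`, `name` and `result_length` follow the commit body further down; the function names, the `batch_latency_ms` key, and the presence of `name` on the started event are assumptions for illustration only.

```typescript
// Hypothetical builder for a tool.call_started payload:
// records which argument keys were passed, never their values.
function toolStartedPayload(
  name: string,
  args: Record<string, unknown>,
): { name: string; args_keys: string[] } {
  return { name, args_keys: Object.keys(args) };
}

// Hypothetical builder for a tool.call_succeeded payload:
// records the size of the result, never the result body.
function toolSucceededPayload(
  name: string,
  result: string,
  batchLatencyMs: number,
): { name: string; batch_latency_ms: number; result_length: number } {
  return {
    name,
    batch_latency_ms: batchLatencyMs, // assumed key name
    result_length: result.length,
  };
}
```

Keeping the builders this narrow means sensitive values never reach the audit row in the first place, rather than relying on a later scrubbing step.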

So what

If you run Mike anywhere "show your work" matters - regulated practice areas, client-facing matters, internal QA - this is the difference between hoping the AI behaved and being able to prove it.

Commits in this thread

1 commit from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA: afeb5d8c
Subject: [feat-015] Structured agent_events audit log
Author: Nick Whitehouse
Date: 2026-05-07

Commit body:
Replaces the 60+ console.log calls in chatTools.ts + llm/olava.ts as the
only debug trail for agent activity per chat turn. Now you can run
`select * from agent_events where chat_id = ? order by created_at` and
reconstruct the agent's path: when each tool fired, how long the batch
took, total steps + total latency for the whole turn.

Schema (migration 004):
- agent_events(id, chat_id, type, payload jsonb, created_at)
- RLS via can_access_chat() - same predicate as chat_messages, so the
  audit log inherits chat permissions.
- Append-only.

Helper (agentEvents.ts):
- recordEvent({db, chatId, type, payload}) - fire-and-forget. Failures
  log + swallow; an audit hiccup must never break a chat stream.
- AgentEventType union: turn.started, model.first_token,
  tool.call_started, tool.call_succeeded, tool.call_failed,
  loop.escalated (feat-014), turn.completed.

Wired in runLLMStream:
- turn.started at top with model name
- tool.call_started before each runToolCalls dispatch (args_keys, no values)
- tool.call_succeeded after each call (name, batch latency, result_length)
- turn.completed at end (total_steps, total_latency_ms, text_length)

Hard PII rule: payload carries metadata only. NEVER user prompts, doc
text, or tool result bodies - chat_messages already holds those.

tool.call_failed is reserved for feat-016 (error envelope) - until then,
all tools dispatch as succeeded. loop.escalated is reserved for feat-014.

Stacked on feat-017-memory-hierarchy (uses the chatId param introduced
there). Will rebase cleanly onto main alongside feat-017.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
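
The fire-and-forget helper described in the commit body could look roughly like the TypeScript sketch below. The `AgentEventType` members and the `recordEvent({db, chatId, type, payload})` call shape come from the commit message; the `EventDb` interface and its `insertAgentEvent` method are hypothetical stand-ins, since the real database client is not shown in the source.

```typescript
type AgentEventType =
  | "turn.started"
  | "model.first_token"
  | "tool.call_started"
  | "tool.call_succeeded"
  | "tool.call_failed"
  | "loop.escalated"
  | "turn.completed";

// Hypothetical minimal database interface, for illustration only.
interface EventDb {
  insertAgentEvent(row: {
    chat_id: string;
    type: AgentEventType;
    payload: Record<string, unknown>;
  }): Promise<void>;
}

// Fire-and-forget: returns immediately and never throws to the caller.
function recordEvent(opts: {
  db: EventDb;
  chatId: string;
  type: AgentEventType;
  payload: Record<string, unknown>;
}): void {
  const { db, chatId, type, payload } = opts;
  // The async wrapper catches both synchronous throws and rejected promises.
  void (async () => {
    try {
      await db.insertAgentEvent({ chat_id: chatId, type, payload });
    } catch (err) {
      // Log and swallow: an audit failure must not break the chat stream.
      console.error("[agent_events] failed to record", type, err);
    }
  })();
}
```

The trade-off in this style is that a lost audit write is only visible in the error log, which is the price of never blocking or failing the user-facing stream.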
