Phase 6: audit logging for LLM + tool invocations

✅ merged · #6 · Archibald312/GordonOSS ← Archibald312/GordonOSS · opened 11d ago by Archibald312 · merged 11d ago by Archibald312 · self · +608 -14 across 13 files

From the PR description

Summary

Phase 6 of the Mike → GordonOSS finance fork build plan: append-only audit logging for every LLM call and every tool invocation. Required groundwork before paid data connectors (Phase 8) so we can prove who saw what data and which model handled it.

What changed

  • New audit_log table with immutability triggers (UPDATE and DELETE raise "audit_log entries are immutable"). Migration in backend/migrations/audit_log.sql and appended to backend/schema.sql. Indexes on (user_id, created_at), (project_id, created_at), and event_type.
  • backend/src/lib/audit.ts - AuditEntry shape, recordAudit() (fire-and-forget; never throws), hashContent() SHA-256 helper, AUDIT_LOG_ENABLED feature flag.
  • Tool dispatcher (lib/tools/registry.ts) records one tool_call row per invocation with duration_ms, input/output hashes, and resolved document_ids (extracted from both args and side effects). Errors are recorded then re-thrown.
  • LLM adapter (lib/llm/index.ts) records one llm_call row per streamChatWithTools call with provider, model, hashes, and duration. Audit context flows through runLLMStream (used by /chat, /projects/:id/chat, /tabular-review/:id/chat) and the tabular generate path.
  • GET /audit-log - user-scoped reader with project_id, event_type, from, to, limit (default 100, max 1000), and offset filters.
  • Unit tests cover hashContent determinism, insert shape, feature-flag no-op, and error swallowing.
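
The audit helper described above can be sketched roughly as follows. This is a minimal illustration, not the code in backend/src/lib/audit.ts: the PR names AuditEntry, recordAudit(), hashContent(), and AUDIT_LOG_ENABLED but does not show their definitions, so the field list, the AuditSink parameter, and the way the flag is read from the environment are all assumptions made so the example runs without a database.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real AuditEntry may carry more fields.
interface AuditEntry {
  event_type: "llm_call" | "tool_call";
  user_id: string | null;
  user_email: string | null;
  input_hash?: string;
  output_hash?: string;
  duration_ms?: number;
  status: "ok" | "error";
}

// Assumed abstraction over the database insert, so the sketch has no DB dependency.
type AuditSink = (entry: AuditEntry) => Promise<void>;

// SHA-256 hex digest. Non-string values are serialized with JSON.stringify,
// so the hash is only stable for values that serialize identically.
function hashContent(content: unknown): string {
  const text = typeof content === "string" ? content : JSON.stringify(content);
  return createHash("sha256").update(text ?? "").digest("hex");
}

// Never throws: a disabled flag is a no-op, and sink failures are logged and swallowed.
// Reading the flag as "anything but 'false' means enabled" is an assumption.
async function recordAudit(
  entry: AuditEntry,
  sink: AuditSink,
  enabled: boolean = process.env.AUDIT_LOG_ENABLED !== "false",
): Promise<void> {
  if (!enabled) return;
  try {
    await sink(entry);
  } catch (err) {
    console.error("audit log write failed", err);
  }
}
```

Callers that want fire-and-forget behavior would invoke it as `void recordAudit(entry, sink)` without awaiting, so a slow or failing insert cannot delay or break the user request.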

What's NOT in this PR

The schema's event_type CHECK reserves connector_fetch, document_upload, and document_download but no code path emits them yet - they land in Phase 8 (connectors) and a follow-up that instruments the documents/downloads routes. No migration needed when those wire up.

Reviewer notes

  • Audit failures are silent by design: recordAudit catches all errors and logs them with console.error. The build plan calls this out - audit logging must never break a user request.
  • Input/output hashes only: we store SHA-256 of the inputs and outputs, never raw content. Forensic traceability without leaking message content into a second table.
  • user_email is denormalized: user_id is ON DELETE SET NULL so historical entries survive a user deletion; the email column preserves attribution for investigators.
  • Per-iteration vs per-call: streamChatWithTools may internally drive multiple provider iterations (tool-use loops). We record one row per outer call rather than one per iteration - keeps the table compact and matches "one user message = one llm_call row."
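
The "one row per outer call" choice, together with the record-then-re-throw behavior, can be illustrated with a small wrapper. This is a sketch under assumptions: auditedCall and CallRow are hypothetical names, and the real instrumentation in lib/llm/index.ts and lib/tools/registry.ts also records hashes, provider, and model, which are left out here.

```typescript
type CallStatus = "ok" | "error";

// Hypothetical, trimmed-down row; the real audit_log row has more columns.
interface CallRow {
  event_type: "llm_call" | "tool_call";
  duration_ms: number;
  status: CallStatus;
}

// Wraps an outer call and emits exactly one row, however many internal
// iterations `run` performs. Errors are recorded and then re-thrown.
async function auditedCall<T>(
  eventType: CallRow["event_type"],
  run: () => Promise<T>,
  record: (row: CallRow) => void,
): Promise<T> {
  const started = Date.now();
  let status: CallStatus = "ok";
  try {
    return await run();
  } catch (err) {
    status = "error";
    throw err;
  } finally {
    try {
      record({ event_type: eventType, duration_ms: Date.now() - started, status });
    } catch {
      // An audit failure must not replace the call's own result or error.
    }
  }
}
```

Because the row is written in the finally block of the outer call, a tool-use loop with several provider round trips still yields a single llm_call row whose duration_ms covers the whole exchange.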

Test plan

  • npx vitest run in backend/ - 65/65 passing (7 new in audit.test.ts)
  • npx tsc --noEmit clean
  • npx eslint src - 0 errors (pre-existing warnings unrelated)
  • Apply backend/migrations/audit_log.sql to Supabase, send a chat message, confirm audit_log contains both llm_call and tool_call rows
  • UPDATE audit_log SET status='error' WHERE id = ... should raise the immutability exception
  • GET /audit-log returns only the caller's own rows
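
When exercising GET /audit-log by hand, it helps to know how the paging parameters are expected to behave. The PR states the limit default (100) and maximum (1000) but not how out-of-range or malformed values are treated, so the following is only a sketch that assumes clamping; parseAuditLogQuery and AuditLogQuery are hypothetical names, not identifiers from the PR.

```typescript
// Hypothetical parsed-query shape for the filters listed in the PR.
interface AuditLogQuery {
  project_id: string | undefined;
  event_type: string | undefined;
  from: string | undefined;
  to: string | undefined;
  limit: number;
  offset: number;
}

const DEFAULT_LIMIT = 100; // from the PR description
const MAX_LIMIT = 1000; // from the PR description

// Assumption: invalid numbers fall back to defaults and out-of-range values are clamped.
function parseAuditLogQuery(params: Record<string, string | undefined>): AuditLogQuery {
  const limitRaw = Number.parseInt(params["limit"] ?? "", 10);
  const offsetRaw = Number.parseInt(params["offset"] ?? "", 10);
  return {
    project_id: params["project_id"],
    event_type: params["event_type"],
    from: params["from"],
    to: params["to"],
    limit: Number.isNaN(limitRaw) ? DEFAULT_LIMIT : Math.min(Math.max(limitRaw, 1), MAX_LIMIT),
    offset: Number.isNaN(offsetRaw) ? 0 : Math.max(offsetRaw, 0),
  };
}
```

The user scoping itself (returning only the caller's own rows) would be applied server-side from the authenticated session rather than from any query parameter, which is why no user_id filter appears here.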

🤖 Generated with Claude Code
