cpatpa ships five phases in one squash: admin API, local storage, Ollama adapter, Docker, and rate-limiting

A single mega-commit (`9166a01d`, 1349 added lines across 24 files) bundles Phases 4-8: the admin backend, an AES-256-GCM local storage driver, an Ollama adapter, Docker Compose deployment, and per-user rate limiting. A follow-on commit adds tool-call support to the Ollama adapter.

Tags: infrastructure, compliance

Phase 4 adds routes/admin.ts behind requireAuth + requireAdmin. User management covers list, role change, status change, and delete with last-admin protection and self-mutation refused at the route layer. GET/PATCH /admin/org-settings has typed per-field validation; every change emits an admin.* audit event. GET /admin/audit is paginated with optional action and user_id filters.
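The two user-management guards can be expressed as one pure check that a route handler calls before touching the database. This is a minimal sketch, not the fork's code: the `User` shape, the `Mutation` union, and the name `checkAdminMutation` are all hypothetical, and it assumes only active admins count toward the last-admin rule.

```typescript
// Hypothetical shapes; the fork's real types are not shown in the writeup.
type Role = "admin" | "member";
type Status = "active" | "disabled";

interface User {
  id: string;
  role: Role;
  status: Status;
}

type Mutation =
  | { kind: "role"; role: Role }
  | { kind: "status"; status: Status }
  | { kind: "delete" };

// Returns a refusal reason, or null when the mutation may proceed.
function checkAdminMutation(
  actorId: string,
  target: User,
  users: User[],
  m: Mutation,
): string | null {
  // Self-mutation is refused outright, whatever the change is.
  if (actorId === target.id) return "self-mutation refused";

  // Assumption: only active admins count toward the last-admin rule.
  const activeAdmins = users.filter(
    (u) => u.role === "admin" && u.status === "active",
  );
  const targetIsActiveAdmin =
    target.role === "admin" && target.status === "active";
  const removesAdmin =
    m.kind === "delete" ||
    (m.kind === "role" && m.role !== "admin") ||
    (m.kind === "status" && m.status !== "active");

  if (targetIsActiveAdmin && removesAdmin && activeAdmins.length <= 1) {
    return "last admin protected";
  }
  return null;
}
```

Keeping the check free of I/O makes it easy to unit-test; a handler would presumably map a non-null result to a 4xx response and emit the admin.* audit event only on success.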

Phase 5 is the highest-value piece for direct porting. lib/storageLocal.ts stores each file as [12-byte IV][16-byte GCM tag][ciphertext] under a configurable root path (STORAGE_LOCAL_PATH). The master key is SHA-256(STORAGE_ENCRYPTION_KEY). Path traversal is blocked by resolving the target to an absolute path and confirming it stays under the root before any read or write. lib/storage.ts becomes a facade keyed on STORAGE_DRIVER: local is the default on Docker, and s3 (the existing R2/MinIO driver) remains available. Signed URLs return null in local mode, so callers fall back to the /download/:token route. The storage module is about 250 lines and self-contained. The changelog's test confirmation is concrete: a 25-byte plaintext produces a 53-byte on-disk file (12 IV + 16 tag + 25 ciphertext), and the plaintext does not appear in the file.
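The on-disk layout and the traversal check are small enough to sketch with Node's built-in crypto and path modules. This is an illustration under stated assumptions, not the contents of lib/storageLocal.ts: the function names `deriveKey`, `seal`, `unseal`, and `resolveUnderRoot` are hypothetical, and the real driver's error handling and file I/O are not shown in the writeup.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes } from "node:crypto";
import * as path from "node:path";

const IV_LEN = 12; // GCM nonce length used in the described layout
const TAG_LEN = 16; // GCM authentication tag length

// Master key as described: SHA-256 of the configured secret yields the
// 32 bytes that AES-256 needs.
function deriveKey(secret: string): Buffer {
  return createHash("sha256").update(secret, "utf8").digest();
}

// Produces [12-byte IV][16-byte GCM tag][ciphertext].
function seal(plaintext: Buffer, key: Buffer): Buffer {
  const iv = randomBytes(IV_LEN); // fresh random IV per file
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const tag = cipher.getAuthTag(); // only valid after final()
  return Buffer.concat([iv, tag, ciphertext]);
}

// Reverses seal(); throws if the tag does not verify.
function unseal(blob: Buffer, key: Buffer): Buffer {
  if (blob.length < IV_LEN + TAG_LEN) throw new Error("blob too short");
  const iv = blob.subarray(0, IV_LEN);
  const tag = blob.subarray(IV_LEN, IV_LEN + TAG_LEN);
  const ciphertext = blob.subarray(IV_LEN + TAG_LEN);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Resolves a storage key against the root and refuses anything that
// lands on or outside the root directory.
function resolveUnderRoot(root: string, key: string): string {
  const absRoot = path.resolve(root);
  const target = path.resolve(absRoot, key);
  const rel = path.relative(absRoot, target);
  const escapes =
    rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) throw new Error("path escapes storage root");
  return target;
}
```

GCM adds no padding, so the ciphertext is exactly as long as the plaintext; that is why a 25-byte input yields 12 + 16 + 25 = 53 bytes on disk.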

Phase 6 adds lib/llm/local.ts for OpenAI-compatible endpoints (Ollama, vLLM, LM Studio). Models with a local/ prefix route here; the suffix is forwarded verbatim as the upstream model id. The initial commit ships streaming but drops tool definitions silently. A follow-on commit (9fe6a8e9) fills that gap: tools are forwarded in the request body, streamed tool_calls are assembled from index-keyed argument fragments, the caller's runTools callback is invoked, role=tool result messages are appended, and the loop continues until the model produces content with no further tool calls or maxIterations is reached.
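The assembly step is the fiddly part of that loop, because one tool call arrives as several deltas that share an index. The sketch below covers assembly only and is not taken from lib/llm/local.ts: `ToolCallDelta` follows the OpenAI chat-completions streaming shape, the names `assembleToolCalls` and `parseArgs` are hypothetical, and the `_raw_arguments` fallback mirrors what the commit body for 9fe6a8e9 describes.

```typescript
// One streamed tool-call delta in the OpenAI chat-completions format.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AssembledToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// Parses the concatenated argument string; anything that is not a JSON
// object is passed through under _raw_arguments instead of throwing.
function parseArgs(raw: string): Record<string, unknown> {
  try {
    const value: unknown = JSON.parse(raw || "{}");
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      return value as Record<string, unknown>;
    }
  } catch {
    // fall through to the raw passthrough below
  }
  return { _raw_arguments: raw };
}

// Merges deltas by index: id and name usually arrive once, while the
// arguments string streams in fragments that must be concatenated.
function assembleToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const slots = new Map<number, { id: string; name: string; raw: string }>();
  for (const d of deltas) {
    const slot = slots.get(d.index) ?? { id: "", name: "", raw: "" };
    if (d.id) slot.id = d.id;
    if (d.function?.name) slot.name += d.function.name;
    if (d.function?.arguments) slot.raw += d.function.arguments;
    slots.set(d.index, slot);
  }
  return Array.from(slots.entries())
    .sort(([a], [b]) => a - b)
    .map(([, s]) => ({ id: s.id, name: s.name, args: parseArgs(s.raw) }));
}
```

In the loop described above, each assembled call would be handed to the caller's runTools callback and its result appended as a role=tool message carrying the matching tool_call_id before the next streaming turn.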

Phase 7 gives the fork a complete Docker Compose deployment. The backend Dockerfile is multi-stage with LibreOffice baked in and runs as a non-root user. docker-entrypoint.sh runs npm run migrate before starting the server (toggle via MIGRATE_ON_BOOT). The compose file brings up postgres, backend, frontend, ollama, caddy, and a backup sidecar that runs a nightly pg_dump. Caddy handles Let's Encrypt TLS for ${PIP_DOMAIN}; deployments that use an internal CA swap in an explicit tls directive that names a certificate file and a key file.

Phase 8 fixes two audit findings. Rate limiters now key on res.locals.userId for the chat, chat-create, and upload paths, with pre-auth requests still falling back to IP - behind a corporate NAT, one user's burst no longer drains the bucket for the whole firm. extractStructureTree output (PDF outline titles and DOCX lines extracted by mammoth) goes through sanitiseStructureTitle (strip control chars, escape angle brackets, length cap) before it is stored, so downstream renderers only ever see sanitised titles.
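Both fixes reduce to small pure functions. The sketch below is illustrative rather than the fork's code: `ReqLike` and `ResLike` are hypothetical stand-ins for the Express request and response, the `user:` and `ip:` key prefixes and the 200-character cap are assumptions, and the IP fallback for pre-auth requests follows the commit body.

```typescript
// Hypothetical minimal stand-ins for the Express request/response objects.
interface ReqLike {
  ip?: string;
}
interface ResLike {
  locals: { userId?: string };
}

// Keys the limiter on the authenticated user when one is known, and on
// the client IP otherwise (pre-auth requests).
function rateLimitKey(req: ReqLike, res: ResLike): string {
  const userId = res.locals.userId;
  return userId ? `user:${userId}` : `ip:${req.ip ?? "unknown"}`;
}

const MAX_TITLE_LEN = 200; // assumed cap; the writeup does not give the real value

// Strips control characters, caps the length, then escapes angle brackets.
// The cap is applied before escaping so an entity is never cut in half,
// which means the escaped output can be slightly longer than the cap.
function sanitiseStructureTitle(title: string, maxLen: number = MAX_TITLE_LEN): string {
  return title
    .replace(/[\u0000-\u001f\u007f]/g, "")
    .slice(0, maxLen)
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

Prefixing the key with its kind keeps a user id from ever colliding with an IP string in the same limiter store.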

So what

The local storage driver with AES-256-GCM encryption and the Ollama adapter with tool-call loop are each self-contained enough to pull into another fork with a guided rewrite. The admin API + audit-on-every-change pattern is a good reference template. The Dockerfiles and Caddyfile are tuned for this fork's Piper Alderman deployment context, so treat them as a starting point rather than drop-in. Any cherry-pick from `9166a01d` is effectively a manual rewrite anyway - it's a 24-file squash.


Commits in this thread

3 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`9166a01d`	Phases 4-8: admin backend, storage driver, local LLM, Docker, polish	Claude	2026-05-15
commit body
Closes audit findings C1, H4 (sandboxing follow-up parked), H6, M2.

Phase 4 - admin backend.
- routes/admin.ts mounted at /admin/*, gated by requireAuth + requireAdmin.
- GET / PATCH / DELETE /admin/users (last-admin protection; self-mutation refused).
- GET / PATCH /admin/org-settings with typed validation per field; audit_events records every change.
- GET /admin/audit paginated; optional action and user_id filters.
- Frontend admin pages land later; the API is complete.

Phase 5 - storage driver + at-rest encryption + download tokens.
- lib/storageLocal.ts: local driver. Each file on disk is [12-byte IV][16-byte GCM tag][ciphertext]. Master key is SHA-256(STORAGE_ENCRYPTION_KEY). Path traversal: resolved absolute path must stay under STORAGE_LOCAL_PATH.
- lib/storage.ts becomes a facade keyed on STORAGE_DRIVER: local (default for Docker) or s3 (existing R2 / MinIO). Signed URLs return null in local mode; callers fall back to /download/:token.
- End-to-end verified: 25-byte plaintext -> 53-byte on-disk file; plaintext does not appear; round-trip preserved.
- lib/downloadTokens.ts payload adds u (user_id) + exp; /download/:token refuses tokens issued to a different user or past expiry. TTL defaults to 24h, env-tunable. Closes C1.

Phase 6 - local LLM + EXTERNAL_AI_DISABLED.
- lib/llm/local.ts: OpenAI-compatible adapter for Ollama / vLLM / LM Studio. Streaming + completion. Models prefixed "local/" route here; suffix sent verbatim as the upstream model id. Tools not yet wired (follow-up).
- lib/llm/index.ts adds assertProviderAllowed: when EXTERNAL_AI_DISABLED=true, refuses Claude/Gemini/OpenAI dispatch. Local always passes.
- Provider enum extended to include "local".
- tabular.ts missingModelApiKey skips the check for local models (gated by LOCAL_LLM_BASE_URL on the backend).

Phase 7 - Docker / Caddy / deployment.
- backend/Dockerfile multi-stage (deps -> tsc build -> slim runtime). LibreOffice baked in. Non-root user. Entrypoint runs `npm run migrate` (toggle via MIGRATE_ON_BOOT) then starts the server.
- frontend/Dockerfile multi-stage. next.config.ts: output: "standalone".
- docker-compose.yml at repo root. Services: postgres, backend, frontend, ollama, caddy, backup. Every persistent volume mounts under ${DATA_ROOT}. Backend healthcheck uses /ready. Backup sidecar runs nightly pg_dump with BACKUP_KEEP_DAYS retention.
- caddy/Caddyfile terminates TLS via Let's Encrypt for ${PIP_DOMAIN}; forwards backend paths to backend:3001 and everything else to frontend:3000.
- .env.compose.example documents every required and optional var.

Phase 8 - rate limits + structure_tree sanitisation.
- rate-limit keyGenerator keys on res.locals.userId for chat, chat-create, and upload paths; pre-auth requests still fall back to IP. Closes H6.
- sanitiseStructureTitle strips control chars, escapes angle brackets, caps length. Applied to PDF outline titles and DOCX mammoth-extracted lines before storage. Closes M2.

Type-check clean across the backend. All 16 migrations apply cleanly. Storage driver smoke-tested.

Remaining outstanding (post-MVP polish):
- Frontend workspace switcher + workspace settings page.
- Frontend admin pages (Users / AI Policy / Audit).
- Frontend Account page polish for the new per-user fields.
- Tool support on the local LLM adapter so edit / generate / read flows work with Ollama-class models.
`7d44fef6`	Frontend admin console: Users, AI Policy, Audit Log	Claude	2026-05-15
commit body
Adds the admin-only /admin section reachable from the user dropdown when the JWT role is admin. Mirrors the backend admin API:

  /admin/users    list, change role/status, delete (with self-guard and last-admin guard already enforced server-side)
  /admin/policy   org system prompt, allow_external_models toggle, user-instructions toggle, default model, retention, timezone, banner text, and the jurisdiction/practice-area/sector/allowed-domain lists
  /admin/audit    paginated audit log with action + user filters

Layout enforces role=admin client-side and the backend re-checks via requireAdmin so the gate is real, not cosmetic.
`9fe6a8e9`	Local LLM adapter: tool-call loop	Claude	2026-05-15
commit body Adds OpenAI-style tool support to the local Ollama adapter. The adapter now forwards the caller's tool schemas, assembles tool_calls from the streamed delta chunks (deltas arrive keyed by index with arguments streaming in fragments), and runs them via the supplied runTools callback. Tool results are appended as role=tool messages with tool_call_id and the loop re-streams until the model produces content with no further tool calls or maxIterations is hit. Models that do not support tools simply never emit tool_calls, so the loop exits after the first turn. Unparseable JSON arguments are passed through under _raw_arguments rather than crashing the turn.
