cpatpa moves AI controls from the user to the admin

PIP shifts LLM configuration into a single admin console built for firm deployments.

infrastructure · multi-tenant

cpatpa has rewired how PIP decides which AI models it talks to. Instead of every user wiring up their own API keys, an administrator now sets the policy: which providers are switched on, which locally hosted models appear in the picker, and which base URL the in-house AI server is reached at. Per-user keys are gone; only the keys supplied through the firm's install-time environment (install.sh / .env.compose) are honoured.

Alongside that, the team raised the rate limits that cap how often users can chat, upload, or make general requests. The previous defaults were tight enough that a single person debugging a flow could hit them, which is not workable for a firm. They also patched a quiet networking bug where IPv6 clients could slip past the per-IP rate limit by rotating the low-order bits of their address.

So what: Legal-ops leads evaluating PIP for firm-wide rollout should note that this is now shaped like an IT-administered tool, not a per-lawyer BYO-key experiment.


Commits in this thread

3 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
76030d6f | LLM policy: admin-driven providers + curated local models | Claude | 2026-05-16
commit body
Replaces the hardcoded model picker with a server-driven list keyed
on org_settings. Adds /admin/llm so an admin can pick which
providers are enabled, override the local LLM base URL at runtime,
and curate which Ollama models appear in the picker. The per-user
API keys surface is removed; only env-supplied keys (set in
install.sh / .env.compose) are honoured.

Schema (migration 0019):
- org_settings.providers_enabled JSONB (master switch per provider).
- org_settings.local_llm_base_url text (runtime override; NULL falls
  back to the env var).
- org_settings.local_llm_models JSONB (curated [{id,label}]).

Backend:
- New lib/llmPolicy.ts: loadLlmPolicy, availableModels,
  assertModelAllowed, orgApiKeys. Centralises gating.
- New lib/localDiscovery.ts: probes /api/tags on the OpenAI-compat
  host to list installed Ollama models.
- New routes: GET /me/models (filtered list for users),
  GET/PATCH /admin/llm, POST /admin/llm/refresh-local.
- llm/index.ts dispatcher now consults the policy on every
  streamChatWithTools / completeText. EXTERNAL_AI_DISABLED env still
  wins for the three external providers.
- Boot reads org_settings.local_llm_base_url and sets
  process.env.LOCAL_LLM_BASE_URL so the local adapter picks it up.
- /user/api-keys GET/PUT removed. user_api_keys table left in place
  for now; a follow-up migration can drop it once we are confident
  no encrypted data needs preservation.
- userSettings.getUserApiKeys now returns env-only keys.
- userApiKeys.ts deleted.

Frontend:
- ModelToggle fetches /me/models on mount, dropping the hardcoded
  catalogue. Empty list prompts the user to ask an admin.
- New /admin/llm page: per-provider toggles, base-URL field, refresh
  button, curated-model checkboxes.
- /account/models page, ApiKeyMissingModal, modelAvailability lib
  all removed. apiKeyStatus / apiKeys / saveApiKey stripped from
  pipApi.ts and UserProfileContext.
- ChatInput, TabularReviewView, TRChatPanel: drop apiKeys plumbing.
  Backend rejection is now the only gate.
- useSelectedModel: persist whatever the picker emits; ModelToggle
  reconciles against the live list on mount.
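
The gating described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in lib/llmPolicy.ts: the `OrgLlmSettings` fields mirror the three migration 0019 columns, but the provider ids, the `ModelEntry` type, the function signatures, and the `resolveLocalBaseUrl` helper are assumptions made for the example.

```typescript
// Hypothetical shapes: field names follow the migration 0019 columns,
// everything else (provider ids, signatures) is assumed for illustration.
type Provider = "openai" | "anthropic" | "google" | "local";

interface ModelEntry {
  id: string;
  label: string;
  provider: Provider;
}

interface OrgLlmSettings {
  providers_enabled: Partial<Record<Provider, boolean>>;
  local_llm_base_url: string | null;
  local_llm_models: { id: string; label: string }[];
}

// Models a user may pick: external catalogue entries whose provider is
// switched on, plus the curated local models when "local" is enabled.
function availableModels(
  settings: OrgLlmSettings,
  externalCatalogue: ModelEntry[],
  externalAiDisabled: boolean,
): ModelEntry[] {
  const external: ModelEntry[] = externalAiDisabled
    ? [] // EXTERNAL_AI_DISABLED wins over the per-provider switches
    : externalCatalogue.filter(
        (m) => m.provider !== "local" && settings.providers_enabled[m.provider] === true,
      );
  const local: ModelEntry[] =
    settings.providers_enabled.local === true
      ? settings.local_llm_models.map((m) => ({ ...m, provider: "local" as const }))
      : [];
  return [...external, ...local];
}

// Throws when the requested model is not in the allowed list.
function assertModelAllowed(modelId: string, allowed: ModelEntry[]): void {
  if (!allowed.some((m) => m.id === modelId)) {
    throw new Error(`Model not allowed by org policy: ${modelId}`);
  }
}

// A NULL column value falls back to the env-supplied base URL.
function resolveLocalBaseUrl(
  settings: OrgLlmSettings,
  envValue: string | undefined,
): string | undefined {
  return settings.local_llm_base_url ?? envValue;
}
```

In this sketch the dispatcher would call `assertModelAllowed` before every request, so a stale model id persisted in the browser is rejected server-side rather than trusted.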
61689c39 | Follow-ups: IPv6 rate-limit, drop user_api_keys, refresh testing doc | Claude | 2026-05-16
commit body
- Rate limiter keyGenerator now calls ipKeyGenerator from
  express-rate-limit when falling back to IP, which canonicalises
  IPv6 addresses to a /64 prefix. Closes the ERR_ERL_KEY_GEN_IPV6
  warnings printed on every backend boot since the multer 2 / v8
  rate-limit upgrade and prevents IPv6 clients bypassing the IP
  bucket by rotating low-order bits.

- Migration 0020 drops the user_api_keys table. Migration 0019
  moved provider configuration to org_settings and the backend no
  longer reads or writes it; the column held AES-256-GCM ciphertext
  that never escaped the encrypted-at-rest layer, so a hard drop
  is acceptable.

- docs/safe-local-testing.md rewritten to reflect the post-Supabase
  reality (Postgres + Auth.js, AES-encrypted local storage, Admin
  LLM panel, pip-uninstall.sh). The previous content was the
  upstream Mike doc and was misleading.
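
To make the IPv6 point in this commit concrete, here is a small self-contained sketch of collapsing an address to its /64 prefix before using it as a rate-limit key. It is not the implementation of `ipKeyGenerator` in express-rate-limit; the function names `expandIpv6` and `rateLimitKey` are hypothetical, and the parser deliberately ignores zone ids and embedded IPv4 notation.

```typescript
// Expand an IPv6 address (optionally using "::" compression) into 8 groups.
// Illustrative only: no zone ids, no embedded IPv4 notation, no validation.
function expandIpv6(addr: string): number[] {
  const [head, tail] = addr.split("::");
  const parse = (part: string | undefined): number[] =>
    part ? part.split(":").map((g) => parseInt(g, 16)) : [];
  const left = parse(head);
  const right = parse(tail);
  if (!addr.includes("::")) return left;
  const zeros = new Array<number>(8 - left.length - right.length).fill(0);
  return [...left, ...zeros, ...right];
}

// Rate-limit key: IPv4 addresses are used as-is; IPv6 addresses are
// collapsed to their first 64 bits, so every address inside the same
// /64 shares one bucket.
function rateLimitKey(ip: string): string {
  if (!ip.includes(":")) return ip;
  const prefix = expandIpv6(ip).slice(0, 4);
  return prefix.map((g) => g.toString(16)).join(":") + "::/64";
}
```

With this keying, `2001:db8:1:2::1` and `2001:db8:1:2:aaaa:bbbb:cccc:dddd` map to the same bucket, so rotating the low-order 64 bits no longer yields a fresh allowance.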
af2a0d89 | Raise default rate limits for internal firm deployment | Claude | 2026-05-16
commit body
The original defaults (300 general / 30 chat / 60 chat-create / 50
upload per 15-minute window) were tuned tight enough that one
person debugging a flow could easily hit them and see the generic
"Too many requests" message in the browser with no obvious
correlation back to what triggered it.

Raise the defaults to numbers that suit normal multi-user firm
use without rebuilding when an operator wants to bump them
further:

  general     300  -> 1500   per IP, per 15 min
  chat         30  -> 200    per user, per 15 min
  chat-create  60  -> 300    per user, per 15 min
  upload       50  -> 300    per user, per hour

All four remain tunable via the same env var names so deployments
that want stricter limits (or stricter for a specific window) can
still set them. Documented in .env.compose.example.
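
One way the "defaults that stay tunable by env var" pattern could look is sketched below. The fallback numbers and windows come from the table in this commit; the env var names (`RATE_LIMIT_GENERAL_MAX` and so on), the `LimitSpec` type, and `resolveLimit` are hypothetical, since the commit does not list the actual names.

```typescript
// Hypothetical env var names; fallbacks and windows follow the commit's table.
interface LimitSpec {
  envVar: string;
  fallback: number;
  windowMs: number;
}

const FIFTEEN_MIN = 15 * 60 * 1000;
const ONE_HOUR = 60 * 60 * 1000;

const LIMITS: Record<"general" | "chat" | "chatCreate" | "upload", LimitSpec> = {
  general: { envVar: "RATE_LIMIT_GENERAL_MAX", fallback: 1500, windowMs: FIFTEEN_MIN },
  chat: { envVar: "RATE_LIMIT_CHAT_MAX", fallback: 200, windowMs: FIFTEEN_MIN },
  chatCreate: { envVar: "RATE_LIMIT_CHAT_CREATE_MAX", fallback: 300, windowMs: FIFTEEN_MIN },
  upload: { envVar: "RATE_LIMIT_UPLOAD_MAX", fallback: 300, windowMs: ONE_HOUR },
};

// Use the env value when it is a positive integer, otherwise the default.
function resolveLimit(
  spec: LimitSpec,
  env: Record<string, string | undefined>,
): number {
  const raw = env[spec.envVar];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : spec.fallback;
}
```

A deployment would pass `process.env` as the `env` argument, so an operator who wants a stricter chat limit sets the corresponding variable and restarts, with no rebuild needed.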
