cpatpa decides which AI models a firm is actually allowed to use

Access control for the AI engine now runs through one gate, with a default that exposes nothing until an admin signs off.

security, infrastructure

cpatpa rebuilt the way this fork controls which AI providers and models staff can reach. Instead of permission checks scattered across the codebase, every request now passes through a single policy layer before any model runs. Admins can switch individual providers on or off and, importantly, locally hosted models stay hidden until an admin curates an approved list. A firm that hasn't reviewed its local models yet exposes none of them, even if they're quietly running in the background. That's the cautious default you'd want.
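To make the "one gate, nothing exposed by default" idea concrete, here is a minimal sketch of how such a policy check could look. The function names availableModels and assertModelAllowed come from the commit log; everything else (the LlmPolicy shape, the externalCatalogue field, the "local" provider key, the error class) is an assumption made for illustration, not the fork's actual code.

```typescript
// Hypothetical shapes: the commit names the functions, not their signatures.
interface ModelEntry {
  id: string;
  label: string;
  provider: string;
}

interface LlmPolicy {
  // Mirrors org_settings.providers_enabled (master switch per provider).
  providersEnabled: Record<string, boolean>;
  // Mirrors org_settings.local_llm_models (curated [{id,label}]).
  localModels: { id: string; label: string }[];
  // Assumed: the models the external providers could offer.
  externalCatalogue: ModelEntry[];
  // Mirrors the EXTERNAL_AI_DISABLED env switch.
  externalAiDisabled: boolean;
}

const LOCAL_PROVIDER = "local"; // assumed key for the local provider

// Returns only the models the policy explicitly allows.
function availableModels(policy: LlmPolicy): ModelEntry[] {
  // The env kill switch wins over any per-provider toggle for external providers.
  const external = policy.externalAiDisabled
    ? []
    : policy.externalCatalogue.filter(
        (m) => m.provider !== LOCAL_PROVIDER && policy.providersEnabled[m.provider] === true
      );

  // Local models appear only if the provider is on AND an admin curated them.
  // An empty curated list therefore exposes nothing, whatever is installed.
  const local =
    policy.providersEnabled[LOCAL_PROVIDER] === true
      ? policy.localModels.map((m) => ({ id: m.id, label: m.label, provider: LOCAL_PROVIDER }))
      : [];

  return [...external, ...local];
}

class ModelNotAllowedError extends Error {}

// Single gate: call this before dispatching any request to a model.
function assertModelAllowed(policy: LlmPolicy, modelId: string): void {
  const allowed = availableModels(policy).some((m) => m.id === modelId);
  if (!allowed) {
    throw new ModelNotAllowedError(`Model "${modelId}" is not enabled for this organisation`);
  }
}
```

In this sketch the picker and the dispatcher would both read from availableModels, so a model that is hidden from the list is also rejected at request time.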

The same work removed the per-user API key store, which is no longer used now that only server-configured keys are honoured; closed a loophole that let IPv6 clients slip past the per-IP usage limit by rotating the low-order bits of their address; and raised the request caps that were originally tuned for public web traffic. Behind a firm's own network, the old limits throttled normal use; the new ones are sized for an internal deployment and remain adjustable through environment variables.
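The address-rotation loophole is easier to see in code. According to the commit log, the fork relies on ipKeyGenerator from express-rate-limit for this; the hand-rolled sketch below only illustrates the underlying idea of bucketing IPv6 clients by their /64 prefix. The function names are hypothetical, and the sketch assumes well-formed input without zone IDs or IPv4-mapped addresses.

```typescript
// Expand an IPv6 address into its eight zero-padded, lower-case groups.
function expandIpv6(address: string): string[] {
  const parts = address.split("::");
  const head = parts[0] ? parts[0].split(":") : [];
  const tail = parts.length > 1 && parts[1] ? parts[1].split(":") : [];

  const groups: string[] = head.slice();
  if (parts.length > 1) {
    // "::" stands for however many all-zero groups are missing.
    const missing = 8 - head.length - tail.length;
    for (let i = 0; i < missing; i++) {
      groups.push("0");
    }
  }
  for (const g of tail) {
    groups.push(g);
  }
  return groups.map((g) => ("0000" + g.toLowerCase()).slice(-4));
}

// Derive the rate-limit bucket key for a client address.
function rateLimitKey(ip: string): string {
  if (ip.indexOf(":") === -1) {
    return ip; // IPv4: key on the full address
  }
  // IPv6: keep only the first four groups (the /64 network prefix), so
  // changing the low-order 64 bits does not yield a fresh bucket.
  return expandIpv6(ip).slice(0, 4).join(":") + "::/64";
}
```

With this keying, 2001:db8:abcd:12::1 and 2001:db8:abcd:12:ffff:ffff:ffff:ffff share one bucket, while an address in a different /64 gets its own.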

So what? If you're running this inside a firm, this is the difference between hoping nobody points the system at an unvetted AI model and being able to guarantee they can't.

Commits in this thread

3 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA Subject Author Date
76030d6f LLM policy: admin-driven providers + curated local models Claude 2026-05-16
commit body
Replaces the hardcoded model picker with a server-driven list keyed
on org_settings. Adds /admin/llm so an admin can pick which
providers are enabled, override the local LLM base URL at runtime,
and curate which Ollama models appear in the picker. The per-user
API keys surface is removed; only env-supplied keys (set in
install.sh / .env.compose) are honoured.

Schema (migration 0019):
- org_settings.providers_enabled JSONB (master switch per provider).
- org_settings.local_llm_base_url text (runtime override; NULL falls
  back to the env var).
- org_settings.local_llm_models JSONB (curated [{id,label}]).

Backend:
- New lib/llmPolicy.ts: loadLlmPolicy, availableModels,
  assertModelAllowed, orgApiKeys. Centralises gating.
- New lib/localDiscovery.ts: probes /api/tags on the OpenAI-compat
  host to list installed Ollama models.
- New routes: GET /me/models (filtered list for users),
  GET/PATCH /admin/llm, POST /admin/llm/refresh-local.
- llm/index.ts dispatcher now consults the policy on every
  streamChatWithTools / completeText. EXTERNAL_AI_DISABLED env still
  wins for the three external providers.
- Boot reads org_settings.local_llm_base_url and sets
  process.env.LOCAL_LLM_BASE_URL so the local adapter picks it up.
- /user/api-keys GET/PUT removed. user_api_keys table left in place
  for now; a follow-up migration can drop it once we are confident
  no encrypted data needs preservation.
- userSettings.getUserApiKeys now returns env-only keys.
- userApiKeys.ts deleted.

Frontend:
- ModelToggle fetches /me/models on mount, dropping the hardcoded
  catalogue. Empty list prompts the user to ask an admin.
- New /admin/llm page: per-provider toggles, base-URL field, refresh
  button, curated-model checkboxes.
- /account/models page, ApiKeyMissingModal, modelAvailability lib
  all removed. apiKeyStatus / apiKeys / saveApiKey stripped from
  pipApi.ts and UserProfileContext.
- ChatInput, TabularReviewView, TRChatPanel: drop apiKeys plumbing.
  Backend rejection is now the only gate.
- useSelectedModel: persist whatever the picker emits; ModelToggle
  reconciles against the live list on mount.
61689c39 Follow-ups: IPv6 rate-limit, drop user_api_keys, refresh testing doc Claude 2026-05-16
commit body
- Rate limiter keyGenerator now calls ipKeyGenerator from
  express-rate-limit when falling back to IP, which canonicalises
  IPv6 addresses to a /64 prefix. Closes the ERR_ERL_KEY_GEN_IPV6
  warnings printed on every backend boot since the multer 2 / v8
  rate-limit upgrade and prevents IPv6 clients bypassing the IP
  bucket by rotating low-order bits.

- Migration 0020 drops the user_api_keys table. Migration 0019
  moved provider configuration to org_settings and the backend no
  longer reads or writes it; the column held AES-256-GCM ciphertext
  that never escaped the encrypted-at-rest layer, so a hard drop
  is acceptable.

- docs/safe-local-testing.md rewritten to reflect the post-Supabase
  reality (Postgres + Auth.js, AES-encrypted local storage, Admin
  LLM panel, pip-uninstall.sh). The previous content was the
  upstream Mike doc and was misleading.
af2a0d89 Raise default rate limits for internal firm deployment Claude 2026-05-16
commit body
The original defaults (300 general / 30 chat / 60 chat-create / 50
upload per 15-minute window) were tuned tight enough that one
person debugging a flow could easily hit them and see the generic
"Too many requests" message in the browser with no obvious
correlation back to what triggered it.

Raise the defaults to numbers that suit normal multi-user firm
use without rebuilding when an operator wants to bump them
further:

  general     300  -> 1500   per IP, per 15 min
  chat         30  -> 200    per user, per 15 min
  chat-create  60  -> 300    per user, per 15 min
  upload       50  -> 300    per user, per hour

All four remain tunable via the same env var names so deployments
that want stricter limits (or stricter for a specific window) can
still set them. Documented in .env.compose.example.
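
As a rough illustration of "raised defaults that stay tunable", the sketch below resolves each limit from an environment map and falls back to the new default when the variable is unset or invalid. The default values and windows are taken from the table in the commit body; the environment variable names are hypothetical, since the commit only says the existing names are kept (see .env.compose.example for the real ones).

```typescript
interface LimitSpec {
  envVar: string; // hypothetical name, not taken from the repo
  fallback: number; // new default from the commit's table
  windowMs: number;
}

const MINUTE = 60 * 1000;

const LIMITS: Record<string, LimitSpec> = {
  general: { envVar: "RATE_LIMIT_GENERAL_MAX", fallback: 1500, windowMs: 15 * MINUTE },
  chat: { envVar: "RATE_LIMIT_CHAT_MAX", fallback: 200, windowMs: 15 * MINUTE },
  chatCreate: { envVar: "RATE_LIMIT_CHAT_CREATE_MAX", fallback: 300, windowMs: 15 * MINUTE },
  upload: { envVar: "RATE_LIMIT_UPLOAD_MAX", fallback: 300, windowMs: 60 * MINUTE },
};

// Use the operator's value when it is a positive integer, else the default.
function resolveLimit(spec: LimitSpec, env: Record<string, string | undefined>): number {
  const raw = env[spec.envVar];
  if (raw === undefined || raw.trim() === "") {
    return spec.fallback;
  }
  const parsed = Number(raw);
  const valid = isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0;
  return valid ? parsed : spec.fallback;
}
```

In a Node backend the second argument would typically be process.env, so an operator can tighten or loosen a limit by editing the environment file and restarting, with no rebuild.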

Capture this thread into my fork

Download a single Markdown prompt that tells Claude how to port every commit above into your working tree — adapting paths and structure to match your repo. Run it via claude -p < capture-thread-373.md from inside the repo you want the changes in.