cpatpa decides which AI models a firm is actually allowed to use

Access control for the AI engine now runs through one gate, with a default that exposes nothing until an admin signs off.

security, infrastructure

cpatpa rebuilt the way this fork controls which AI providers and models staff can reach. Instead of permission checks scattered across the codebase, every request now passes through a single policy layer before any model runs. Admins can switch individual providers on or off, and, importantly, locally hosted models stay hidden until an admin curates an approved list. A firm that hasn't reviewed its local models yet exposes none of them, even if they're quietly running in the background. That's the cautious default you'd want.
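A minimal sketch of what a default-deny gate like this could look like. The field names echo the org_settings columns listed in the commit log below, but the types, the "local" provider key, and both function bodies are assumptions made for illustration, not the fork's actual lib/llmPolicy.ts.

```typescript
// Hypothetical policy shape, loosely modelled on the org_settings columns.
interface LlmPolicy {
  providersEnabled: Record<string, boolean>;    // master switch per provider
  localModels: { id: string; label: string }[]; // curated list; empty exposes nothing
}

interface ModelRef {
  provider: string; // "local" is an assumed key for the local provider
  id: string;
}

function isModelAllowed(policy: LlmPolicy, model: ModelRef): boolean {
  // A provider that is missing from the map counts as disabled (default-deny).
  if (policy.providersEnabled[model.provider] !== true) return false;
  // Local models additionally need an explicit entry in the curated list.
  if (model.provider === "local") {
    return policy.localModels.some((m) => m.id === model.id);
  }
  // Assumption: an enabled external provider exposes all of its models.
  return true;
}

// Throwing variant for call sites that must stop before any model runs.
function assertAllowed(policy: LlmPolicy, model: ModelRef): void {
  if (!isModelAllowed(policy, model)) {
    throw new Error(`Model ${model.provider}/${model.id} is not permitted by org policy`);
  }
}
```

With an empty curated list, a request for any local model fails even when the local provider itself is switched on, which is the "exposes none of them" behaviour described above.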

The same work retired the per-user API key store, so only keys supplied through the deployment's environment are honoured. It also closed a loophole that let IPv6 clients slip past the per-IP usage limit by rotating the low-order bits of their address, and raised the request caps, which were originally tuned for public web traffic. Behind a firm's own network, the old limits throttled normal use; the new defaults are sized for an internal deployment and remain adjustable through environment variables.

So what? If you're running this inside a firm, this is the difference between hoping nobody points the system at an unvetted AI model and being able to guarantee they can't.



Commits in this thread

3 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`76030d6f`	LLM policy: admin-driven providers + curated local models	Claude	2026-05-16
commit body
Replaces the hardcoded model picker with a server-driven list keyed on org_settings. Adds /admin/llm so an admin can pick which providers are enabled, override the local LLM base URL at runtime, and curate which Ollama models appear in the picker. The per-user API keys surface is removed; only env-supplied keys (set in install.sh / .env.compose) are honoured.

Schema (migration 0019):
- org_settings.providers_enabled JSONB (master switch per provider).
- org_settings.local_llm_base_url text (runtime override; NULL falls back to the env var).
- org_settings.local_llm_models JSONB (curated [{id,label}]).

Backend:
- New lib/llmPolicy.ts: loadLlmPolicy, availableModels, assertModelAllowed, orgApiKeys. Centralises gating.
- New lib/localDiscovery.ts: probes /api/tags on the OpenAI-compat host to list installed Ollama models.
- New routes: GET /me/models (filtered list for users), GET/PATCH /admin/llm, POST /admin/llm/refresh-local.
- llm/index.ts dispatcher now consults the policy on every streamChatWithTools / completeText. EXTERNAL_AI_DISABLED env still wins for the three external providers.
- Boot reads org_settings.local_llm_base_url and sets process.env.LOCAL_LLM_BASE_URL so the local adapter picks it up.
- /user/api-keys GET/PUT removed. user_api_keys table left in place for now; a follow-up migration can drop it once we are confident no encrypted data needs preservation.
- userSettings.getUserApiKeys now returns env-only keys.
- userApiKeys.ts deleted.

Frontend:
- ModelToggle fetches /me/models on mount, dropping the hardcoded catalogue. Empty list prompts the user to ask an admin.
- New /admin/llm page: per-provider toggles, base-URL field, refresh button, curated-model checkboxes.
- /account/models page, ApiKeyMissingModal, modelAvailability lib all removed. apiKeyStatus / apiKeys / saveApiKey stripped from pipApi.ts and UserProfileContext.
- ChatInput, TabularReviewView, TRChatPanel: drop apiKeys plumbing. Backend rejection is now the only gate.
- useSelectedModel: persist whatever the picker emits; ModelToggle reconciles against the live list on mount.
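To make the base-URL fallback and the local discovery step in this commit concrete, here is a hedged sketch. It assumes the configured base URL may carry an OpenAI-style path such as /v1 and that the Ollama tags endpoint returns a JSON object with a models array whose entries have a name field. All function names are hypothetical and none of this is taken from lib/localDiscovery.ts.

```typescript
interface OllamaTagsResponse {
  models?: { name: string }[];
}

interface CuratedModel {
  id: string;
  label: string;
}

// NULL in org_settings falls back to the env value, as the commit describes.
function resolveLocalBaseUrl(dbValue: string | null, envValue: string | undefined): string | undefined {
  return dbValue ?? envValue;
}

// Resolve /api/tags against the origin, so a trailing /v1 on the base URL is dropped.
function tagsUrl(baseUrl: string): string {
  return new URL("/api/tags", baseUrl).toString();
}

// Pull installed model names out of a parsed tags response.
function installedModelNames(body: OllamaTagsResponse): string[] {
  return (body.models ?? []).map((m) => m.name);
}

// Assumed shape for the admin checkboxes: every installed model, ticked if curated.
function adminChoices(installed: string[], curated: CuratedModel[]): { id: string; checked: boolean }[] {
  const curatedIds = new Set(curated.map((m) => m.id));
  return installed.map((id) => ({ id, checked: curatedIds.has(id) }));
}
```

The network call itself is left out on purpose: a caller would fetch tagsUrl(base), parse the JSON, and pass it to installedModelNames, which keeps the parsing testable without a running Ollama host.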
`61689c39`	Follow-ups: IPv6 rate-limit, drop user_api_keys, refresh testing doc	Claude	2026-05-16
commit body
- Rate limiter keyGenerator now calls ipKeyGenerator from express-rate-limit when falling back to IP, which canonicalises IPv6 addresses to a /64 prefix. Closes the ERR_ERL_KEY_GEN_IPV6 warnings printed on every backend boot since the multer 2 / v8 rate-limit upgrade and prevents IPv6 clients bypassing the IP bucket by rotating low-order bits.
- Migration 0020 drops the user_api_keys table. Migration 0019 moved provider configuration to org_settings and the backend no longer reads or writes it; the column held AES-256-GCM ciphertext that never escaped the encrypted-at-rest layer, so a hard drop is acceptable.
- docs/safe-local-testing.md rewritten to reflect the post-Supabase reality (Postgres + Auth.js, AES-encrypted local storage, Admin LLM panel, pip-uninstall.sh). The previous content was the upstream Mike doc and was misleading.
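The idea behind the IPv6 fix can be illustrated without the library. The sketch below is a hand-rolled stand-in, not express-rate-limit's ipKeyGenerator and not the fork's keyGenerator: it collapses an IPv6 address to its first four groups so that every address in the same /64 shares one bucket. The function names and the key format are assumptions.

```typescript
// Expand an IPv6 literal to its eight hex groups, or return null if it does
// not parse. IPv4-mapped forms such as ::ffff:1.2.3.4 are not handled here.
function expandIpv6(addr: string): string[] | null {
  const bare = addr.split("%")[0] ?? addr; // drop any zone id, e.g. %eth0
  const halves = bare.split("::");
  if (halves.length > 2) return null;
  const head = halves[0] ? halves[0].split(":") : [];
  const tail = halves[1] ? halves[1].split(":") : [];
  const missing = 8 - head.length - tail.length;
  const compressed = halves.length === 2;
  if (compressed ? missing < 1 : missing !== 0) return null;
  const groups = [...head, ...new Array<string>(missing).fill("0"), ...tail];
  if (!groups.every((g) => /^[0-9a-f]{1,4}$/i.test(g))) return null;
  // Normalise case and leading zeros so equivalent spellings compare equal.
  return groups.map((g) => parseInt(g, 16).toString(16));
}

// Bucket key: IPv4 addresses as-is, IPv6 addresses by their /64 prefix.
function rateLimitKey(ip: string): string {
  if (!ip.includes(":")) return ip;
  const groups = expandIpv6(ip);
  if (groups === null) return ip; // unparseable input keeps its own bucket
  return groups.slice(0, 4).join(":") + "::/64";
}
```

Two addresses that differ only in their low-order 64 bits, for example 2001:db8:abcd:12::1 and 2001:db8:abcd:12:ffff:ffff:ffff:ffff, map to the same key, so rotating those bits no longer yields a fresh bucket.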
`af2a0d89`	Raise default rate limits for internal firm deployment	Claude	2026-05-16
commit body
The original defaults (300 general / 30 chat / 60 chat-create / 50 upload per 15-minute window) were tuned tight enough that one person debugging a flow could easily hit them and see the generic "Too many requests" message in the browser with no obvious correlation back to what triggered it.

Raise the defaults to numbers that suit normal multi-user firm use without rebuilding when an operator wants to bump them further:

  general      300 -> 1500  per IP, per 15 min
  chat          30 -> 200   per user, per 15 min
  chat-create   60 -> 300   per user, per 15 min
  upload        50 -> 300   per user, per hour

All four remain tunable via the same env var names so deployments that want stricter limits (or stricter for a specific window) can still set them. Documented in .env.compose.example.
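A rough sketch of env-tunable defaults in the spirit of this commit. The default numbers and windows come from the list above; the env var names (RATE_LIMIT_GENERAL and so on) are hypothetical, since the commit does not spell them out, and the env object is passed in as a parameter rather than read from process.env so the sketch stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Read a positive integer from the environment, else fall back to the default.
function limitFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

const FIFTEEN_MIN_MS = 15 * 60 * 1000;
const ONE_HOUR_MS = 60 * 60 * 1000;

// Defaults follow the commit's list; the env var names are made up for this sketch.
function loadLimits(env: Env) {
  return {
    general: { windowMs: FIFTEEN_MIN_MS, max: limitFromEnv(env, "RATE_LIMIT_GENERAL", 1500) },
    chat: { windowMs: FIFTEEN_MIN_MS, max: limitFromEnv(env, "RATE_LIMIT_CHAT", 200) },
    chatCreate: { windowMs: FIFTEEN_MIN_MS, max: limitFromEnv(env, "RATE_LIMIT_CHAT_CREATE", 300) },
    upload: { windowMs: ONE_HOUR_MS, max: limitFromEnv(env, "RATE_LIMIT_UPLOAD", 300) },
  };
}
```

In a Node backend the caller would pass process.env. A malformed or non-positive value falls back to the default instead of disabling the limiter, which is one possible design choice among several.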
