cpatpa hardens Mike for offline-only deployments

A seven-commit bug hunt exposes what breaks when a fork runs purely on local AI with no cloud providers allowed.

chat-ui, infrastructure

cpatpa's fork is being shaped for installs where outside AI services are switched off and only a local model runs on the firm's own hardware - the kind of setup a privacy-conscious practice or a regulated in-house team would actually deploy. The problem: chat kept refusing to answer, claiming external providers were disabled, even though the user had clearly picked a local model.

The team peeled the onion. They tightened error messages so failures named the offending model and its provider. They fixed title generation and table extraction, which had been quietly hardcoded to reach for cloud models. They closed a browser-side race where Send could fire before the model list had loaded. And they made the chat engine substitute a working local model instead of crashing when handed a stale one. The actual root cause turned out to be a single hardcoded list, the ALL_MODELS set checked by resolveModel, which contained only external models, so any local/* id was silently swapped back to the cloud default.

So what: Anyone evaluating Mike for an on-prem or air-gapped deployment should watch this fork - it's becoming the version that actually behaves when the cloud is turned off.

Commits in this thread

7 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA Subject Author Date
953bd2a3 assertModelAllowed: include the offending model id in error Claude 2026-05-16
commit body
When the chat dispatch rejects a request because the resolved
provider is gated off, the error message was generic ("External
AI providers are disabled by organisation policy.") with no
indication of which model id triggered it. That makes
"why isn't my chat working?" issues much harder to diagnose
when the UI claims one model is selected but the request body
carries another (stale localStorage, default-model fallback, etc.).

Append the model id and resolved provider to the message so the
backend log line names the actual culprit.
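
A minimal sketch of what such a gate might look like once the model id and provider are appended to the message. The `LlmPolicy` shape, the `externalDisabled` field, and the `providerFor` prefix mapping are assumptions made for illustration; only the function name and the message wording come from the commit log.

```typescript
// Hypothetical shapes: the fork's real LlmPolicy and provider list are not shown here.
type Provider = "local" | "google" | "openai" | "anthropic";

interface LlmPolicy {
  externalDisabled: boolean; // assumed to mirror EXTERNAL_AI_DISABLED=true
}

// Assumed prefix-based mapping from model id to provider (illustration only).
function providerFor(modelId: string): Provider {
  if (modelId.startsWith("local/")) return "local";
  if (modelId.startsWith("gemini")) return "google";
  if (modelId.startsWith("claude")) return "anthropic";
  return "openai";
}

// Hard gate: throws when the resolved provider is external and policy forbids it.
// The error names the model id and provider so the log line identifies the culprit.
function assertModelAllowed(modelId: string, policy: LlmPolicy): void {
  const provider = providerFor(modelId);
  if (provider !== "local" && policy.externalDisabled) {
    throw new Error(
      "External AI providers are disabled by organisation policy. " +
        `(model='${modelId}', provider='${provider}')`,
    );
  }
}
```

With the id in the message, a mismatch between the model shown in the picker and the one carried in the request body becomes visible from a single log line.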
afc21cea Title + tabular: fall back to local LLM when externals disabled Claude 2026-05-16
commit body
Title generation was hardcoded to pick from Gemini / OpenAI nano /
Claude Haiku regardless of policy. On a deployment where
EXTERNAL_AI_DISABLED=true and local Ollama is the only provider,
every title call (and every tabular call) threw "External AI
providers are disabled by organisation policy", surfacing as a
500 on POST /chat/<id>/generate-title even though the chat itself
worked.

userSettings.resolveTitleModel and a new resolveTabularModel now
consult the LlmPolicy: prefer external when explicitly allowed
AND a key is present, otherwise fall back to the first curated
local model (`local/<id>`), otherwise return null.

Callers updated:
- chat.generate-title: when title_model is null, write a trimmed
  message snippet as the title and return 200. Best-effort; the
  red error toast no longer fires when chat itself is fine.
- tabular.compose-column-prompt and the two cell-extraction paths:
  503 with a clear "ask an admin" message instead of throwing.
- tabular chat-title generation: skip silently rather than crash
  the tabular chat exchange.
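
The resolution order described above (external if allowed and keyed, else first curated local model, else null) can be sketched as follows. The policy field names, the placeholder external model id, and the 60-character snippet length are all assumptions, not values taken from the fork.

```typescript
// Hypothetical policy shape; field names are assumptions for illustration.
interface LlmPolicy {
  externalAllowed: boolean;
  externalKeyPresent: boolean;
  curatedLocalModels: string[]; // ids without the "local/" prefix
}

// Placeholder id; the real external title model is chosen elsewhere.
const ASSUMED_EXTERNAL_TITLE_MODEL = "external-title-model";

function resolveTitleModel(policy: LlmPolicy): string | null {
  // Prefer external only when explicitly allowed AND a key is present.
  if (policy.externalAllowed && policy.externalKeyPresent) {
    return ASSUMED_EXTERNAL_TITLE_MODEL;
  }
  // Otherwise fall back to the first curated local model, or null if none.
  const firstLocal = policy.curatedLocalModels[0];
  return firstLocal !== undefined ? `local/${firstLocal}` : null;
}

// Best-effort title used when no model is available: a trimmed message snippet.
function fallbackTitle(message: string, maxLen = 60): string {
  const flat = message.trim().replace(/\s+/g, " ");
  return flat.length <= maxLen ? flat : flat.slice(0, maxLen).trimEnd() + "...";
}
```

A caller would use `fallbackTitle` only when `resolveTitleModel` returns null, which keeps title generation from failing a request whose chat reply succeeded.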
30fa8010 Fix race: chat dispatched stale model id before picker loaded Claude 2026-05-16
commit body
If a user clicked Send on a freshly-loaded chat before /me/models
resolved, useSelectedModel still held the legacy default
(`gemini-3-flash-preview` or a stale localStorage value). The
backend's assertModelAllowed then rejected with "External AI
providers are disabled" even though the picker UI was about to
auto-fall-back to a valid local model. The picker just hadn't
caught up yet.

Move the validation into useSelectedModel itself: on mount it
fetches the live available-models list, validates the stored
selection against it, and only THEN emits a usable id. Returns a
third element `ready` so callers can disable Send while we don't
yet have a verified-valid selection.

ChatInput now early-returns on submit when !modelReady, and the
Send button is disabled until the list has resolved. ModelToggle's
own fallback-onChange is removed since useSelectedModel handles it
authoritatively (avoids double-emit churn).
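
The validate-before-emit idea can be sketched as a pure function, leaving out the React hook wiring. `validateSelection`, `canSubmit`, and the choice to fall back to the first available model are hypothetical; the commit only states that the stored selection is validated against the live list and that a `ready` flag gates Send.

```typescript
interface SelectionState {
  model: string | null;
  ready: boolean; // true only once the selection is verified against the live list
}

// `available === null` stands for "the /me/models request has not resolved yet".
function validateSelection(
  stored: string | null,
  available: string[] | null,
): SelectionState {
  if (available === null) return { model: null, ready: false };
  if (stored !== null && available.includes(stored)) {
    return { model: stored, ready: true };
  }
  // Stored value is stale or missing: assume a fall back to the first available model.
  const fallback = available[0];
  return fallback !== undefined
    ? { model: fallback, ready: true }
    : { model: null, ready: false };
}

// Submit guard in the spirit of ChatInput's early return on !modelReady.
function canSubmit(state: SelectionState, text: string): boolean {
  return state.ready && text.trim().length > 0;
}
```

Because no id is emitted until the list has loaded, a stale default such as `gemini-3-flash-preview` cannot reach the request body in this sketch.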
532738c5 chat: tolerant model resolution instead of throwing on stale ids Claude 2026-05-16
commit body
A user-visible chat would hard-fail with "External AI providers are
disabled by organisation policy. (model='gemini-3-flash-preview', ...)"
when the request body carried a model id that's no longer allowed
by the current policy. This happened most reliably on the
auto-send path from InitialView -> /assistant/chat/<id>: the new
chat's queued first message has whatever model the InitialView's
ChatInput held at submit time, and that can be stale if the bundle
was loaded before /admin/llm was configured.

Add resolveAllowedModel(requested, policy) to llmPolicy: returns
the requested id if allowed, otherwise the first allowed id, or
null when no provider is enabled at all.

POST /chat and POST /projects/:id/chat now resolve before
dispatching. Substitutions are logged so admins can see them. A
truly empty policy (no provider enabled) returns 503 with a clear
"ask an admin" message rather than a generic 500.

assertModelAllowed is still used by streamChatWithTools as the
hard gate for direct calls (e.g. tabular extraction); the chat
streaming path now never reaches it with a forbidden id.
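
A rough sketch of the resolver and how a route might act on its result. The `allowedModels` field, the `resolveForRoute` helper, the `RouteResult` type, and the exact log and error wording are assumptions; the commit specifies only the three outcomes (requested id, first allowed id, or null) and the 503 on an empty policy.

```typescript
// Hypothetical policy shape: a flat, ordered list of currently allowed model ids.
interface LlmPolicy {
  allowedModels: string[];
}

// Returns the requested id if allowed, otherwise the first allowed id,
// or null when nothing is enabled at all.
function resolveAllowedModel(requested: string, policy: LlmPolicy): string | null {
  if (policy.allowedModels.includes(requested)) return requested;
  const first = policy.allowedModels[0];
  return first === undefined ? null : first;
}

type RouteResult =
  | { status: 200; model: string; substituted: boolean }
  | { status: 503; error: string };

// Route-level use: resolve before dispatching, log substitutions, 503 on empty policy.
function resolveForRoute(requested: string, policy: LlmPolicy): RouteResult {
  const resolved = resolveAllowedModel(requested, policy);
  if (resolved === null) {
    return { status: 503, error: "No AI provider is enabled. Ask an admin to enable one." };
  }
  const substituted = resolved !== requested;
  if (substituted) {
    console.warn(`[chat/stream] substituting '${requested}' -> '${resolved}'`);
  }
  return { status: 200, model: resolved, substituted };
}
```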
dc31b293 LLM dispatcher: self-heal stale model ids at the deepest layer Claude 2026-05-16
commit body
The chat dispatch had been crashing with "External AI providers
are disabled (model='gemini-3-flash-preview', provider='google')"
even after the chat.ts route added a tolerant resolveAllowedModel
substitution. That route-level fix only covers /chat and
/projects/:id/chat; tabular and any future caller would still
trip the original assertModelAllowed.

Move the resolve logic INTO streamChatWithTools and completeText.
Now any caller that hands the dispatcher a stale model id sees
the dispatcher substitute the first allowed model, log a warning,
and proceed. The only way to surface an error to the user is
truly empty policy (no provider enabled), which throws with a
clear "ask an admin" message.

Net effect:
- /chat, /projects/:id/chat: route-level resolve runs first, no
  change for happy path.
- Tabular review and any other path that calls runLLMStream /
  streamChatWithTools / completeText: now also tolerant.
- Title generation: already nullable; falls through here too.

The assertModelAllowed helper stays in llmPolicy.ts for callers
that genuinely want a hard gate, but the LLM dispatcher no longer
uses it on the streaming chat path.
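
One way to picture the dispatcher-level self-heal: every call passes through a single function that either substitutes with a warning or throws on an empty policy. `healModel`, `completeTextTolerant`, the `Backend` type, and the `allowedModels` field are hypothetical names used only for this sketch; the warning text follows the log line quoted later in this thread.

```typescript
// Hypothetical policy shape and backend signature, for illustration only.
interface LlmPolicy {
  allowedModels: string[];
}
type Backend = (model: string, prompt: string) => Promise<string>;

// Substitutes the first allowed model for a stale id and warns about it.
// Throws only when the policy is truly empty (no provider enabled).
function healModel(
  requested: string,
  policy: LlmPolicy,
  warn: (msg: string) => void = console.warn,
): string {
  if (policy.allowedModels.includes(requested)) return requested;
  const first = policy.allowedModels[0];
  if (first === undefined) {
    throw new Error("No AI provider is enabled. Ask an admin to enable one.");
  }
  warn(`[llm] requested model '${requested}' not allowed; substituting '${first}'`);
  return first;
}

// Any caller (chat, tabular, titles) gets the same tolerance by going through here.
async function completeTextTolerant(
  model: string,
  prompt: string,
  policy: LlmPolicy,
  backend: Backend,
): Promise<string> {
  return backend(healModel(model, policy), prompt);
}
```

Placing the substitution at this layer means a new caller does not need its own route-level resolve step to avoid the hard failure.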
4a5fa1f0 chat: add a one-line trace log for model resolution Claude 2026-05-16
commit body
There is a deployment in the field where the route-level
resolveAllowedModel demonstrably substitutes 'gemini-3-flash-preview'
-> 'local/qwen3-next:80b' when invoked from a test harness against
the same policy, yet the same request flow still throws
"External AI providers are disabled (model='gemini-3-flash-preview')"
at runtime with no [chat/stream] substituting warning in the log.

Add a single console.log that prints the body model, parsed model,
resolved model, and the two key policy booleans on every chat
stream. One log line per request, so the disconnect surfaces
unambiguously the next time it bites.

Remove the log once the cause is identified.
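
A sketch of what such a one-line trace could look like. The field names, the two boolean names (`externalDisabled`, `localEnabled`), and the output format are assumptions; the commit says only that the line carries the body model, parsed model, resolved model, and two policy booleans.

```typescript
// Hypothetical trace payload; names are illustrative, not the fork's.
interface ModelTrace {
  bodyModel: unknown; // raw value from the request body, before parsing
  parsedModel: string;
  resolvedModel: string | null;
  externalDisabled: boolean;
  localEnabled: boolean;
}

// Builds a single greppable line so each request produces exactly one trace entry.
function formatModelTrace(t: ModelTrace): string {
  return (
    `[chat/stream] body=${String(t.bodyModel)} parsed=${t.parsedModel} ` +
    `resolved=${String(t.resolvedModel)} ` +
    `externalDisabled=${t.externalDisabled} localEnabled=${t.localEnabled}`
  );
}

// Usage: console.log(formatModelTrace({ ... })) once per chat stream request.
```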
ac3ff6a9 resolveModel: accept local/* model ids Claude 2026-05-16
commit body
runLLMStream in chatTools.ts re-resolved its incoming model via
resolveModel(model, DEFAULT_MAIN_MODEL). resolveModel checked the
hardcoded ALL_MODELS set, which only contained the external
provider catalogue (claude, gemini, openai variants) -- not the
local/* prefix used for Ollama models. So even after chat.ts had
correctly substituted the request body's stale 'gemini-3-flash-preview'
into 'local/llama3.2:3b', runLLMStream's internal resolveModel
silently kicked it BACK to 'gemini-3-flash-preview' (the
DEFAULT_MAIN_MODEL fallback), which then tripped the deepest-layer
self-heal we added in llm/index.ts and produced
"[llm] requested model 'gemini-3-flash-preview' not allowed;
substituting 'local/qwen3-next:80b'" -- not the model the user
actually picked.

The fix is local-aware: treat any id starting with `local/` as a
valid model the caller already validated, so resolveModel passes
it through instead of forcing the gemini fallback.

Also drop the debug trace log from chat.ts now that the cause is
identified.
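
The local-aware check might look roughly like this. The contents of `ALL_MODELS` here are abbreviated to a single id taken from the commit text, and the function signature is an assumption based on the call `resolveModel(model, DEFAULT_MAIN_MODEL)` quoted above.

```typescript
// External catalogue only; abbreviated here to one id mentioned in the commit log.
const ALL_MODELS = new Set<string>(["gemini-3-flash-preview"]);
const DEFAULT_MAIN_MODEL = "gemini-3-flash-preview";

// Assumed signature: returns a usable model id, or the fallback for unknown ids.
function resolveModel(model: string | undefined, fallback: string): string {
  if (model === undefined) return fallback;
  // Local ids were already validated by the caller, so pass them through
  // instead of forcing the external default.
  if (model.startsWith("local/")) return model;
  return ALL_MODELS.has(model) ? model : fallback;
}
```

Without the `local/` branch, `resolveModel("local/llama3.2:3b", DEFAULT_MAIN_MODEL)` would return the Gemini default in this sketch, which is the behaviour the commit describes as the root cause.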
