Pre-Phase-7: routing resolver lands before connectors need it

Rather than bake model selection into each connector at ingest time, Archibald312 landed the routing surface first. Local inference (Ollama/vLLM) is deferred to post-launch; the resolver that will eventually dispatch to it is already there, tested, and audited. Every chat today still resolves to the user's requested model - the columns are nullable, so behavior is unchanged.

infrastructure, compliance

Two nullable model_preference text columns - one on projects, one on documents - are the entire schema change. The migration is in backend/migrations/model_routing_seam.sql.

backend/src/lib/llm/routing.ts implements resolveModelRouting() in 156 lines. Precedence is fixed: document-level first, then project-level, then the caller's requested model. If two documents in the same request carry different preferences, the resolver does not block: it captures the conflict with a structured reason, takes the first non-null value, and records the decision. Unknown model IDs are rejected with a recorded reason. All outcomes are written to the routing_policy_applied jsonb column in audit_log, which Phase 6 provisioned for exactly this purpose.
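The precedence rules above can be sketched as a pure function. This is a minimal illustration, not the code in routing.ts: the type names, the resolveRoutingSketch function, the shape of the decision object, and the choice to fall through to the next layer after an unknown-model rejection are all assumptions made for the example. The requested model is passed through unvalidated here because the writeup does not say how it is checked.

```typescript
// Illustrative sketch only: every name and shape below is hypothetical.

type RoutingSource = "document" | "project" | "request";

interface RoutingDecision {
  model: string;
  source: RoutingSource;
  conflicts: string[];  // reasons recorded when document preferences disagree
  rejections: string[]; // reasons recorded when a preference names an unknown model
}

interface RoutingInput {
  requestedModel: string;
  projectPreference: string | null;
  documentPreferences: Array<string | null>; // in request order
  knownModels: ReadonlySet<string>;
}

function resolveRoutingSketch(input: RoutingInput): RoutingDecision {
  const conflicts: string[] = [];
  const rejections: string[] = [];

  // Accept a preference only if it is non-null and names a known model;
  // unknown models are recorded and treated as "no preference" (assumption).
  const accept = (pref: string | null, layer: string): string | null => {
    if (pref === null) return null;
    if (!input.knownModels.has(pref)) {
      rejections.push(`${layer}: unknown model "${pref}"`);
      return null;
    }
    return pref;
  };

  // 1. Document level: first usable preference wins, disagreement is recorded.
  const documentPrefs = input.documentPreferences
    .map((p) => accept(p, "document"))
    .filter((p): p is string => p !== null);
  const firstDocumentPref = documentPrefs[0];
  if (firstDocumentPref !== undefined) {
    const distinct = Array.from(new Set(documentPrefs));
    if (distinct.length > 1) {
      conflicts.push(`documents disagree (${distinct.join(", ")}); using first`);
    }
    return { model: firstDocumentPref, source: "document", conflicts, rejections };
  }

  // 2. Project level.
  const projectPref = accept(input.projectPreference, "project");
  if (projectPref !== null) {
    return { model: projectPref, source: "project", conflicts, rejections };
  }

  // 3. Fall back to the caller's requested model.
  return { model: input.requestedModel, source: "request", conflicts, rejections };
}
```

With every preference null, the function returns the requested model with source "request", which matches the unchanged-behavior claim for today's traffic.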

streamChatWithTools accepts an optional routing context, dispatches against the resolved model, and records the resolution into the audit row. The main chat handler passes the project ID and the document IDs already in the in-memory document index. Because no preferences are set anywhere, every live request still routes to the user's requested model.
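To make the dispatch wiring concrete, here is a rough sketch of how an optional routing context might feed both the dispatched model and the audit row. The real streamChatWithTools signature is not shown in the writeup, so planDispatch, RoutingContext, AuditRowSketch, and the requested_model field are hypothetical; only routing_policy_applied is a name taken from the source. The resolver is passed in as a synchronous function to keep the example small, whereas the real one presumably reads from the database.

```typescript
// Illustrative sketch only: names and shapes are assumptions, not the project's API.

interface RoutingContext {
  projectId: string;
  documentIds: string[];
}

interface ResolvedRouting {
  model: string;
  source: "document" | "project" | "request";
}

interface AuditRowSketch {
  requested_model: string;                 // hypothetical column
  routing_policy_applied: ResolvedRouting; // jsonb column named in the writeup
}

type Resolver = (requestedModel: string, ctx: RoutingContext) => ResolvedRouting;

function planDispatch(
  requestedModel: string,
  resolver: Resolver,
  routing?: RoutingContext,
): { model: string; audit: AuditRowSketch } {
  // Without a routing context, dispatch the requested model exactly as before.
  const resolved: ResolvedRouting = routing
    ? resolver(requestedModel, routing)
    : { model: requestedModel, source: "request" };

  // The same resolution drives the dispatch and is recorded for audit.
  return {
    model: resolved.model,
    audit: { requested_model: requestedModel, routing_policy_applied: resolved },
  };
}
```

Keeping the routing context optional is what lets existing callers stay untouched: a caller that passes nothing gets the pre-change behavior.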

The test coverage for an "early seam" PR is unusually thorough: 166 lines across 8 tests covering document-over-project precedence, project-over-request precedence, conflict capture, unknown-model rejection at both layers, DB error tolerance, and the no-document-IDs skip path. Two manual checks in the test plan - a document-level override that audits correctly end-to-end, and a no-document chat that records a request-sourced resolution - were left unchecked at merge.
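Two of the tested paths, DB error tolerance and the no-document-IDs skip, can be illustrated with a small loader. This is a sketch under assumptions: loadDocumentPreferencesSketch, PreferenceQuery, and LoadResult are hypothetical names, and the behavior on a failed query (return no preferences and keep the error message so the caller can fall back to the next layer) is one plausible reading of "tolerance", not a description of the actual tests.

```typescript
// Illustrative sketch only: the query function stands in for whatever DB client is used.

type PreferenceQuery = (documentIds: string[]) => Promise<Array<string | null>>;

interface LoadResult {
  preferences: Array<string | null>;
  skipped: boolean;     // true when no query was issued
  error: string | null; // message from a failed query, kept for audit
}

async function loadDocumentPreferencesSketch(
  documentIds: string[],
  query: PreferenceQuery,
): Promise<LoadResult> {
  // Skip path: with no document IDs there is nothing to look up.
  if (documentIds.length === 0) {
    return { preferences: [], skipped: true, error: null };
  }
  try {
    const preferences = await query(documentIds);
    return { preferences, skipped: false, error: null };
  } catch (err) {
    // Tolerate the failure: report no preferences instead of failing the chat.
    const message = err instanceof Error ? err.message : String(err);
    return { preferences: [], skipped: false, error: message };
  }
}
```

A caller that receives an empty preference list, for either reason, would continue to the project level and then to the requested model.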

Phase 7 connectors set documents.model_preference at ingest. Phase 14 adds a local-inference adapter as another target. Neither requires revisiting dispatch sites.

So what

Worth importing if your fork will need per-source model routing, per-document data-sensitivity gating, or eventually local inference. Low risk to add: nullable columns, zero behavior change, 8 tests pinning the precedence rules. The cost is wiring your chat dispatch to pass project and document IDs to the resolver, which touches the request handlers but does not go especially deep. Skip if your fork has no multi-model or routing requirements.

Commits in this thread

1 commit from Archibald312/GordonOSS, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`c549ff21`	Pre-Phase-7: per-source LLM routing seam (#7)	Archibald312	2026-05-15

commit body

* Pre-Phase-7: per-source LLM routing seam

Defer original Phase 7 (local inference / Ollama / vLLM) to post-launch
and land the per-source LLM routing surface ahead of Phase 7 connectors,
so that connectors and the eventual local-inference adapter plug into the
same decision point without touching dispatch sites. See decisions.md
(2026-05-15) and the renumbered build plan in CLAUDE.md.

- model_preference (nullable) columns on projects + documents
- backend/src/lib/llm/routing.ts: resolveModelRouting() with doc → project
  → request precedence, conflicts and unknown-model rejections captured
  for audit
- streamChatWithTools accepts optional routing context, dispatches with
  the resolved model, and records the policy into the existing
  audit_log.routing_policy_applied jsonb column
- main chat path wired to pass project + document IDs
- 8 unit tests covering precedence, conflicts, rejections, and db errors

Backend tsc --noEmit clean; Vitest 73/73 passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* lint: const documentPrefs in routing resolver

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
