docs: add Phase 12 (Multi-model side-by-side) design

Claude · 2026-05-16 · 876049f9

Adds a compare mode that lets an authorised user run the same prompt
against 2 or 3 models in parallel.

Data model:

- chats gets mode ('standard'|'compare') and compare_models text[].
- chat_messages gets turn_index, branch_index, model_id, and three
  cost-related columns (input_tokens, output_tokens, cost_cents).
- Per-message cost capture is on for every chat from this phase,
  including standard chats. This is a cheap side benefit that feeds
  a new admin Cost summary tab.
- New model_prices reference table, admin-editable.
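
A minimal sketch of how cost_cents could be derived from the token
columns and a model_prices row. The design does not specify the price
unit or rounding rule, so the field names, the per-million-token unit,
and the round-up behaviour below are all assumptions for illustration.

```typescript
// Hypothetical shape of a model_prices row. The field names and the
// "cents per million tokens" unit are assumptions, not from the design.
interface ModelPrice {
  modelId: string;
  inputCentsPerMTok: number;
  outputCentsPerMTok: number;
}

// Token counts as captured on a chat_messages row.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Returns the cost in whole cents. Rounding up is an assumed policy so
// that a small but non-zero call is never recorded as free.
function costCents(price: ModelPrice, usage: Usage): number {
  const raw =
    (usage.inputTokens * price.inputCentsPerMTok +
      usage.outputTokens * price.outputCentsPerMTok) / 1e6;
  return Math.ceil(raw);
}
```

In compare mode the same calculation would run once per branch, so a
turn against three models stores three independent cost_cents values.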

Backend:

- Compare-aware POST /api/chats/:id/messages fans out N provider
  calls with Promise.all and multiplexes SSE chunks tagged with
  branch_index.
- Per-branch regenerate endpoint replaces one assistant row in
  place.
- Fork-to-standard endpoint clones a compare chat's history with
  one branch's responses into a new standard chat.
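
The fan-out described above can be sketched roughly as follows. The
StreamModel callback, the BranchEvent shape, and the JSON-per-frame SSE
encoding are hypothetical; only Promise.all and the branch_index tag
come from the design. Each branch catches its own errors so that one
failing provider does not reject the whole Promise.all.

```typescript
// One multiplexed event, tagged with the branch it belongs to.
interface BranchEvent {
  branchIndex: number;
  type: "delta" | "done" | "error";
  text?: string;
}

// Hypothetical provider call: streams text chunks for one model through
// onChunk and resolves when the response is complete.
type StreamModel = (
  modelId: string,
  prompt: string,
  onChunk: (text: string) => void,
) => Promise<void>;

// Runs one provider call per model in parallel and emits every chunk
// tagged with the index of the model in the compare set.
async function fanOut(
  models: string[],
  prompt: string,
  streamModel: StreamModel,
  emit: (event: BranchEvent) => void,
): Promise<void> {
  await Promise.all(
    models.map(async (modelId, branchIndex) => {
      try {
        await streamModel(modelId, prompt, (text) =>
          emit({ branchIndex, type: "delta", text }),
        );
        emit({ branchIndex, type: "done" });
      } catch (err) {
        // Isolate the failure to this branch; the others keep streaming.
        emit({ branchIndex, type: "error" });
      }
    }),
  );
}

// Encodes an event as a single SSE frame (a data line plus a blank line).
function toSseFrame(event: BranchEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```

A route handler would pass an emit function that writes
toSseFrame(event) to the response, and the client would route each
frame to a column by branchIndex.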

Frontend:

- Mode toggle on the new-chat composer, with a multi-select model
  picker showing availability per model and a note that cost scales
  with the number of models selected.
- Chat view renders user turns full-width and assistant turns as
  N columns side by side on desktop, tabs on mobile.
- Per-column streaming with regen and keep-this actions, error
  state with retry.
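
One way the chat view could derive its layout from the stored rows is
to group messages by turn_index and order the assistant rows by
branch_index. This is only a sketch: the ChatMessage and Turn shapes
are hypothetical, and it assumes user rows carry branch index 0.

```typescript
// Hypothetical client-side view of a chat_messages row.
interface ChatMessage {
  role: "user" | "assistant";
  turnIndex: number;
  branchIndex: number; // assumption: 0 for user messages
  content: string;
}

// One rendered turn: a full-width user message plus N assistant columns.
interface Turn {
  turnIndex: number;
  user: ChatMessage | null;
  branches: ChatMessage[]; // ordered by branchIndex, one per column
}

function groupIntoTurns(messages: ChatMessage[]): Turn[] {
  const byTurn = new Map<number, Turn>();
  for (const m of messages) {
    let turn = byTurn.get(m.turnIndex);
    if (!turn) {
      turn = { turnIndex: m.turnIndex, user: null, branches: [] };
      byTurn.set(m.turnIndex, turn);
    }
    if (m.role === "user") {
      turn.user = m;
    } else {
      turn.branches.push(m);
    }
  }
  const turns = Array.from(byTurn.values());
  turns.sort((a, b) => a.turnIndex - b.turnIndex);
  for (const t of turns) {
    t.branches.sort((a, b) => a.branchIndex - b.branchIndex);
  }
  return turns;
}
```

The same Turn structure can back both layouts: columns on desktop and
tabs on mobile differ only in how branches is rendered.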

Policy and gating:

- org_settings.allow_compare_mode (default off),
  compare_mode_max_models (cap 3, must be 2 or 3),
  compare_mode_admin_only (default on).
- New chats.compare capability seeded into the Phase 11
  permissions catalogue with owner+admin default.
- allow_external_models and EXTERNAL_AI_DISABLED both gate which
  models appear in the picker. Disallowed models are not
  selectable.
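
The gating rules above could be combined into a single server-side
check along these lines. The function name, the "member" role, the
denial messages, and the way compare_mode_admin_only interacts with
the chats.compare capability are assumptions; the design only lists
the settings and their defaults.

```typescript
// Settings named in the design; the types are assumed.
interface OrgSettings {
  allow_compare_mode: boolean;
  compare_mode_max_models: number; // 2 or 3
  compare_mode_admin_only: boolean;
}

// "owner" and "admin" come from the design; "member" is a placeholder
// for any other role.
type Role = "owner" | "admin" | "member";

// Returns null when the request is allowed, otherwise a reason string.
function compareDenialReason(
  settings: OrgSettings,
  role: Role,
  hasCompareCapability: boolean,
  models: string[],
): string | null {
  if (!settings.allow_compare_mode) {
    return "compare mode is disabled for this organisation";
  }
  if (settings.compare_mode_admin_only && role === "member") {
    return "compare mode is limited to owners and admins";
  }
  if (!hasCompareCapability) {
    return "missing chats.compare capability";
  }
  if (new Set(models).size !== models.length) {
    return "duplicate models selected";
  }
  if (models.length < 2 || models.length > settings.compare_mode_max_models) {
    return "model count outside the allowed range";
  }
  return null;
}
```

Filtering by allow_external_models and EXTERNAL_AI_DISABLED would
happen before this check, when building the list of selectable models.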

Tools in compare mode are disabled in Phase 12. Memory injection
still happens (read-only path); add_memory cannot fire because
tools are off.

Single migration 0025_compare_mode.sql with one-shot backfill of
turn_index for existing chats and seeded model_prices entries.
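
The backfill rule is not spelled out here. One plausible rule,
sketched below, is that each user message opens a new turn and the
assistant messages that follow share its index. The function name, the
0-based numbering, and the handling of a leading assistant message are
assumptions; the real migration would express the same idea in SQL.

```typescript
type MessageRole = "user" | "assistant";

// Given the roles of one chat's messages in creation order, returns the
// turn_index each message would receive under the assumed rule.
function assignTurnIndexes(roles: MessageRole[]): number[] {
  const result: number[] = [];
  let turn = -1;
  for (const role of roles) {
    if (role === "user") {
      turn += 1; // every user message starts a new turn
    }
    // An assistant message before any user message falls into turn 0.
    result.push(Math.max(turn, 0));
  }
  return result;
}
```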

Five open questions parked for review: per-branch regen during
streaming, RAG cost attribution (deferred to Phase 13), locked
model set, per-user concurrency cap, mobile UX.

Repository: cpatpa/PIP
Author: Claude <noreply@anthropic.com>
Parents: f96b450a
Stats: 1 file changed, +578
Part of: Phases 10-14 - design docs for web search, groups, multi-model, vector RAG, knowledge collections