cpatpa lays out the next five moves before writing a line of code

A design-doc dump that maps where this fork wants to take search, permissions, model comparison, retrieval, and document collections.

knowledge-management, search

Five docs-only commits spell out the architecture for the fork's next phases. The first four cover: web search with a hardened defence against server-side request forgery (servers being tricked into fetching internal addresses); a real groups-and-permissions system to replace the current ad-hoc sharing; side-by-side comparison of two or three models on the same prompt, with per-message cost capture turned on for every chat; and a foundational retrieval upgrade using semantic search over chunked documents, blending meaning-based and keyword scoring.

The capstone is knowledge collections - named groupings of documents users can point a chat or review at. The rule cpatpa commits to: putting a document in a collection never widens who can read it. A shared posture runs through all five - conservative defaults at the org level, narrowing-only overrides per workspace, audit events on every meaningful action, and a kill-switch for rollback.

So what: legal-ops leads evaluating Mike-based products should read these as a credible blueprint for how a firm-grade deployment ought to handle permissions and retrieval, even before the code lands.

Commits in this thread

5 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
fb928a96 | docs: add Phase 10 (Web search) design | Claude | 2026-05-16
Commit body:
Covers two LLM tools (web_search, fetch_url) with SearXNG as the
recommended self-hosted default and Brave Search as an external
alternative. Provider abstraction lives in
backend/src/lib/websearch/.

Compliance posture:

- allow_web_search master switch defaults off, with per-workspace
  override (narrow-only on blocklist).
- Domain allowlist and blocklist at both org and workspace levels.
- Audit event for every search and every fetch, including the query
  text; URL only for fetches (body lives in the fetch cache for its
  TTL).
- Banner in chat when web search is active, alongside the existing
  external-AI banner.
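
One way to read the narrow-only rule above, as a sketch only: a workspace can switch web search off or add blocklist entries, but it can never re-enable search or remove an org-level block. The names below (`OrgWebPolicy`, `WorkspaceWebOverride`, `resolveWebPolicy`, `isDomainBlocked`) are hypothetical, and the domain-matching rule is an assumption; the commit only names the `allow_web_search` switch and the two list levels.

```typescript
// Hypothetical shapes: the design names allow_web_search and the
// blocklists, but not these TypeScript types.
interface OrgWebPolicy {
  allowWebSearch: boolean; // org master switch, defaults off
  blocklist: string[];
}

interface WorkspaceWebOverride {
  allowWebSearch?: boolean; // undefined = inherit from the org
  blocklist?: string[];
}

interface EffectiveWebPolicy {
  allowWebSearch: boolean;
  blocklist: string[];
}

// Narrow-only merge: a workspace can only remove capability.
function resolveWebPolicy(
  org: OrgWebPolicy,
  ws: WorkspaceWebOverride,
): EffectiveWebPolicy {
  const merged = org.blocklist.concat(ws.blocklist ?? []);
  return {
    // AND, so a workspace "true" cannot override an org "false".
    allowWebSearch: org.allowWebSearch && (ws.allowWebSearch ?? true),
    // Union (deduplicated), so workspace entries add blocks and never drop org ones.
    blocklist: merged.filter((d, i) => merged.indexOf(d) === i),
  };
}

// Assumed matching rule: an entry matches the host itself and any
// subdomain; a leading "*." on the entry is ignored.
function isDomainBlocked(host: string, blocklist: string[]): boolean {
  const h = host.toLowerCase();
  return blocklist.some((entry) => {
    const e = entry.toLowerCase().replace(/^\*\./, "");
    return h === e || h.endsWith("." + e);
  });
}
```

Under this reading, an org with the switch off stays off whatever a workspace sets, which matches the "defaults off" posture.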

SSRF defence:

- Scheme/port allowlist (http/https on 80/443 only).
- DNS resolution check that blocks RFC1918, loopback, link-local,
  ULA, cloud metadata addresses, re-checked on each redirect.
- Content-type allowlist (text/html, text/plain, application/pdf,
  application/json).
- Hard caps on bytes, chars, time, and redirect count.
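
The address check in the list above could look roughly like the following. This is a minimal sketch, not the fork's implementation: `isBlockedAddress` and its helpers are hypothetical names, the function expects an already-resolved IP string in compressed form, and it would need to run again on every redirect hop as the commit describes. Blocking 0.0.0.0/8 is an addition beyond the ranges the commit lists.

```typescript
// Parse a dotted-quad IPv4 string into four octets, or null if malformed.
function parseIPv4(addr: string): number[] | null {
  const parts = addr.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const p of parts) {
    if (!/^\d{1,3}$/.test(p)) return null;
    const n = Number(p);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

function isBlockedIPv4(o: number[]): boolean {
  const a = o[0] ?? 0;
  const b = o[1] ?? 0;
  return (
    a === 0 ||                           // 0.0.0.0/8 (assumption, not in the commit's list)
    a === 10 ||                          // RFC1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // RFC1918 172.16.0.0/12
    (a === 192 && b === 168) ||          // RFC1918 192.168.0.0/16
    a === 127 ||                         // loopback 127.0.0.0/8
    (a === 169 && b === 254)             // link-local, includes 169.254.169.254 metadata
  );
}

// Returns true when a resolved address must not be fetched.
// Fails closed: anything that cannot be parsed is treated as blocked.
function isBlockedAddress(addr: string): boolean {
  const lower = addr.toLowerCase();
  if (lower.includes(":")) {
    // IPv4-mapped IPv6 such as ::ffff:10.0.0.1 is checked as IPv4.
    if (lower.includes(".")) {
      const v4 = parseIPv4(lower.slice(lower.lastIndexOf(":") + 1));
      return v4 === null || isBlockedIPv4(v4);
    }
    if (!/^[0-9a-f:]+$/.test(lower)) return true;
    if (lower === "::" || lower === "::1") return true; // unspecified, loopback
    const first = lower.startsWith(":")
      ? 0
      : parseInt(lower.split(":")[0] ?? "", 16);
    if (Number.isNaN(first)) return true;
    return (
      (first & 0xffc0) === 0xfe80 || // link-local fe80::/10
      (first & 0xfe00) === 0xfc00    // ULA fc00::/7
    );
  }
  const v4 = parseIPv4(lower);
  return v4 === null || isBlockedIPv4(v4);
}
```

Checking the resolved address rather than the hostname is what makes the re-check on each redirect meaningful: a public hostname can redirect to, or resolve to, an internal address.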

Two new Postgres cache tables, ten new org_settings columns, three
new workspaces columns, one chats column for the per-chat toggle.
Single migration 0023_web_search.sql.

Includes rollout plan in five steps, risk matrix, four open
questions parked for review, and full acceptance criteria.

f96b450a | docs: update Phase 10 defaults, add Phase 11 (Groups) design | Claude | 2026-05-16
Commit body:
Phase 10 changes:

- Default provider switched from SearXNG to Brave Search per
  decision. SearXNG remains as an alternative for operators who
  prefer self-hosted.
- Blocklist is now the primary filtering mechanism with a seeded
  starter list (paste sites, *.onion, unmoderated forums). Allowlist
  demoted to "advanced" use for tightly scoped configurations.
- Migration default for web_search_provider flipped to 'brave'.
- Compose changes are now SearXNG-only and opt-in via profile.
- Open question reframed to Brave plan choice (commercial use).

Phase 11 (Groups and granular permissions) added:

- Four new tables: groups, group_members, permissions (catalogue),
  group_permissions.
- project_members and review_members extended with nullable group_id
  alongside user_id, enforced by check constraint.
- Two auto-managed system groups per workspace (All members,
  Admins) maintained by triggers on workspace_members.
- Capability set seeded by migration with default_for_role mapping
  and admin_locked flags.
- Effective permissions resolved per request via lib/access.ts and
  exposed as req.can('capability.key').
- Sharing modal accepts users or groups via a unified principal
  search endpoint.
- Audit events for group lifecycle and share/unshare with principal
  kind metadata.
- Single migration 0024_groups.sql; columns added are nullable
  (metadata-only in PG16), indexes built concurrently.
- Rollout in four steps with feature-flag option for behavioural
  rollback.
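
To make the `req.can('capability.key')` idea concrete, here is a hedged sketch of how effective permissions might be resolved: role defaults from the catalogue, plus anything granted through the user's groups. Everything here is an assumption except the `default_for_role` concept and the `chats.compare` key (which a later commit seeds): the role names, the type shapes, `resolveCapabilities`, `makeCan`, and the `documents.read` key are all hypothetical, and the `admin_locked` flag is left out because the commit does not say how it behaves.

```typescript
type Role = "owner" | "admin" | "member"; // assumed role names

interface Capability {
  key: string;
  defaultForRole: Role[]; // roles that hold the capability by default
}

interface Principal {
  role: Role;
  groupIds: string[];
}

// Union of role defaults and group grants for one user.
function resolveCapabilities(
  user: Principal,
  catalogue: Capability[],
  groupPermissions: Record<string, string[]>, // group id -> capability keys
): Set<string> {
  const granted = new Set<string>();
  for (const cap of catalogue) {
    if (cap.defaultForRole.indexOf(user.role) !== -1) granted.add(cap.key);
  }
  for (const gid of user.groupIds) {
    for (const key of groupPermissions[gid] ?? []) granted.add(key);
  }
  return granted;
}

// Builds the per-request checker; a real backend would attach it as req.can.
function makeCan(granted: Set<string>): (key: string) => boolean {
  return (key) => granted.has(key);
}
```

Resolving the set once per request and closing over it keeps each `can()` call a constant-time lookup, which is relevant to the performance question the commit parks for review.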

Open questions parked for review: owner-locked capabilities, cross-
workspace group sharing, residual shared_with JSONB cleanup, perf
of req.can() in chat tool dispatch.

876049f9 | docs: add Phase 12 (Multi-model side-by-side) design | Claude | 2026-05-16
Commit body:
Adds a compare mode that lets an authorised user run the same prompt
against 2 or 3 models in parallel.

Data model:

- chats gets mode ('standard'|'compare') and compare_models text[].
- chat_messages gets turn_index, branch_index, model_id, and three
  cost-related columns (input_tokens, output_tokens, cost_cents).
- Per-message cost capture is on for every chat from this phase,
  including standard chats. Cheap side-benefit feeds a new admin
  Cost summary tab.
- New model_prices reference table, admin-editable.
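
The per-message cost columns suggest a calculation along these lines. This is a sketch under stated assumptions: the commit names `input_tokens`, `output_tokens`, `cost_cents`, and a `model_prices` table, but not its columns, so the `ModelPrice` fields and the per-million-token unit are hypothetical, as is the function name `costCents`. Rounding for storage is left open because the commit does not give the column type.

```typescript
// Hypothetical row shape for model_prices; units are an assumption.
interface ModelPrice {
  inputCentsPerMTok: number;  // cents per 1,000,000 input tokens
  outputCentsPerMTok: number; // cents per 1,000,000 output tokens
}

// Cost of one assistant message, in (possibly fractional) cents.
function costCents(
  inputTokens: number,
  outputTokens: number,
  price: ModelPrice,
): number {
  return (
    (inputTokens * price.inputCentsPerMTok +
      outputTokens * price.outputCentsPerMTok) /
    1e6
  );
}
```

In compare mode the same prompt is billed once per branch, so a turn's total is the sum of `costCents` over its two or three assistant rows, which is the cost amplification the composer note warns about.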

Backend:

- Compare-aware POST /api/chats/:id/messages fans out N provider
  calls with Promise.all and multiplexes SSE chunks tagged with
  branch_index.
- Per-branch regenerate endpoint replaces one assistant row in
  place.
- Fork-to-standard endpoint clones a compare chat's history with
  one branch's responses into a new standard chat.
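
The fan-out described above can be sketched as follows. It assumes a hypothetical `StreamFn` that yields text chunks for one model, and an `emit` callback standing in for the SSE writer; `BranchEvent` and `fanOut` are illustrative names, not the fork's API. Errors are caught per branch because a bare `Promise.all` rejects as soon as any one promise rejects, which would conflict with the per-column error state described in the frontend section.

```typescript
interface BranchEvent {
  branch_index: number;
  model_id: string;
  kind: "chunk" | "error" | "done";
  text?: string;
}

// Hypothetical provider call: yields text chunks for one model.
type StreamFn = (modelId: string, prompt: string) => AsyncIterable<string>;

// Runs one provider stream per model in parallel and tags every
// event with the branch it belongs to.
async function fanOut(
  models: string[],
  prompt: string,
  stream: StreamFn,
  emit: (e: BranchEvent) => void,
): Promise<void> {
  await Promise.all(
    models.map(async (model_id, branch_index) => {
      try {
        for await (const text of stream(model_id, prompt)) {
          emit({ branch_index, model_id, kind: "chunk", text });
        }
        emit({ branch_index, model_id, kind: "done" });
      } catch (err) {
        // Caught here so one failing model does not reject Promise.all
        // and cut off the other branches.
        emit({ branch_index, model_id, kind: "error", text: String(err) });
      }
    }),
  );
}
```

Chunks from different branches interleave in arrival order, so the client has to route on `branch_index` rather than on position in the stream.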

Frontend:

- Mode toggle on new-chat composer. Multi-select model picker
  showing availability per model. Cost amplification note.
- Chat view renders user turns full-width and assistant turns as
  N columns side by side on desktop, tabs on mobile.
- Per-column streaming with regen and keep-this actions, error
  state with retry.

Policy and gating:

- org_settings.allow_compare_mode (default off),
  compare_mode_max_models (cap 3, must be 2 or 3),
  compare_mode_admin_only (default on).
- New chats.compare capability seeded into the Phase 11
  permissions catalogue with owner+admin default.
- allow_external_models and EXTERNAL_AI_DISABLED both gate which
  models appear in the picker. Disallowed models are not
  selectable.

Tools in compare mode are disabled in Phase 12. Memory injection
still happens (read-only path); add_memory cannot fire because
tools are off.

Single migration 0025_compare_mode.sql with one-shot backfill of
turn_index for existing chats and seeded model_prices entries.

Five open questions parked for review: per-branch regen during
streaming, RAG cost attribution (deferred to Phase 13), locked
model set, per-user concurrency cap, mobile UX.

56917d4b | docs: add Phase 13 (Vector RAG with pgvector) design | Claude | 2026-05-16
Commit body:
The foundational retrieval upgrade. Replaces the current LLM-driven
in-context document scan with chunked semantic retrieval over
pgvector with hybrid search.

Data model:

- pgvector extension on Postgres 16 (image swap to
  pgvector/pgvector:pg16).
- document_chunks with structure-aware metadata: page_number for
  PDF, heading_path text[] for DOCX, char_start/char_end for
  highlight overlays, and a stored tsvector generated column for
  the full-text arm.
- document_embeddings keyed one-to-one with chunks, single active
  embedding model at a time, vector(1024) column type pinned at
  migration time. Dimension swap via templated paired migration
  plus full reindex.
- rag_ingest_jobs as a simple Postgres-backed queue, consumed by
  an in-process worker (single backend replica assumption).
- Eight new org_settings columns covering provider, model, chunk
  shape, top-K per arm, and final top-N.

Embedding model:

- bge-m3 via Ollama as the default (1024 dim, multilingual,
  CPU-capable). nomic-embed-text as a lighter alternative.
- HF TEI as a dedicated-service option. OpenAI text-embedding-3
  family as opt-in, gated by EXTERNAL_AI_DISABLED.

Chunking:

- Token-based (~512 tokens, 64 overlap) with paragraph/sentence/
  word boundary preference and heading-forced boundaries for DOCX.
- cl100k_base tokenizer for reproducibility across models.
- Per-chunk metadata: document, version, index, page, heading
  path, character offsets.
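
The windowing arithmetic behind "~512 tokens, 64 overlap" can be sketched as below. This covers only the fixed-size sliding window over token offsets; the paragraph, sentence, and heading boundary preferences from the commit are deliberately left out, and tokenisation itself (cl100k_base) is assumed to have happened already. `TokenChunk` and `chunkTokenWindows` are hypothetical names.

```typescript
interface TokenChunk {
  index: number;
  start: number; // inclusive token offset
  end: number;   // exclusive token offset
}

// Fixed-size windows that each re-read the last `overlap` tokens
// of the previous window.
function chunkTokenWindows(
  tokenCount: number,
  size = 512,
  overlap = 64,
): TokenChunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap; // 448 with the defaults
  const chunks: TokenChunk[] = [];
  for (let start = 0; start < tokenCount; start += step) {
    const end = Math.min(start + size, tokenCount);
    chunks.push({ index: chunks.length, start, end });
    if (end === tokenCount) break; // last window reached the end
  }
  return chunks;
}
```

With the defaults, a 1,000-token document yields three chunks covering offsets 0-512, 448-960, and 896-1000.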

Retrieval:

- Hybrid: vector cosine via HNSW (m=16, ef_construction=64) plus
  ts_rank_cd full-text, merged with Reciprocal Rank Fusion (k=60).
- ACL filter computed up front from Phase 11 effective
  permissions; RLS as defence-in-depth.
- search_documents tool refactored to call the new retriever;
  same external shape so the model side is unchanged. Tool output
  includes chunk_id, document_name, page, heading_path, excerpt,
  score.
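
Reciprocal Rank Fusion, the merge step named above, scores each chunk as the sum of 1 / (k + rank) over the lists it appears in, with rank starting at 1. A minimal sketch with k = 60, assuming each arm returns an ordered list of chunk IDs with no duplicates; `reciprocalRankFusion` is an illustrative name, and the real retriever would carry more than IDs.

```typescript
// Merge several ranked lists of chunk IDs into one list ordered by
// fused score, highest first.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // i is 0-based, rank is 1-based: the top hit adds 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because only ranks are used, the cosine-distance arm and the `ts_rank_cd` arm never need their raw scores normalised against each other, which is the usual reason for choosing RRF in hybrid search.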

Worker:

- One in-process worker, SELECT FOR UPDATE SKIP LOCKED on
  rag_ingest_jobs, retry cap of 3, batched embedding calls
  (default batch 64).
- Hooks: document upload, version creation, admin "reindex"
  endpoint.
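
A claim step built on `SELECT ... FOR UPDATE SKIP LOCKED` might look like the sketch below. The table name `rag_ingest_jobs` and the retry cap of 3 come from the commit; the column names (`status`, `attempts`, `document_id`), the status values, the `Queryable` interface, and `claimNextJob` are all assumptions for illustration. A real backend would pass its own Postgres client in place of `Queryable`.

```typescript
// Hypothetical minimal database interface.
interface Queryable {
  query(
    sql: string,
    params: unknown[],
  ): Promise<{ rows: Array<Record<string, unknown>> }>;
}

const MAX_ATTEMPTS = 3; // retry cap from the design

// Column names and status values are assumptions. The inner SELECT
// locks one pending row and skips rows locked by another transaction;
// the outer UPDATE marks it as running in the same statement.
const CLAIM_JOB_SQL = `
  UPDATE rag_ingest_jobs
     SET status = 'running', attempts = attempts + 1
   WHERE id = (
     SELECT id FROM rag_ingest_jobs
      WHERE status = 'pending' AND attempts < $1
      ORDER BY id
      LIMIT 1
      FOR UPDATE SKIP LOCKED
   )
  RETURNING id, document_id, attempts`;

// Returns the claimed job row, or null when the queue is empty.
async function claimNextJob(
  db: Queryable,
): Promise<Record<string, unknown> | null> {
  const result = await db.query(CLAIM_JOB_SQL, [MAX_ATTEMPTS]);
  return result.rows[0] ?? null;
}
```

With the single-replica assumption the lock is rarely contended, but claiming this way means a second worker could be added later without two workers picking up the same job.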

Frontend:

- Composer scope chip ("Searching: project X (47 documents)")
  with a scope-edit modal.
- search_documents tool-call card renders the hit list with
  links that jump to document viewer with the chunk
  highlighted.
- cite-button hover preview of the chunk excerpt.
- Admin AI Policy gets a Retrieval section with provider/model
  selection, chunk/top-K knobs, queue and index stats, and
  guarded "Reindex" / "Clear and reindex" actions.

Rollout in six steps gated by a new rag_enabled org switch;
rollback at any step flips the switch back to keep the legacy
in-context tool. Compose image swap to pgvector/pgvector:pg16
documented in the operator deployment guide.

Risks captured for: dimension mismatch, HNSW build time, worker
stalls on malformed docs, chunk explosion on multi-thousand-page
PDFs, permission bypass through retrieval, stale chunks after
version change, OpenAI leak under EXTERNAL_AI_DISABLED, postgres
image change for operators.

Open questions parked: tokenizer choice, heading-aware tuning,
RRF weighting tuning UI, per-document index opt-out, cross-encoder
reranker, workspace-wide retrieval (deferred to Phase 14).

5f2e7203 | docs: add Phase 14 (Knowledge collections) design | Claude | 2026-05-16
Commit body:
The final phase. UX layer over Phase 13 retrieval. Lets users group
documents into named collections that can be scoped at chat time
via a #-mention autocomplete, and bound as the source set for
tabular reviews and workflows.

Data model:

- collections (workspace-scoped, with visibility = private |
  workspace | shared), collection_documents, collection_members
  (dual-principal pattern from Phase 11, used as a LISTING ACL only).
- chats.default_scope_kind + default_scope_id for per-chat default.
- tabular_reviews.collection_id (nullable, on delete set null).
- workflows.default_scope_kind + default_scope_id.

ACL model (Model C - hybrid intersection):

- Visibility controls who can SEE the collection.
- effectiveDocumentSet intersects collection contents with the
  caller's accessible-document set.
- Adding a document to a collection NEVER widens access. A
  collection containing docs the caller cannot read surfaces a
  count ("X of Y visible to you") without naming inaccessible
  documents.
- collection_members is listing-only; cannot widen document ACL.
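
The intersection rule can be illustrated in a few lines. `effectiveDocumentSet` is the name the commit uses, but its signature here is an assumption, as are the `EffectiveSet` shape and the `visibilityLabel` helper; the accessible set is taken as given (in the design it comes from Phase 11 effective permissions).

```typescript
interface EffectiveSet {
  documentIds: string[]; // only documents the caller can already read
  visibleCount: number;
  totalCount: number;
}

// Intersect collection contents with what the caller can access.
// Membership in the collection never adds a document to the result.
function effectiveDocumentSet(
  collectionDocIds: string[],
  accessibleDocIds: Set<string>,
): EffectiveSet {
  const documentIds = collectionDocIds.filter((id) => accessibleDocIds.has(id));
  return {
    documentIds,
    visibleCount: documentIds.length,
    totalCount: collectionDocIds.length,
  };
}

// Count-only summary: inaccessible documents are never named.
function visibilityLabel(s: EffectiveSet): string {
  return `${s.visibleCount} of ${s.totalCount} visible to you`;
}
```

Because the result is always a subset of the caller's accessible set, sharing a collection can change what a user sees listed but not what they can read. The total count is the one piece of information that does leak, which the commit lists among its captured risks.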

System collections:

- One per project, system_kind = 'project_all', auto-maintained
  by triggers on documents and projects.
- Visible workspace-wide; effective contents per user remain
  intersected with their accessible set.
- Migration backfills system collections for every existing
  project; duplicate project names emit a warning.

Composer UX:

- #-autocomplete dropdown grouped by Collections, Projects,
  Documents; capped at 20 results from /scope-search.
- Selected scope renders as a chip with kind icon; multiple chips
  union; submitting records the scope for the assistant turn.
- Per-chat default scope chip above the composer, editable.

Tabular reviews and workflows:

- Tabular review create modal gains a Source toggle (project or
  collection).
- Workflow run modal gains a scope picker honouring the
  workflow's default scope.
- Workflow runner already takes document_ids; Phase 14 adds a
  thin scope-resolution wrapper that calls effectiveDocumentSet.

Risks captured for visibility leakage via counts, name collisions,
performance, autocomplete latency, orphan default scopes, expected
behaviour of tabular reviews when collection contents change after
review creation, and explicit assertion that shared-collection
membership does NOT widen document ACL.

Open questions parked: cross-workspace collections, bulk add via
CSV vs picker, system-collection rename on project rename
(recommended yes), synthetic "All documents in this workspace"
entry in scope picker.

Single migration 0027_collections.sql with trigger-managed system
collections and backfill.
