fb928a96 | docs: add Phase 10 (Web search) design | Claude | 2026-05-16 |
commit body Covers two LLM tools (web_search, fetch_url) with SearXNG as the
recommended self-hosted default and Brave Search as an external
alternative. Provider abstraction lives in
backend/src/lib/websearch/.
Compliance posture:
- allow_web_search master switch defaults off, with per-workspace
override (narrow-only on blocklist).
- Domain allowlist and blocklist at both org and workspace levels.
- Audit event for every search and every fetch. Searches record the
query text; fetches record the URL only (the body lives in the fetch
cache for its TTL).
- Banner in chat when web search is active, alongside the existing
external-AI banner.
SSRF defence:
- Scheme/port allowlist (http/https on 80/443 only).
- DNS resolution check that blocks RFC1918, loopback, link-local,
ULA, and cloud metadata addresses; re-checked on each redirect.
- Content-type allowlist (text/html, text/plain, application/pdf,
application/json).
- Hard caps on bytes, chars, time, and redirect count.
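
The DNS resolution check above could be sketched roughly as follows. This is a minimal illustration, not the design's actual code: the names parseIPv4 and isBlockedAddress are hypothetical, and the sketch assumes the caller has already resolved the hostname and runs the check again for every redirect target.

```typescript
// Hypothetical helpers; names and shape are illustrative only.

// Parse a dotted-quad IPv4 string into four octets, or return null.
function parseIPv4(ip: string): [number, number, number, number] | null {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!m) return null;
  const octets: [number, number, number, number] = [
    Number(m[1]), Number(m[2]), Number(m[3]), Number(m[4]),
  ];
  return octets.every((o) => o <= 255) ? octets : null;
}

// Return true when a resolved address must not be fetched.
function isBlockedAddress(ip: string): boolean {
  const v4 = parseIPv4(ip);
  if (v4) {
    const [a, b] = v4;
    if (a === 0) return true;                          // 0.0.0.0/8 "this network"
    if (a === 10) return true;                         // RFC1918 10.0.0.0/8
    if (a === 172 && b >= 16 && b <= 31) return true;  // RFC1918 172.16.0.0/12
    if (a === 192 && b === 168) return true;           // RFC1918 192.168.0.0/16
    if (a === 127) return true;                        // loopback 127.0.0.0/8
    if (a === 169 && b === 254) return true;           // link-local, covers 169.254.169.254
    return false;
  }
  const v6 = ip.toLowerCase();
  if (v6 === "::" || v6 === "::1") return true;        // unspecified and loopback
  if (v6.startsWith("::ffff:")) {
    // IPv4-mapped: re-check the embedded address, fail closed if it is not dotted-quad.
    const inner = v6.slice(7);
    return parseIPv4(inner) ? isBlockedAddress(inner) : true;
  }
  if (/^f[cd][0-9a-f]{2}:/.test(v6)) return true;      // ULA fc00::/7
  if (/^fe[89ab][0-9a-f]:/.test(v6)) return true;      // link-local fe80::/10
  return false;
}
```

Checking the resolved address rather than the hostname is what lets the same function be reused on each redirect hop.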
Two new Postgres cache tables, ten new org_settings columns, three
new workspaces columns, one chats column for the per-chat toggle.
Single migration 0023_web_search.sql.
Includes rollout plan in five steps, risk matrix, four open
questions parked for review, and full acceptance criteria.
|
f96b450a | docs: update Phase 10 defaults, add Phase 11 (Groups) design | Claude | 2026-05-16 |
commit body Phase 10 changes:
- Default provider switched from SearXNG to Brave Search per
decision. SearXNG remains as an alternative for operators who
prefer self-hosted.
- Blocklist is now the primary filtering mechanism with a seeded
starter list (paste sites, *.onion, unmoderated forums). Allowlist
demoted to "advanced" use for tightly scoped configurations.
- Migration default for web_search_provider flipped to 'brave'.
- Compose changes are now SearXNG-only and opt-in via profile.
- Open question reframed to Brave plan choice (commercial use).
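
A blocklist-first filter of the kind described above might look like the sketch below. The pattern syntax (exact hostname or a leading "*." wildcard) is inferred from the "*.onion" seed entry, and the rule that workspace entries only add to the org list is an assumption; matchesPattern and isBlockedHost are hypothetical names.

```typescript
// Hypothetical matcher: a pattern is either an exact hostname or "*.suffix".
function matchesPattern(host: string, pattern: string): boolean {
  const h = host.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.startsWith("*.")) {
    // "*.onion" becomes ".onion", so "abc.onion" matches but "onion" does not.
    return h.endsWith(p.slice(1));
  }
  return h === p;
}

// Assumption: the workspace list is merged into the org list and can never
// remove an org-level entry.
function isBlockedHost(
  host: string,
  orgBlocklist: string[],
  workspaceBlocklist: string[],
): boolean {
  return orgBlocklist
    .concat(workspaceBlocklist)
    .some((pattern) => matchesPattern(host, pattern));
}
```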
Phase 11 (Groups and granular permissions) added:
- Four new tables: groups, group_members, permissions (catalogue),
group_permissions.
- project_members and review_members extended with nullable group_id
alongside user_id, enforced by check constraint.
- Two auto-managed system groups per workspace (All members,
Admins) maintained by triggers on workspace_members.
- Capability set seeded by migration with default_for_role mapping
and admin_locked flags.
- Effective permissions resolved per request via lib/access.ts and
exposed as req.can('capability.key').
- Sharing modal accepts users or groups via a unified principal
search endpoint.
- Audit events for group lifecycle and share/unshare with principal
kind metadata.
- Single migration 0024_groups.sql; columns added are nullable
(metadata-only in PG16), indexes built concurrently.
- Rollout in four steps with feature-flag option for behavioural
rollback.
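
Per-request permission resolution could be sketched as below. This is illustrative only: the Role values beyond owner and admin, the documents.read key used in the usage note, and the function names are assumptions, and the sketch leaves out admin_locked handling because its exact semantics are not spelled out here.

```typescript
// Hypothetical types; the real catalogue lives in the permissions table.
type Role = "owner" | "admin" | "member";

interface Capability {
  key: string;
  defaultForRoles: Role[]; // stands in for the default_for_role mapping
}

// Union of role defaults and capabilities granted through group membership.
function effectiveCapabilities(
  role: Role,
  groupIds: string[],
  catalogue: Capability[],
  groupGrants: Map<string, string[]>,
): Set<string> {
  const result = new Set<string>();
  for (const cap of catalogue) {
    if (cap.defaultForRoles.includes(role)) result.add(cap.key);
  }
  for (const groupId of groupIds) {
    for (const key of groupGrants.get(groupId) ?? []) result.add(key);
  }
  return result;
}

// Build the predicate that would be attached to the request as req.can.
function makeCan(effective: Set<string>): (key: string) => boolean {
  return (key) => effective.has(key);
}
```

Resolving the set once per request and closing over it keeps each can('capability.key') call to a single set lookup.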
Open questions parked for review: owner-locked capabilities, cross-
workspace group sharing, residual shared_with JSONB cleanup, perf
of req.can() in chat tool dispatch.
|
876049f9 | docs: add Phase 12 (Multi-model side-by-side) design | Claude | 2026-05-16 |
commit body Adds a compare mode that lets an authorised user run the same prompt
against 2 or 3 models in parallel.
Data model:
- chats gets mode ('standard'|'compare') and compare_models text[].
- chat_messages gets turn_index, branch_index, model_id, and three
cost-related columns (input_tokens, output_tokens, cost_cents).
- Per-message cost capture is on for every chat from this phase,
including standard chats. As a cheap side benefit, it feeds a new
admin Cost summary tab.
- New model_prices reference table, admin-editable.
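
The per-message cost capture above reduces to simple arithmetic over the token counts. The sketch below assumes prices are stored in cents per million tokens; that unit, the ModelPrice field names, and the costCents helper are all hypothetical.

```typescript
// Hypothetical shape of one model_prices row (unit is an assumption).
interface ModelPrice {
  inputCentsPerMTok: number;  // cents per 1,000,000 input tokens
  outputCentsPerMTok: number; // cents per 1,000,000 output tokens
}

// Compute the value that would be stored in chat_messages.cost_cents.
function costCents(
  price: ModelPrice,
  inputTokens: number,
  outputTokens: number,
): number {
  const total =
    inputTokens * price.inputCentsPerMTok +
    outputTokens * price.outputCentsPerMTok;
  return total / 1e6;
}
```

For example, at 300 and 1500 cents per million tokens, a message with 1000 input and 500 output tokens costs (300000 + 750000) / 1000000 = 1.05 cents.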
Backend:
- Compare-aware POST /api/chats/:id/messages fans out N provider
calls with Promise.all and multiplexes SSE chunks tagged with
branch_index.
- Per-branch regenerate endpoint replaces one assistant row in
place.
- Fork-to-standard endpoint clones a compare chat's history with
one branch's responses into a new standard chat.
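
The fan-out described above might be shaped like the sketch below. It is a simplified illustration: ModelStream stands in for a provider call, emit stands in for writing one SSE event, and both are hypothetical names rather than the actual backend interfaces.

```typescript
// One multiplexed chunk, tagged so the client can route it to a column.
interface BranchChunk {
  branch_index: number;
  delta: string;
}

// Hypothetical provider call: yields text deltas for a prompt.
type ModelStream = (prompt: string) => AsyncIterable<string>;

// Run every model in parallel and interleave their chunks through emit.
// Resolves with the full text of each branch, in branch order.
async function fanOut(
  prompt: string,
  models: ModelStream[],
  emit: (chunk: BranchChunk) => void,
): Promise<string[]> {
  return Promise.all(
    models.map(async (stream, branch_index) => {
      let full = "";
      for await (const delta of stream(prompt)) {
        emit({ branch_index, delta });
        full += delta;
      }
      return full;
    }),
  );
}
```

Note that Promise.all rejects as soon as one branch fails. A version that supports the per-column error state would likely catch errors inside each branch, or use Promise.allSettled, so that one failing model does not abort the others.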
Frontend:
- Mode toggle on new-chat composer. Multi-select model picker
showing availability per model. Cost amplification note.
- Chat view renders user turns full-width and assistant turns as
N columns side by side on desktop, tabs on mobile.
- Per-column streaming with regen and keep-this actions, error
state with retry.
Policy and gating:
- org_settings.allow_compare_mode (default off),
compare_mode_max_models (cap 3, must be 2 or 3),
compare_mode_admin_only (default on).
- New chats.compare capability seeded into the Phase 11
permissions catalogue with owner+admin default.
- allow_external_models and EXTERNAL_AI_DISABLED both gate which
models appear in the picker. Disallowed models are not
selectable.
Tools in compare mode are disabled in Phase 12. Memory injection
still happens (read-only path); add_memory cannot fire because
tools are off.
Single migration 0025_compare_mode.sql with one-shot backfill of
turn_index for existing chats and seeded model_prices entries.
Five open questions parked for review: per-branch regen during
streaming, RAG cost attribution (deferred to Phase 13), locked
model set, per-user concurrency cap, mobile UX.
|
56917d4b | docs: add Phase 13 (Vector RAG with pgvector) design | Claude | 2026-05-16 |
commit body The foundational retrieval upgrade. Replaces the current LLM-driven
in-context document scan with chunked semantic retrieval over
pgvector with hybrid search.
Data model:
- pgvector extension on Postgres 16 (image swap to
pgvector/pgvector:pg16).
- document_chunks with structure-aware metadata: page_number for
PDF, heading_path text[] for DOCX, char_start/char_end for
highlight overlays, and a stored tsvector generated column for
the full-text arm.
- document_embeddings keyed one-to-one with chunks, single active
embedding model at a time, vector(1024) column type pinned at
migration time. Dimension swap via templated paired migration
plus full reindex.
- rag_ingest_jobs as a simple Postgres-backed queue, consumed by
an in-process worker (single backend replica assumption).
- Eight new org_settings columns covering provider, model, chunk
shape, top-K per arm, and final top-N.
Embedding model:
- bge-m3 via Ollama as the default (1024 dim, multilingual,
CPU-capable). nomic-embed-text as a lighter alternative.
- HF TEI as a dedicated-service option. OpenAI text-embedding-3
family as opt-in, gated by EXTERNAL_AI_DISABLED.
Chunking:
- Token-based (~512 tokens, 64 overlap) with paragraph/sentence/
word boundary preference and heading-forced boundaries for DOCX.
- cl100k_base tokenizer for reproducibility across models.
- Per-chunk metadata: document, version, index, page, heading
path, character offsets.
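
The windowing part of the chunker can be sketched as below. It covers only the fixed-size window with overlap; the paragraph, sentence, and heading boundary preferences are deliberately left out, and windowTokens is a hypothetical name. Tokens are treated as an opaque array so the sketch does not depend on any tokenizer library.

```typescript
// Split a token sequence into windows of `size` tokens, each sharing
// `overlap` tokens with the previous window.
function windowTokens<T>(tokens: T[], size = 512, overlap = 64): T[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap;
  const chunks: T[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    // Stop once a window has reached the end, to avoid a trailing
    // chunk made up purely of overlap.
    if (start + size >= tokens.length) break;
  }
  return chunks;
}
```

With the defaults, a 1000-token document yields three chunks starting at offsets 0, 448, and 896.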
Retrieval:
- Hybrid: vector cosine via HNSW (m=16, ef_construction=64) plus
ts_rank_cd full-text, merged with Reciprocal Rank Fusion (k=60).
- ACL filter computed up front from Phase 11 effective
permissions; RLS as defence-in-depth.
- search_documents tool refactored to call the new retriever;
same external shape so the model side is unchanged. Tool output
includes chunk_id, document_name, page, heading_path, excerpt,
score.
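
Reciprocal Rank Fusion, as named above, scores each candidate by summing 1 / (k + rank) over every ranking it appears in, with ranks starting at 1. A minimal sketch follows; the function name and return shape are hypothetical, and the inputs are assumed to be chunk ids already ordered best-first by each arm.

```typescript
// Merge several best-first rankings of chunk ids into one fused ranking.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}
```

Because only ranks are used, the cosine distances and ts_rank_cd values never need to be normalised against each other.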
Worker:
- One in-process worker, SELECT FOR UPDATE SKIP LOCKED on
rag_ingest_jobs, retry cap of 3, batched embedding calls
(default batch 64).
- Hooks: document upload, version creation, admin "reindex"
endpoint.
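
The worker's claim step and batching could look roughly like the sketch below. The column names in the SQL (status, attempts, created_at, document_id) are assumptions about rag_ingest_jobs, and toBatches is a hypothetical helper; only the SKIP LOCKED pattern, the retry cap of 3, and the default batch size of 64 come from the design.

```typescript
// Hypothetical claim query. SKIP LOCKED makes concurrent claimers skip rows
// another transaction has already locked instead of waiting on them.
const CLAIM_JOB_SQL = `
  UPDATE rag_ingest_jobs
     SET status = 'running', attempts = attempts + 1
   WHERE id = (
     SELECT id
       FROM rag_ingest_jobs
      WHERE status = 'pending' AND attempts < 3
      ORDER BY created_at
      LIMIT 1
      FOR UPDATE SKIP LOCKED
   )
  RETURNING id, document_id`;

// Split chunk texts into batches for the embedding provider.
function toBatches<T>(items: T[], batchSize = 64): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Even with a single worker, claiming through SKIP LOCKED keeps the query safe if a second replica is ever started by accident.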
Frontend:
- Composer scope chip ("Searching: project X (47 documents)")
with a scope-edit modal.
- search_documents tool-call card renders the hit list with
links that jump to document viewer with the chunk
highlighted.
- cite-button hover preview of the chunk excerpt.
- Admin AI Policy gets a Retrieval section with provider/model
selection, chunk/top-K knobs, queue and index stats, and
guarded "Reindex" / "Clear and reindex" actions.
Rollout in six steps gated by a new rag_enabled org switch; rolling
back at any step means flipping the switch off, which restores the
legacy in-context tool. Compose image swap to pgvector/pgvector:pg16
documented in the operator deployment guide.
Risks captured for: dimension mismatch, HNSW build time, worker
stalls on malformed docs, chunk explosion on multi-thousand-page
PDFs, permission bypass through retrieval, stale chunks after
version change, OpenAI leak under EXTERNAL_AI_DISABLED, Postgres
image change for operators.
Open questions parked: tokenizer choice, heading-aware tuning,
RRF weighting tuning UI, per-document index opt-out, cross-encoder
reranker, workspace-wide retrieval (deferred to Phase 14).
|
5f2e7203 | docs: add Phase 14 (Knowledge collections) design | Claude | 2026-05-16 |
commit body The final phase. UX layer over Phase 13 retrieval. Lets users group
documents into named collections that can be scoped at chat time
via a #-mention autocomplete, and bound as the source set for
tabular reviews and workflows.
Data model:
- collections (workspace-scoped, with visibility = private |
workspace | shared), collection_documents, collection_members
(dual-principal pattern from Phase 11, used as a LISTING ACL only).
- chats.default_scope_kind + default_scope_id for per-chat default.
- tabular_reviews.collection_id (nullable, on delete set null).
- workflows.default_scope_kind + default_scope_id.
ACL model (Model C - hybrid intersection):
- Visibility controls who can SEE the collection.
- effectiveDocumentSet intersects collection contents with the
caller's accessible-document set.
- Adding a document to a collection NEVER widens access. A
collection containing docs the caller cannot read surfaces a
count ("X of Y visible to you") without naming inaccessible
documents.
- collection_members is listing-only; cannot widen document ACL.
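
The intersection rule above can be illustrated with a short sketch. The signature shown for effectiveDocumentSet and the EffectiveSet shape are assumptions made for the example; the design only names the function and its behaviour.

```typescript
// Hypothetical result shape: the readable ids plus the two counts needed
// for the "X of Y visible to you" label.
interface EffectiveSet {
  documentIds: string[];
  visibleCount: number;
  totalCount: number;
}

// Intersect a collection's contents with the caller's accessible documents.
// Ids the caller cannot read are dropped and never returned.
function effectiveDocumentSet(
  collectionDocumentIds: string[],
  accessibleDocumentIds: Set<string>,
): EffectiveSet {
  const documentIds = collectionDocumentIds.filter((id) =>
    accessibleDocumentIds.has(id),
  );
  return {
    documentIds,
    visibleCount: documentIds.length,
    totalCount: collectionDocumentIds.length,
  };
}
```

Because the result is always a subset of the caller's accessible set, adding a document to a collection cannot widen access; at most it changes the total count.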
System collections:
- One per project, system_kind = 'project_all', auto-maintained
by triggers on documents and projects.
- Visible workspace-wide; effective contents per user remain
intersected with their accessible set.
- Migration backfills system collections for every existing
project; duplicate project names emit a warning.
Composer UX:
- #-autocomplete dropdown grouped by Collections, Projects,
Documents; capped at 20 results from /scope-search.
- Selected scope renders as a chip with kind icon; multiple chips
union; submitting records the scope for the assistant turn.
- Per-chat default scope chip above the composer, editable.
Tabular reviews and workflows:
- Tabular review create modal gains a Source toggle (project or
collection).
- Workflow run modal gains a scope picker honouring the
workflow's default scope.
- Workflow runner already takes document_ids; Phase 14 adds a
thin scope-resolution wrapper that calls effectiveDocumentSet.
Risks captured for visibility leakage via counts, name collisions,
performance, autocomplete latency, orphan default scopes, and the
expected behaviour of tabular reviews when collection contents change
after review creation. The design also states explicitly that
shared-collection membership does NOT widen document ACL.
Open questions parked: cross-workspace collections, bulk add via
CSV vs picker, system-collection rename on project rename
(recommended yes), synthetic "All documents in this workspace"
entry in scope picker.
Single migration 0027_collections.sql with trigger-managed system
collections and backfill.
|