docs: add Phase 14 (Knowledge collections) design

Final phase: a UX layer over Phase 13 retrieval. Users can group
documents into named collections, scope a chat to them via
#-mention autocomplete, and bind them as the source set for
tabular reviews and workflows.

Data model:
- collections (workspace-scoped, with visibility = private |
workspace | shared), collection_documents, collection_members
(dual-principal pattern from Phase 11, used as a LISTING ACL only).
- chats.default_scope_kind + default_scope_id for the per-chat
default scope.
- tabular_reviews.collection_id (nullable, on delete set null).
- workflows.default_scope_kind + default_scope_id.
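
To make the "listing ACL only" role of collection_members concrete, here is a minimal sketch of a listing check. Everything in it is an assumption for illustration: the `ownerId` field, the reading of "dual-principal" as user-or-group members, the owner-always-sees rule, and the `canListCollection` name do not appear in the commit. It decides only whether a collection shows up for a caller and says nothing about document access.

```typescript
// Hypothetical shapes; column names beyond those in the commit are assumed.
type Visibility = "private" | "workspace" | "shared";
type Principal = { kind: "user" | "group"; id: string };

interface Collection {
  id: string;
  workspaceId: string;
  ownerId: string; // assumed: the commit does not name an owner column
  visibility: Visibility;
}

interface Caller {
  userId: string;
  workspaceId: string;
  groupIds: string[];
}

// Listing check only: returning true never implies access to any document.
function canListCollection(
  c: Collection,
  members: readonly Principal[],
  caller: Caller,
): boolean {
  if (c.workspaceId !== caller.workspaceId) return false;
  if (c.ownerId === caller.userId) return true;
  switch (c.visibility) {
    case "private":
      return false;
    case "workspace":
      return true;
    case "shared":
      // Dual-principal match: a member row names either a user or a group.
      return members.some((m) =>
        m.kind === "user"
          ? m.id === caller.userId
          : caller.groupIds.some((g) => g === m.id),
      );
  }
}
```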

ACL model (Model C - hybrid intersection):
- Visibility controls who can SEE the collection.
- effectiveDocumentSet intersects collection contents with the
caller's accessible-document set.
- Adding a document to a collection NEVER widens access. A
collection containing docs the caller cannot read surfaces a
count ("X of Y visible to you") without naming inaccessible
documents.
- collection_members is listing-only; cannot widen document ACL.
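
The intersection rule above can be sketched as follows. The signature and return shape are assumptions; the commit names effectiveDocumentSet but does not show its parameters. The sketch returns only readable ids plus a count, so inaccessible documents are never named.

```typescript
// Hypothetical signature; only the function name comes from the commit.
type DocumentId = string;

interface EffectiveSet {
  visible: DocumentId[]; // collection documents the caller can read
  total: number; // collection size, including documents the caller cannot read
  label: string; // count string that names no inaccessible document
}

function effectiveDocumentSet(
  collectionDocs: readonly DocumentId[],
  accessible: ReadonlySet<DocumentId>,
): EffectiveSet {
  // Intersection: being in a collection never grants access on its own.
  const visible = collectionDocs.filter((id) => accessible.has(id));
  const total = collectionDocs.length;
  return {
    visible,
    total,
    label: `${visible.length} of ${total} visible to you`,
  };
}

const example = effectiveDocumentSet(
  ["d1", "d2", "d3"],
  new Set(["d1", "d3", "d9"]),
);
console.log(example.label); // prints "2 of 3 visible to you"
```

Note that "d9" is readable by the caller but not in the collection, and "d2" is in the collection but not readable; neither appears in `visible`.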

System collections:
- One per project, system_kind = 'project_all', auto-maintained
by triggers on documents and projects.
- Visible workspace-wide; effective contents per user remain
intersected with their accessible set.
- Migration backfills system collections for every existing
project; duplicate project names emit a warning.
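
The backfill behaviour can be illustrated in application code, although the commit implements it in SQL inside the migration. This sketch assumes that a system collection takes its project's name and that duplicates are detected per workspace; neither detail is stated in the commit, and `backfillSystemCollections` is a hypothetical name.

```typescript
interface Project {
  id: string;
  workspaceId: string;
  name: string;
}

interface SystemCollection {
  projectId: string;
  workspaceId: string;
  name: string; // assumed to mirror the project name
  systemKind: "project_all";
}

function backfillSystemCollections(projects: readonly Project[]): {
  collections: SystemCollection[];
  warnings: string[];
} {
  const firstSeen = new Map<string, string>(); // "workspace/name" -> project id
  const collections: SystemCollection[] = [];
  const warnings: string[] = [];
  for (const p of projects) {
    const key = `${p.workspaceId}/${p.name}`;
    const earlier = firstSeen.get(key);
    if (earlier !== undefined) {
      // Warn but still create the collection: every project gets one.
      warnings.push(`duplicate project name "${p.name}": ${earlier}, ${p.id}`);
    } else {
      firstSeen.set(key, p.id);
    }
    collections.push({
      projectId: p.id,
      workspaceId: p.workspaceId,
      name: p.name,
      systemKind: "project_all",
    });
  }
  return { collections, warnings };
}
```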

Composer UX:
- #-autocomplete dropdown grouped by Collections, Projects,
Documents; capped at 20 results from /scope-search.
- Selected scope renders as a chip with kind icon; multiple chips
union; submitting records the scope for the assistant turn.
- Per-chat default scope chip above the composer, editable.
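
A rough sketch of the two composer behaviours above: grouping /scope-search results for the dropdown, and taking the union of several selected chips. The commit does not say whether the cap of 20 applies overall or per group, so this assumes an overall cap; the type names and the `resolve` callback are hypothetical.

```typescript
type ScopeKind = "collection" | "project" | "document";

interface ScopeResult {
  kind: ScopeKind;
  id: string;
  name: string;
}

interface ScopeChip {
  kind: ScopeKind;
  id: string;
}

const MAX_RESULTS = 20; // assumed to be an overall cap, not per group

// Buckets results into the three dropdown groups after applying the cap.
function groupResults(
  results: readonly ScopeResult[],
): Record<ScopeKind, ScopeResult[]> {
  const groups: Record<ScopeKind, ScopeResult[]> = {
    collection: [],
    project: [],
    document: [],
  };
  for (const r of results.slice(0, MAX_RESULTS)) groups[r.kind].push(r);
  return groups;
}

// Multiple chips union: a document reachable through two chips appears once.
function unionScopes(
  chips: readonly ScopeChip[],
  resolve: (chip: ScopeChip) => readonly string[],
): string[] {
  const ids = new Set<string>();
  for (const chip of chips) {
    for (const id of resolve(chip)) ids.add(id);
  }
  return Array.from(ids);
}
```

In practice `resolve` would return each chip's effective (access-intersected) document ids, so the union cannot widen access either.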

Tabular reviews and workflows:
- Tabular review create modal gains a Source toggle (project or
collection).
- Workflow run modal gains a scope picker honouring the
workflow's default scope.
- Workflow runner already takes document_ids; Phase 14 adds a
thin scope-resolution wrapper that calls effectiveDocumentSet.
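
The thin wrapper described above might look like this. It is a sketch under stated assumptions: the `Deps` interface, both method signatures, and the synchronous style are invented for illustration, and only the effectiveDocumentSet name and the document_ids input come from the commit. The scope chosen in the run modal wins, and the workflow's default scope is the fallback.

```typescript
interface Scope {
  kind: "collection" | "project";
  id: string;
}

// Hypothetical dependencies, injected so the wrapper stays thin.
interface Deps {
  effectiveDocumentSet(userId: string, scope: Scope): string[];
  runWorkflow(workflowId: string, documentIds: string[]): void;
}

function runWorkflowWithScope(
  deps: Deps,
  userId: string,
  workflowId: string,
  picked: Scope | null,
  workflowDefault: Scope | null,
): string[] {
  // An explicit pick in the modal overrides the workflow's default scope.
  const scope = picked ?? workflowDefault;
  if (scope === null) {
    throw new Error("no scope selected and the workflow has no default scope");
  }
  // Resolve to document ids the caller can actually read, then delegate.
  const documentIds = deps.effectiveDocumentSet(userId, scope);
  deps.runWorkflow(workflowId, documentIds);
  return documentIds;
}
```

Keeping the runner's document_ids interface unchanged means access filtering happens in one place, before the runner sees any ids.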

Risks captured: visibility leakage via counts, name collisions,
performance, autocomplete latency, orphaned default scopes, and
the expected behaviour of tabular reviews when collection contents
change after review creation. The doc also asserts explicitly that
shared-collection membership does NOT widen document ACL.

Open questions parked: cross-workspace collections, bulk add via
CSV vs picker, system-collection rename on project rename
(recommended: yes), and a synthetic "All documents in this
workspace" entry in the scope picker.

Single migration 0027_collections.sql with trigger-managed system
collections and backfill.

| Repository | cpatpa/PIP |
|---|---|
| Author | Claude <noreply@anthropic.com> |
| Authored | |
| Parents | 56917d4b |
| Stats | 1 file changed, +651 |
| Part of | Phases 10-14 - design docs for web search, groups, multi-model, vector RAG, knowledge collections |