cpatpa ships per-user memory and finally wires the layered system prompt

Phase 9 ships two features: a `user_memories` table with model-driven capture, LRU eviction, and a pin cap enforced by a Postgres trigger; and the four-layer system-prompt assembly that the fork's architecture doc had always described but never implemented. The two are gated by flags with opposite defaults: the layered prompt is on by default, memory is off.

chat-ui knowledge-management

The most revealing commit in this batch is the design revision (84aafcd6), which landed before any code was written. A codebase audit found that chatTools.ts hardcodes SYSTEM_PROMPT as a string constant, that users.custom_instructions is stored but never injected into a prompt, and that workspaces.instructions is similarly inert. The four-layer assembly described in 01-architecture.md was aspirational. The design was rewritten to actually implement it, behind a use_layered_prompt org setting that defaults to true. The same audit pass corrected three smaller assumptions: auth state lives on res.locals.userId, not req.user; audit events are emitted inline, with no lib/audit.ts; and the admin page is /admin/policy, not /admin/ai-policy. These corrections fed back into the design before a line of code shipped.
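
The rollback gate can be pictured as a small wrapper. This is a minimal sketch, not the fork's code: `resolveSystemPrompt`, `PromptPolicy`, and the callback shape are hypothetical names. Only the `use_layered_prompt` flag and the fall-back-to-legacy behaviour on assembly failure come from the commit bodies.

```typescript
// Hypothetical policy slice; the real LlmPolicy carries more fields.
interface PromptPolicy {
  use_layered_prompt: boolean;
}

interface ResolvedPrompt {
  prompt: string;
  layered: boolean; // true when the layered builder produced the prompt
}

// Pick the layered prompt when the flag is on, otherwise the legacy constant.
// If the builder throws, fall back to the legacy prompt instead of failing the chat.
function resolveSystemPrompt(
  policy: PromptPolicy,
  legacyPrompt: string,
  buildLayered: () => string,
): ResolvedPrompt {
  if (!policy.use_layered_prompt) {
    return { prompt: legacyPrompt, layered: false };
  }
  try {
    return { prompt: buildLayered(), layered: true };
  } catch {
    return { prompt: legacyPrompt, layered: false };
  }
}
```

Keeping the legacy constant reachable through both the flag and the catch branch is what makes the switch usable as an emergency rollback.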

The backend (085a9d78) adds migration 0022_user_memories.sql: the user_memories table (cascade on users, SET NULL on chats, content length constrained to 1-2000 characters), the enforce_user_memory_pin_cap trigger that raises errcode 23514 when the pinned count would exceed 10, RLS enabled and forced, and four new org_settings columns (allow_memory default false, memory_max_per_user 1-500 default 50, memory_max_chars 1-2000 default 500, use_layered_prompt default true).

lib/memories.ts provides list, create, update, delete, and clear-all helpers. Mutations run inside a transaction that takes SELECT ... FOR UPDATE on the user's rows, so two concurrent add_memory calls cannot both pass the cap check. The pin cap trigger's errcode 23514 is caught and translated to a typed PinCapExceededError, which the route returns as 409. Memory content is rendered inside a fenced code block with PINNED: and OTHER: subheadings so hostile content (for example, lines starting with ## or ---) can't alter the prompt's markdown structure. stampLastUsed runs one batched UPDATE for all injected memory ids at the end of the request.

lib/promptAssembly.ts assembles the layers in order: org system prompt → workspace instructions → user custom instructions → memory block → OPERATIONAL_PROMPT, the technical conventions split out of the old constant. The existing systemPromptExtra still appends after the assembled prompt. The SYSTEM_PROMPT constant is preserved intact as the fallback path when use_layered_prompt is false.
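
The layer ordering and the fenced memory block can be sketched as follows. This is an illustration under stated assumptions, not the fork's `lib/promptAssembly.ts`: the `Memory` and `PromptLayers` shapes and the function names are hypothetical, and the policy gating (`allow_user_instructions`, `allow_memory`) is left to the caller. The source does not say how backticks inside memory content are handled; neutralising them here is an assumption, added because a fence only contains content that cannot close it.

```typescript
// Hypothetical shapes; the real signatures are not shown in the source.
interface Memory {
  id: string;
  content: string;
  pinned: boolean;
}

interface PromptLayers {
  orgSystemPrompt: string;
  workspaceInstructions?: string | null;
  userInstructions?: string | null;
  memories: Memory[];
  operationalPrompt: string;
}

// Built with repeat() so this listing never contains a literal fence.
const FENCE = "`".repeat(3);

// Assumption: replace backtick runs and collapse newlines so each memory
// stays on one line and cannot close the fence early.
function sanitizeMemory(content: string): string {
  return content.replace(/`{3,}/g, "'''").replace(/\s*[\r\n]+\s*/g, " ").trim();
}

// Render memories inside one fenced block with PINNED:/OTHER: subheadings.
function renderMemoryBlock(memories: Memory[]): string {
  if (memories.length === 0) return "";
  const line = (m: Memory) => `- ${sanitizeMemory(m.content)}`;
  const pinned = memories.filter((m) => m.pinned).map(line);
  const other = memories.filter((m) => !m.pinned).map(line);
  return [FENCE, "PINNED:", ...pinned, "OTHER:", ...other, FENCE].join("\n");
}

// Join the non-empty layers in order and report which memory ids were used,
// so the caller can stamp last_used_at afterwards.
function buildLayeredPrompt(layers: PromptLayers): { prompt: string; memoryIds: string[] } {
  const parts = [
    layers.orgSystemPrompt,
    layers.workspaceInstructions ?? "",
    layers.userInstructions ?? "",
    renderMemoryBlock(layers.memories),
    layers.operationalPrompt, // appended last, as the commit describes
  ].filter((p) => p.trim().length > 0);
  return { prompt: parts.join("\n\n"), memoryIds: layers.memories.map((m) => m.id) };
}
```

Returning the ids alongside the string mirrors the commit's note that the builder hands back the memory ids used, which keeps the last-used update out of the assembly step itself.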

The add_memory tool is only added to the active tool set when policy.allow_memory is true. When a memory saves, the backend emits an SSE memory_saved frame, which the frontend renders as a small amber pill in the chat linking to /account. When a save is skipped, the tool result carries one of seven reasons: memory_disabled, invalid_content, empty_content, too_long, cap_reached, pin_cap, internal_error.
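
The saved/skipped result can be modelled as a discriminated union. The sketch below is hypothetical: `handleAddMemory`, `MemoryPolicy`, and the injected `save` callback are illustrative names, and the mapping from each check to a reason is a guess at the intent, not the fork's handler. The reason strings and the 23514 errcode are taken from the commit body; `cap_reached` is declared but not produced here because the source does not say when it fires.

```typescript
type SkipReason =
  | "memory_disabled"
  | "invalid_content"
  | "empty_content"
  | "too_long"
  | "cap_reached"
  | "pin_cap"
  | "internal_error";

type AddMemoryResult =
  | { status: "saved"; id: string }
  | { status: "skipped"; reason: SkipReason };

// Hypothetical policy slice with the two fields this sketch needs.
interface MemoryPolicy {
  allow_memory: boolean;
  memory_max_chars: number;
}

// 23514 is Postgres check_violation, the code the pin-cap trigger raises.
function isPinCapViolation(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === "23514";
}

const skipped = (reason: SkipReason): AddMemoryResult => ({ status: "skipped", reason });

// Validate the tool argument, then delegate persistence to `save`
// (a stand-in for the transactional insert), mapping failures to reasons.
function handleAddMemory(
  raw: unknown,
  policy: MemoryPolicy,
  save: (content: string) => string,
): AddMemoryResult {
  if (!policy.allow_memory) return skipped("memory_disabled");
  if (typeof raw !== "string") return skipped("invalid_content");
  const content = raw.trim();
  if (content.length === 0) return skipped("empty_content");
  if (content.length > policy.memory_max_chars) return skipped("too_long");
  try {
    return { status: "saved", id: save(content) };
  } catch (err) {
    return skipped(isPinCapViolation(err) ? "pin_cap" : "internal_error");
  }
}
```

Because every failure path returns a value instead of throwing, a rejected memory never has to interrupt the chat turn that triggered it.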

The memories API lives at /me/memories with a per-user rate limiter that applies to writes only (30/min by default, env-configurable); reads share the general limiter. Clear-all is POST /clear with body { confirm: true } rather than DELETE ?confirm=1, a deliberate CSRF-safe choice noted in the design revision. Export returns chat_id as a bare uuid without hydrating chat metadata, to avoid surfacing titles of chats the caller no longer has access to.
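
The request-level guards on these routes are small enough to sketch. The helper names and the `MemoryRow` shape below are hypothetical. The strict `confirm: true` body and the bare `chat_id` export come from the writeup, and the UUID guard on `/:id` routes comes from the backend commit body, which says it returns 404 so that "clear" and "export" never reach the database as a bad uuid.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Only a JSON body with the boolean true counts as confirmation;
// the string "true" or a query parameter does not.
function isClearConfirmed(body: unknown): boolean {
  return typeof body === "object" && body !== null && (body as { confirm?: unknown }).confirm === true;
}

// Guard for PATCH/DELETE /me/memories/:id so literal path segments such as
// "clear" or "export" are rejected before any query runs.
function isMemoryId(param: string): boolean {
  return UUID_RE.test(param);
}

// Hypothetical row shape for the export payload.
interface MemoryRow {
  id: string;
  content: string;
  pinned: boolean;
  chat_id: string | null;
}

// Copy only the whitelisted fields, so joined chat metadata (titles and the
// like) can never leak into the export even if the query selected it.
function toExportRow(row: MemoryRow & Record<string, unknown>): MemoryRow {
  return { id: row.id, content: row.content, pinned: row.pinned, chat_id: row.chat_id };
}
```

Whitelisting fields in the mapper, instead of deleting known-sensitive ones, keeps the export safe when new columns are added to the query later.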

So what

Two independently portable pieces. The layered prompt assembly is the more universally useful: it makes workspace and user instructions actually flow into the model, which is a correctness fix, not a new feature. Pull `lib/promptAssembly.ts` with the `use_layered_prompt` flag so you have a rollback gate. The memory feature is well-bounded and defensible at the implementation level (trigger-enforced pin cap, transactional LRU eviction, fenced-code-block rendering against injection), but `allow_memory` defaults off for a reason: model-captured memories in a legal context are matter-specific data leaking into long-term per-user storage. Don't enable it without deliberate policy review.

Commits in this thread

5 commits from cpatpa/PIP, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`1a998776`	docs: add phases 9-14 to roadmap and detailed Phase 9 (Memory) design	Claude	2026-05-16
commit body
Adds the six features chosen for porting from open-webui as new roadmap phases in the agreed quick-wins-first order:
- Phase 9: Memory (persistent user facts)
- Phase 10: Web search
- Phase 11: Groups and granular permissions
- Phase 12: Multi-model side-by-side
- Phase 13: Vector RAG with embeddings (pgvector)
- Phase 14: Knowledge collections

Also adds the first detailed design doc, phase-9-memory.md, covering data model, system-prompt assembly changes, the add_memory tool with explicit guardrails against matter-specific capture, API surface, frontend UI, admin controls, audit events, retention treatment, migration plan, rollout, risks, open questions and acceptance criteria.

Parked items in the roadmap are extended to record what was considered and rejected (image generation, user-uploaded plugins, anonymous chat, channels) and what is parked but may return (Whisper dictation, PWA shell).
`84aafcd6`	docs: revise Phase 9 design after pre-implementation audit	Claude	2026-05-16
commit body
Audit of the existing codebase surfaced gaps the original design glossed over. Corrected the design before writing code.

Architecture findings folded in:
- The four-layer system-prompt assembly described in 01-architecture.md is NOT implemented today. chatTools.ts:85 hardcodes SYSTEM_PROMPT as a constant. users.custom_instructions is stored but never read into a prompt. workspaces.instructions is stored but never read. systemPromptExtra is only used by projectChat.ts for document context. Phase 9 now builds the full assembly chain in a new promptAssembly.ts helper, behind a use_layered_prompt feature flag (default true) on org_settings for emergency rollback.
- Two chat code paths exist (routes/chat.ts and routes/projectChat.ts). Both call the new builder.
- Auth uses res.locals.userId / res.locals.userRole, not req.user. Updated the API surface notes accordingly.
- Audit events are emitted inline (INSERT INTO audit_events ...) following the lib/users.ts pattern. The phantom lib/audit.ts reference is removed from the file list.
- pipApi exports flat functions (listProjects, createProject, ...). Updated the planned API to listMemories, createMemory, updateMemory, deleteMemory, clearMemories, exportMemories.
- The admin page is /admin/policy/, not /admin/ai-policy/.

Concrete safety corrections:
- LRU eviction under concurrent add_memory calls now runs inside a single transaction with SELECT ... FOR UPDATE on the user's rows, eliminating the race where two callers both pass the count check.
- Pin cap (10) is enforced by a Postgres trigger (enforce_user_memory_pin_cap) at INSERT and UPDATE OF pinned. Raises 23514 when exceeded. App layer translates to 409.
- Memory content is rendered as a fenced code block in the system prompt with explicit PINNED: / OTHER: subheadings. Hostile content (e.g. lines starting with ## or ---) cannot alter the document's markdown structure.
- DELETE /api/me/memories?confirm=1 replaced with POST /api/me/memories/clear body { confirm: true }. CSRF-safe: a JSON-body POST is not issuable cross-origin without explicit fetch.
- POST /api/me/memories forces source='user' server-side and ignores any source field in the body.
- Export endpoint returns chat_id as a bare uuid; does not hydrate chat title or other chat metadata, to avoid leaking metadata of chats the caller can no longer access.
- Audit actor for model-driven memory.create is the chat owner; metadata carries actor_is_system: true and the chat_id.

Migration corrected:
- Adds touch_updated_at trigger for the new table, matching the convention in 0002_users.sql, 0006_projects.sql, etc.
- Adds the pin-cap trigger function and trigger.
- Adds RLS enable + force + revoke from anon/authenticated, in line with 0011_rls.sql.
- Adds use_layered_prompt to org_settings as the rollback switch.
- Confirmed next migration number is 0022 (last is 0021_branding).

LlmPolicy interface gains allow_memory, memory_max_per_user, memory_max_chars, and use_layered_prompt loaded from org_settings. loadLlmPolicy() selects the new columns.
`085a9d78`	feat(backend): Phase 9 memory + layered system prompt	Claude	2026-05-16
commit body
Implements the Phase 9 memory feature and the four-layer system-prompt assembly the architecture doc describes. Both are gated by org_settings switches that default in opposite directions: use_layered_prompt defaults true (so the new assembly is active immediately); allow_memory defaults false (so memory only activates when an admin opts in).

Migration 0022_user_memories.sql:
- user_memories table with cascade on users and SET NULL on chats.
- touch_updated_at trigger matching 0002_users.sql.
- enforce_user_memory_pin_cap trigger that raises 23514 when the 10-pin-per-user cap would be exceeded.
- RLS enabled + forced + revoked from anon/authenticated.
- org_settings gains allow_memory, memory_max_per_user (1-500, default 50), memory_max_chars (1-2000, default 500), and use_layered_prompt (default true). Two CHECK constraints bound the integer limits.

lib/memories.ts:
- listMemoriesForUser, createUserMemory, createModelMemory, updateUserMemory, deleteUserMemory, clearUserMemories.
- All mutations run inside a transaction with SELECT ... FOR UPDATE on the user's rows, eliminating the LRU eviction race when two concurrent add_memory calls arrive at the cap.
- isPinCapViolation translates the trigger's 23514 errcode into a typed PinCapExceededError.
- getInjectableMemoryBlock renders the memory list as a fenced code block with PINNED:/OTHER: subheadings and a hallucination reminder, defusing markdown injection (e.g. memory content starting with ## or ---).
- stampLastUsed batches one UPDATE for the ids that were injected.

lib/promptAssembly.ts:
- buildLayeredSystemPrompt assembles: org_settings.org_system_prompt (or fallback) -> workspace.instructions (when present) -> users.custom_instructions (when allow_user_instructions) -> memory block (when allow_memory) -> OPERATIONAL_PROMPT.
- Returns the assembled string and the memory ids used so the caller can stamp last_used_at after the response.

chatTools.ts:
- Split the hardcoded SYSTEM_PROMPT into the legacy constant (kept intact for the use_layered_prompt=false rollback path) and a new OPERATIONAL_PROMPT export containing only the technical conventions (citation markers, DOCX rules, workflows, doc naming). The layered builder appends OPERATIONAL_PROMPT last so per-user/workspace policy is never instructed away.
- buildMessages now accepts a systemPromptOverride parameter; when set it replaces SYSTEM_PROMPT for that turn. doc-availability and systemPromptExtra still append.
- New MEMORY_TOOLS export with the add_memory tool, only added to the active tool set when memoryContext.policy.allow_memory is true.
- runToolCalls dispatches add_memory through handleAddMemoryCall, which validates content, calls createModelMemory atomically, and returns a saved/skipped result. Skipped reasons: memory_disabled, invalid_content, empty_content, too_long, cap_reached, pin_cap, internal_error.
- AssistantEvent union gains memory_saved and memory_skipped.
- Saved memories emit an SSE memory_saved frame so the frontend can render the in-chat pill.

llmPolicy.ts:
- LlmPolicy interface extended with allow_memory, memory_max_per_user, memory_max_chars, use_layered_prompt, org_system_prompt, allow_user_instructions.
- loadLlmPolicy selects all six new columns; defaults handle the rollback case where org_settings is missing the new columns.

Routes (chat.ts and projectChat.ts):
- Both code paths now call buildLayeredSystemPrompt when use_layered_prompt is true, pass the result as systemPromptOverride to buildMessages, and call stampLastUsed in finally for the memory ids that were used. Assembly failure falls back to the legacy path rather than blocking the chat.
- memoryContext { chatId, policy } is threaded into runLLMStream when allow_memory is true.

routes/memories.ts:
- GET /me/memories list (always permitted; writes gated by allow_memory).
- POST /me/memories create (source forced to 'user').
- PATCH /me/memories/:id update content/pinned (ownership via WHERE user_id, pin cap via trigger).
- DELETE /me/memories/:id delete one.
- POST /me/memories/clear delete all, body { confirm: true }. CSRF-safe (JSON-body POST not issuable cross-origin without explicit fetch).
- GET /me/memories/export JSON export. chat_id returned as bare uuid; chat metadata not hydrated to avoid leaking metadata of chats the caller no longer has access to.

UUID_RE guard on PATCH and DELETE returns 404 rather than letting "clear"/"export" reach the DB as a bad uuid.

httpErrors.ts: new sendConflict(res, detail) for 409 responses, used by the pin-cap path.

index.ts: mounts /me/memories with a per-user write-only limiter (30/min default, env-configurable via RATE_LIMIT_MEMORY_WRITE_*). Reads share the general limiter.

Type-check: npx tsc --noEmit passes clean. Frontend, admin UI, and acceptance tests follow in subsequent commits.
`792fc3ec`	feat(frontend): Phase 9 memory UI and admin policy controls	Claude	2026-05-16
commit body
Backend additions:
- GET /me now returns a policy slice { allow_memory, memory_max_per_user, memory_max_chars, allow_user_instructions } so frontend panels can decide whether to render without an extra request.
- PATCH /admin/org-settings accepts allow_memory, memory_max_per_user, memory_max_chars, and use_layered_prompt. Memory cap and char limits are bounds-checked at the route layer (1-500 and 1-2000 respectively) so admins see a clean 400 rather than a DB CHECK violation.

Frontend additions:
- frontend/src/app/(pages)/account/MemoriesPanel.tsx: renders nothing when allow_memory is false at the org level. When on, shows the list ordered as the injector orders it (pinned first, then by last_used_at descending), with add/edit/pin/unpin/delete actions and a confirm-twice Clear all. Export downloads the JSON blob from /me/memories/export.
- Pin toggle is disabled when pinned count is at 10 (server-side trigger enforces, frontend just prevents the click).
- Char counter and Add button are gated on the org's memory_max_chars and memory_max_per_user.
- Mounted on the existing /account page below CustomInstructionsPanel.
- shared/types.ts AssistantEvent union gains memory_saved with preview text.
- useAssistantChat.ts demuxes the SSE memory_saved event into pushEvent.
- AssistantMessage.tsx renders the memory_saved event as a small amber pill linking to /account ("🧠 Memory saved: ..."). No separate component; the pill is plain markup colocated with the other event renderers.

Admin policy page (/admin/policy):
- New "Memory" section with: allow_memory toggle, memory_max_per_user input, memory_max_chars input, and a use_layered_prompt toggle. Inline help text explains the layered-prompt switch is the emergency rollback for the new assembly chain and should be left on unless troubleshooting.

pipApi.ts OrgSettings interface extended with allow_memory, memory_max_per_user, memory_max_chars, use_layered_prompt; adminUpdateOrgSettings payload widens to accept the same.

Type-check: both backend and frontend `tsc --noEmit` pass clean.
`e9b428e5`	Merge Phase 9: Memory + layered system prompt	Claude	2026-05-16
commit body
Brings in the design docs for Phases 9-14, the revised Phase 9 design after the pre-implementation audit, and the full implementation of Phase 9 (memory) plus the four-layer system-prompt assembly that the architecture doc has been describing aspirationally.

Both new features are gated by org_settings switches that default in opposite directions:
- use_layered_prompt defaults true (assembly active immediately).
- allow_memory defaults false (memory only activates when an admin opts in via Admin -> Policy).

Migration 0022_user_memories.sql ships with the table, triggers (touch_updated_at, 10-pin cap), RLS, and four new org_settings columns with CHECK constraints. The migration runner applies it on backend boot.

No behaviour changes for end users until the admin turns memory on. The layered prompt is functionally equivalent to the legacy SYSTEM_PROMPT for an account with no workspace instructions, no user custom instructions, and no memories - which is everyone on day one.
