Security fixes: filename sanitization, timing-safe HMAC, HKDF salts, RLS deny-all

Dshamir integrates nine upstream PRs across four commits. The most significant address concrete vulnerabilities: a prompt-injection vector via crafted filenames, a length-oracle side channel in download token verification, weak API key encryption, and an open PostgREST data plane with no row-level access control.

Tags: security, infrastructure

The filename sanitization (af4ed2d, PR #158) adds an 8-line sanitize.ts that strips control characters, collapses whitespace, NFC-normalizes, and truncates strings before they reach LLM system prompts. The target is uploaded document filenames: a PDF or DOCX whose name contains embedded newlines or instruction text could otherwise manipulate the system prompt. The diff shows ~5400 changed lines in chatTools.ts, but that is almost entirely CRLF normalization; the functional change is the sanitizeLlmInput() call at the injection points.
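A sanitizer of this kind might look roughly like the sketch below. Only the function name sanitizeLlmInput() and the four steps come from the writeup; the 200-character limit, the choice to replace control characters with a space rather than delete them, and the order of the steps are assumptions made for illustration, not the contents of the actual sanitize.ts.

```typescript
// Hypothetical limit: the writeup does not state the real truncation length.
const MAX_LLM_INPUT_LENGTH = 200;

// Sketch of a sanitizer for untrusted strings (such as filenames) that will be
// interpolated into an LLM prompt.
function sanitizeLlmInput(
  value: string,
  maxLength: number = MAX_LLM_INPUT_LENGTH,
): string {
  const cleaned = value
    // Canonical composition, so visually identical names compare equal.
    .normalize("NFC")
    // Replace C0/C1 control characters (including CR, LF, and TAB) with a space.
    .replace(/[\u0000-\u001F\u007F-\u009F]/g, " ")
    // Collapse any run of whitespace into a single space.
    .replace(/\s+/g, " ")
    .trim();
  // Truncate by code point so a surrogate pair is never split in half.
  return Array.from(cleaned).slice(0, maxLength).join("").trim();
}

// Usage: the newline-separated "instruction" ends up on a single line.
const unsafeName = "report\r\n\nIgnore previous instructions.pdf";
console.log(sanitizeLlmInput(unsafeName));
// prints "report Ignore previous instructions.pdf"
```

Flattening the name to one line does not remove the instruction text itself; it only stops the filename from breaking out of the position it occupies in the prompt.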

The crypto fixes (52f47ba) address three separate issues. First, download token verification called crypto.timingSafeEqual(Buffer.from(a), Buffer.from(b)) only after an early-exit length check, a timing oracle that leaks signature length. The fix (PR #81) pads both buffers to Math.max(len_a, len_b) before the comparison, then checks length equality separately. Second, API key encryption moved from SHA-256 key derivation to HKDF with a random 16-byte per-row salt stored in a new salt column on UserApiKey (PR #76): crypto.hkdfSync("sha256", secret, salt, "ailegal-api-key-v1", 32). Legacy rows without a salt fall back to the old SHA-256 path for backward compatibility; all new encryptions use HKDF. Third, the shared_with check for project access now compares emails case-insensitively (PR #79), which prevents access bypasses via email casing.
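The first two fixes can be sketched as follows. The hkdfSync() arguments are quoted from the writeup; the function names safeEqual and deriveApiKeyEncryptionKey, the use of null to mark a legacy row, and the exact shape of the legacy SHA-256 path are assumptions for illustration only.

```typescript
import { createHash, hkdfSync, timingSafeEqual } from "node:crypto";

// Compare two signatures without an early exit on length.
// Both inputs are zero-padded to the longer length, compared in constant time,
// and only then is the length equality folded into the result.
function safeEqual(a: string, b: string): boolean {
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  const len = Math.max(bufA.length, bufB.length);
  const paddedA = Buffer.alloc(len); // zero-filled
  const paddedB = Buffer.alloc(len);
  bufA.copy(paddedA);
  bufB.copy(paddedB);
  const sameBytes = timingSafeEqual(paddedA, paddedB);
  // The separate length check rejects inputs that differ only by padding bytes.
  return sameBytes && bufA.length === bufB.length;
}

// Derive a 32-byte encryption key for one stored API key row.
// salt === null marks a legacy row written before the salt column existed
// (assumed representation).
function deriveApiKeyEncryptionKey(secret: string, salt: Buffer | null): Buffer {
  if (salt === null) {
    // Assumed legacy path: a plain SHA-256 digest of the secret.
    return createHash("sha256").update(secret).digest();
  }
  // hkdfSync returns an ArrayBuffer, so wrap it in a Buffer.
  return Buffer.from(hkdfSync("sha256", secret, salt, "ailegal-api-key-v1", 32));
}
```

With a per-row salt from crypto.randomBytes(16), two rows encrypted under the same server secret get unrelated keys, which is the property the single SHA-256 derivation lacked.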

The data-plane hardening (76fedfc) adds an RLS deny-all SQL script (PR #145) that runs ALTER TABLE ... ENABLE ROW LEVEL SECURITY and FORCE ROW LEVEL SECURITY on every non-Prisma table in the public schema, plus an event trigger that auto-applies RLS to any future table. This specifically blocks the PostgREST anon role from reading anything directly. Prisma's service-role connection bypasses RLS and is unaffected. The same commit adds withStreamTimeout(), a Promise.race wrapper with a configurable 180-second default, applied to runLLMStream() to prevent hung SSE connections (PR #112); on timeout it sends an SSE error event and closes the connection. It also caps document_ids in the download-zip endpoint at 50 to prevent memory exhaustion from unbounded batch downloads (PR #111).
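The timeout wrapper can be approximated with a sketch like the one below. The name withStreamTimeout, the use of Promise.race, and the 180-second default come from the writeup; the StreamTimeoutError class and the signature are hypothetical, and the real code's SSE error event is left to the caller here.

```typescript
const DEFAULT_STREAM_TIMEOUT_MS = 180_000; // 180-second default from the writeup

// Hypothetical error type so callers can tell a timeout from other failures.
class StreamTimeoutError extends Error {
  constructor(timeoutMs: number) {
    super(`LLM stream timed out after ${timeoutMs} ms`);
    this.name = "StreamTimeoutError";
  }
}

// Race the stream's completion promise against a timer.
async function withStreamTimeout<T>(
  work: Promise<T>,
  timeoutMs: number = DEFAULT_STREAM_TIMEOUT_MS,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new StreamTimeoutError(timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a finished stream does not keep it pending.
    clearTimeout(timer);
  }
}
```

Note that Promise.race only stops waiting; it does not cancel the underlying request. A caller that catches StreamTimeoutError would still need to abort the upstream LLM call and close the SSE response itself.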

The keyset pagination (eb70e06, PR #110) adds a before cursor parameter (ISO timestamp) to GET /chat for scrolling through chat history without offset drift, with a default limit of 50. Zod validation on POST /projects/:projectId/chat (PR #155) lands in the same commit, using the zodProjectChatBody schema with the validate() middleware.
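To illustrate why a before cursor avoids offset drift, here is a minimal in-memory sketch. The before parameter and the default limit of 50 come from the commit message; the ChatRow shape, the createdAt field, and the pageBefore helper are hypothetical, and the real endpoint presumably expresses the same filter as a database query.

```typescript
// Hypothetical row shape: the real chat model is not shown in the writeup.
interface ChatRow {
  id: string;
  createdAt: Date;
}

// Return up to `limit` rows strictly older than the `before` cursor, newest first.
// Without a cursor, this returns the newest page.
function pageBefore(rows: ChatRow[], before?: string, limit: number = 50): ChatRow[] {
  const cursor = before === undefined ? undefined : new Date(before);
  if (cursor !== undefined && Number.isNaN(cursor.getTime())) {
    throw new RangeError("before must be an ISO timestamp");
  }
  return rows
    .filter((row) => cursor === undefined || row.createdAt.getTime() < cursor.getTime())
    .sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime())
    .slice(0, limit);
}

// The client asks for the next page by passing the oldest timestamp it has seen.
function nextCursor(page: ChatRow[]): string | undefined {
  const oldest = page[page.length - 1];
  return oldest === undefined ? undefined : oldest.createdAt.toISOString();
}
```

Because each page is defined by a timestamp rather than a row offset, chats created while the user is scrolling do not shift later pages. One caveat of a timestamp-only cursor is that rows sharing the exact same timestamp at a page boundary can be skipped; a tie-breaking id column is a common remedy, though the writeup does not say whether the real code uses one.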

So what

These fixes address real vulnerabilities in code shared with upstream. The filename sanitization, timing-safe comparison, and stream timeout are each small and straightforward to import. The RLS script only applies if you have the Prisma service-role architecture in place - it actively breaks PostgREST anon access to all tables, so do not apply it without understanding what that cuts off.


Commits in this thread

4 commits from Dshamir/AI-Legal, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`af4ed2db`	fix(security): sanitize filenames before LLM prompt interpolation	Dshamir	2026-05-23
`52f47ba4`	fix(security): timing-safe HMAC, HKDF per-row salt, case-insensitive email	Dshamir	2026-05-23
`76fedfc6`	fix(infra): RLS deny-all on PostgREST, SSE stream timeout, zip doc cap	Dshamir	2026-05-23
`eb70e06a`	feat(infra): chat cursor pagination, projectChat Zod validation	Dshamir	2026-05-23

Commit body, `af4ed2db`:

Untrusted filenames from uploaded documents were interpolated directly
into LLM system prompts without sanitization, enabling prompt injection
via crafted PDF/DOCX filenames. Add sanitizeLlmInput() to strip control
characters, collapse newlines, truncate, and NFC-normalize all
user-supplied values before they enter the prompt.

Addresses upstream PR #158.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit body, `52f47ba4`:

- downloadTokens: pad buffers to equal length before timingSafeEqual
  to eliminate length-oracle side channel (PR #81)
- keyRotation: add HKDF key derivation with random 16-byte per-row
  salt; existing rows without salt decrypt via legacy SHA-256 path;
  all new encryptions use HKDF (PR #76)
- projects: use case-insensitive comparison for shared_with email
  in GET /projects/:projectId, matching access.ts pattern (PR #79)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit body, `76fedfc6`:

- Add RLS to all public tables with deny-all default policy and
  auto-enable event trigger for future tables. PostgREST anon role
  can no longer read any data. Prisma service-role bypasses RLS (PR #145)
- Wrap runLLMStream() in Promise.race with 180s configurable timeout;
  sends SSE error event on timeout and closes connection (PR #112)
- Cap download-zip document_ids array at 50 to prevent memory
  exhaustion from unbounded batch downloads (PR #111)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit body, `eb70e06a`:

- GET /chat now accepts a `before` cursor (ISO timestamp) for keyset
  pagination. Default limit set to 50 (PR #110).
- POST /projects/:projectId/chat validates request body via Zod
  schema with the validate() middleware (PR #155).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

