Lef-F/mike#2: Cherry-pick upstream + post-merge code review fixes (security, dedup, MCP encryption)

🟢 open · #2 · Lef-F/mike ← Lef-F/mike · opened 2mo ago by Lef-F · self · +7,005-2,678 across 69 files · ↗ on GitHub

From the PR description

Summary

Two layers of work on top of main:

Cherry-picked 8 upstream PRs from willchen96/mike (six security/correctness fixes, OpenRouter, MCP Connectors). Original authorship preserved on every commit; PR provenance in each commit body.
Applied a 13-commit code review of the merged result - 4 critical security fixes the upstream PRs left exposed (or introduced), 8 real bugs, and 5 code-quality consolidations including encrypting MCP credentials at rest with the same AES-256-GCM layer PR #42 added for LLM keys.

Layer 1 - Upstream cherry-picks

Commit	PR	Author	What
`575b41d`	#21	@Metbcy	`fix(security): fail fast when DOWNLOAD_SIGNING_SECRET missing`
`e573204`	#29	@fayerman-source	`fix(security): remove persisted raw Claude stream log`
`6c488bd`	#28	@fayerman-source	`fix(projects): validate folder ownership before mutations`
`8a05fe2`	#2	@ryanmcdonough	`fix(chat): require project access on chat creation endpoint`
`1cdbd33`	#11	@becker-charles	`feat: Add OpenRouter as third LLM provider`
`284890d`	#32	@ZachLaik	`feat(mcp): add user-configurable Connectors with OAuth 2.1` (squashed from 3)
`4f988f8`	#13	@ryanmcdonough	`fix(security): add RLS policies to projects, chats, chat_messages`
`9979566`	#42	@kveton	`fix(security): harden data access, document uploads, secret handling`

git log --format='%an' shows the original author for every cherry-picked commit. Five chore(self-host): commits sit alongside (f4d8045, 11da4b9, 9f40245, c5e6141, 597a219) where the cherry-picks needed the docker stack's compose / migrations / Dockerfile to learn about new env vars and migration filename collisions.

Layer 2 - Code review fixes

A six-agent review of the merged tree turned up critical security holes the upstream PRs left exposed and a handful of real bugs. All landed as fix-up commits attributed to us (the upstream-attributed commits are unchanged).

Tier 1 - Critical security (`c5741fc`, `d5e0f74`, `da0bc6d`, `31c1817`)

mcp/oauth.ts had ?? \"dev-secret\" fallback for the HMAC signing secret, re-opening the exact vulnerability commit 575b41d had just closed for download tokens. Forge an OAuth state token for any {user_id, server_id} and complete the MCP OAuth flow as a victim. Now reuses lib/downloadTokens.ts's getSigningSecret() helper - single throw-on-missing source of truth.
004_security_lockdown.sql skipped user_mcp_servers - the table holding headers (Bearer tokens) and oauth_tokens. Every other user-data table was locked down behind service-role-only; the most secret table was exempt. Now revoked in both 004 and the matching block in 000_one_shot_schema.sql.
mcpServers.ts URL allowlist permitted SSRF. Allowed all HTTPS hosts and all http://localhost/127.0.0.1. From the backend container, http://garage:3900 and http://10.0.0.5/admin were reachable through. Added IP-literal screens for 0/8, 10/8, 127/8, 169.254/16, 172.16/12, 192.168/16, IPv6 ::1/fc00::/7/fe80::/10/::, and single-label hostnames. Gated behind MCP_ALLOW_PRIVATE_HOSTS=true so laptop devs keep the permissive default. Same commit also fixes a token-leak bug: PATCH /user/mcp-servers/:id would let url change while keeping oauth_tokens from the previous origin.
No size cap on MCP tool output. A misbehaving connector returning multi-megabyte responses would blow the LLM context window, run up token cost, and DoS the chat. Added MCP_MAX_TOOL_BYTES cap (default 64 KB) with a truncation marker, and replaced the fragile content.startsWith(\"MCP tool '<name>' \") ok-detection with a structured { ok, content, truncated } return shape.

Tier 2 - Real bugs (`908f8e1`, `3d46ed4`, `2740fa8`, `a96867e`)

tabular.ts .contains(\"shared_with\", [array]) - same jsonb form bug we fixed in projects.ts / access.ts; the catch silently logged it as "column hasn't been migrated yet", masking it. Direct-share standalone tabular reviews never appeared for recipients. Plus case-insensitive email comparisons in three sites: stored shared_with is lowercased on PATCH/POST but reads passed JWT email raw, and providers can issue mixed-case email claims.
Three OpenRouter bugs in PR #11: unconditional console.log per SSE chunk (production log flood), tool-call id never backfilled when the model defers it to a later delta (multi-turn tool use breaks), and anthropic-via-openrouter model IDs used 4.6/4.7 (period) instead of OpenRouter's 4-6/4-7 (hyphen) slug - every call would 404.
PDF magic-byte check pinned to offset 0 rejected legitimate PDFs with a leading newline (PDF spec allows %PDF- anywhere in first 1024 bytes). Plus the DOCX zip-bomb guard read JSZip's _data.uncompressedSize private API which returns undefined for store-only entries - store-compressed bombs bypassed the check. Now decompresses each entry via the public entry.async(\"nodebuffer\") and accumulates against a 100 MB cap.
Three frontend bugs from PR #42: ensureProfile() not awaited in AuthContext → first-load race where a 404 makes the UI flash a fallback profile; TabularReviewView built apiKeys without openrouterApiKey so OpenRouter tabular runs were silently blocked; tooltip on disabled model row said "Add a Gemini key" for OpenRouter models.

Tier 3 - Code quality / dedup (`ca110ef`, `5355113`, `701535b`, `693f133`, `c4ff092`)

handleDocumentUpload was duplicated verbatim across documents.ts and projects.ts (-212 lines after consolidation).
Boolean→\"configured\" sentinel mapping was repeated four times across ChatInput, TRChatPanel, TabularReviewView, account/models (the source of the TabularReviewView openrouter regression). Extracted apiKeysFromProfile(profile) helper.
Three-fold provider-key encrypt/decrypt mapping in userSettings.ts and routes/user.ts collapsed into shared exports from apiKeys.ts (PROVIDER_KEY_COLUMNS, buildPlaintextUpgrades, encryptApiKeyInputs).
Three different b64url implementations (downloadTokens.ts, apiKeys.ts, mcp/oauth.ts) consolidated to Node's native Buffer.toString(\"base64url\") (-36 lines).
MCP credential at-rest encryption (the deferred-by-design item in migration 003's comment): headers, oauth_tokens, and oauth_code_verifier now go through the same AES-256-GCM enc:v1: envelope as LLM keys. Lazy-upgrade pattern: on first read, plaintext rows are decrypted as identity and a fire-and-forget UPDATE encrypts them. New helpers encryptJsonBlob / decryptJsonBlob / needsJsonBlobUpgrade in apiKeys.ts. Test suite went 11→17.

Verification

Smoke-tested the merged stack end-to-end on http://localhost:

All long-lived containers healthy; both init containers exited 0
/backend/health, /auth/v1/health, /backend/projects, /backend/chat, /backend/user/mcp-servers all 200
Gemini chat streamed PONG4 end-to-end
Existing plaintext Gemini API key opportunistically encrypted on first authenticated request (DB row went from 39-char AIzaSyDRmb... to 99-char enc:v1:...)
tsc --noEmit -p backend and tsc --noEmit -p frontend both clean
npm test 17/17 pass

Credit

Every cherry-picked commit preserves its original author in git log. The 13 fix-up commits in Layer 2 are attributed to us, not to the upstream authors - they're our reactions to a code review of the merged result, not changes to upstream's intent. Several of the Tier 1 / Tier 2 fixes (especially the MCP dev-secret fallback, user_mcp_servers lockdown gap, OpenRouter model-slug bugs, jsonb contains form in tabular) are good upstream PR candidates independently of this fork's docker stack.

Cherry-pick upstream + post-merge code review fixes (security, dedup, MCP encryption)

From the PR description

Summary

Layer 1 - Upstream cherry-picks

Layer 2 - Code review fixes

Tier 1 - Critical security (`c5741fc`, `d5e0f74`, `da0bc6d`, `31c1817`)

Tier 2 - Real bugs (`908f8e1`, `3d46ed4`, `2740fa8`, `a96867e`)

Tier 3 - Code quality / dedup (`ca110ef`, `5355113`, `701535b`, `693f133`, `c4ff092`)

Verification

Credit

Our analysis

Capture this PR into my fork

Cherry-pick upstream + post-merge code review fixes (security, dedup, MCP encryption)

From the PR description

Summary

Layer 1 - Upstream cherry-picks

Layer 2 - Code review fixes

Tier 1 - Critical security (c5741fc, d5e0f74, da0bc6d, 31c1817)

Tier 2 - Real bugs (908f8e1, 3d46ed4, 2740fa8, a96867e)

Tier 3 - Code quality / dedup (ca110ef, 5355113, 701535b, 693f133, c4ff092)

Verification

Credit

Our analysis

Capture this PR into my fork

Tier 1 - Critical security (`c5741fc`, `d5e0f74`, `da0bc6d`, `31c1817`)

Tier 2 - Real bugs (`908f8e1`, `3d46ed4`, `2740fa8`, `a96867e`)

Tier 3 - Code quality / dedup (`ca110ef`, `5355113`, `701535b`, `693f133`, `c4ff092`)