Cherry-pick upstream + post-merge code review fixes (security, dedup, MCP encryption)

🟢 open · #2 · Lef-F/mike ← Lef-F/mike · opened 18d ago by Lef-F · self · +7,005-2,678 across 69 files · ↗ on GitHub

From the PR description

Summary

Two layers of work on top of main:

  1. Cherry-picked 8 upstream PRs from willchen96/mike (six security/correctness fixes, OpenRouter, MCP Connectors). Original authorship preserved on every commit; PR provenance in each commit body.
  2. Applied a 13-commit code review of the merged result - 4 critical security fixes the upstream PRs left exposed (or introduced), 8 real bugs, and 5 code-quality consolidations including encrypting MCP credentials at rest with the same AES-256-GCM layer PR #42 added for LLM keys.

Layer 1 - Upstream cherry-picks

Commit PR Author What
575b41d #21 @Metbcy fix(security): fail fast when DOWNLOAD_SIGNING_SECRET missing
e573204 #29 @fayerman-source fix(security): remove persisted raw Claude stream log
6c488bd #28 @fayerman-source fix(projects): validate folder ownership before mutations
8a05fe2 #2 @ryanmcdonough fix(chat): require project access on chat creation endpoint
1cdbd33 #11 @becker-charles feat: Add OpenRouter as third LLM provider
284890d #32 @ZachLaik feat(mcp): add user-configurable Connectors with OAuth 2.1 (squashed from 3)
4f988f8 #13 @ryanmcdonough fix(security): add RLS policies to projects, chats, chat_messages
9979566 #42 @kveton fix(security): harden data access, document uploads, secret handling

git log --format='%an' shows the original author for every cherry-picked commit. Five chore(self-host): commits sit alongside (f4d8045, 11da4b9, 9f40245, c5e6141, 597a219) where the cherry-picks needed the docker stack's compose / migrations / Dockerfile to learn about new env vars and migration filename collisions.

Layer 2 - Code review fixes

A six-agent review of the merged tree turned up critical security holes the upstream PRs left exposed and a handful of real bugs. All landed as fix-up commits attributed to us (the upstream-attributed commits are unchanged).

Tier 1 - Critical security (c5741fc, d5e0f74, da0bc6d, 31c1817)

  • mcp/oauth.ts had ?? \"dev-secret\" fallback for the HMAC signing secret, re-opening the exact vulnerability commit 575b41d had just closed for download tokens. Forge an OAuth state token for any {user_id, server_id} and complete the MCP OAuth flow as a victim. Now reuses lib/downloadTokens.ts's getSigningSecret() helper - single throw-on-missing source of truth.
  • 004_security_lockdown.sql skipped user_mcp_servers - the table holding headers (Bearer tokens) and oauth_tokens. Every other user-data table was locked down behind service-role-only; the most secret table was exempt. Now revoked in both 004 and the matching block in 000_one_shot_schema.sql.
  • mcpServers.ts URL allowlist permitted SSRF. Allowed all HTTPS hosts and all http://localhost/127.0.0.1. From the backend container, http://garage:3900 and http://10.0.0.5/admin were reachable through. Added IP-literal screens for 0/8, 10/8, 127/8, 169.254/16, 172.16/12, 192.168/16, IPv6 ::1/fc00::/7/fe80::/10/::, and single-label hostnames. Gated behind MCP_ALLOW_PRIVATE_HOSTS=true so laptop devs keep the permissive default. Same commit also fixes a token-leak bug: PATCH /user/mcp-servers/:id would let url change while keeping oauth_tokens from the previous origin.
  • No size cap on MCP tool output. A misbehaving connector returning multi-megabyte responses would blow the LLM context window, run up token cost, and DoS the chat. Added MCP_MAX_TOOL_BYTES cap (default 64 KB) with a truncation marker, and replaced the fragile content.startsWith(\"MCP tool '<name>' \") ok-detection with a structured { ok, content, truncated } return shape.

Tier 2 - Real bugs (908f8e1, 3d46ed4, 2740fa8, a96867e)

  • tabular.ts .contains(\"shared_with\", [array]) - same jsonb form bug we fixed in projects.ts / access.ts; the catch silently logged it as "column hasn't been migrated yet", masking it. Direct-share standalone tabular reviews never appeared for recipients. Plus case-insensitive email comparisons in three sites: stored shared_with is lowercased on PATCH/POST but reads passed JWT email raw, and providers can issue mixed-case email claims.
  • Three OpenRouter bugs in PR #11: unconditional console.log per SSE chunk (production log flood), tool-call id never backfilled when the model defers it to a later delta (multi-turn tool use breaks), and anthropic-via-openrouter model IDs used 4.6/4.7 (period) instead of OpenRouter's 4-6/4-7 (hyphen) slug - every call would 404.
  • PDF magic-byte check pinned to offset 0 rejected legitimate PDFs with a leading newline (PDF spec allows %PDF- anywhere in first 1024 bytes). Plus the DOCX zip-bomb guard read JSZip's _data.uncompressedSize private API which returns undefined for store-only entries - store-compressed bombs bypassed the check. Now decompresses each entry via the public entry.async(\"nodebuffer\") and accumulates against a 100 MB cap.
  • Three frontend bugs from PR #42: ensureProfile() not awaited in AuthContext → first-load race where a 404 makes the UI flash a fallback profile; TabularReviewView built apiKeys without openrouterApiKey so OpenRouter tabular runs were silently blocked; tooltip on disabled model row said "Add a Gemini key" for OpenRouter models.

Tier 3 - Code quality / dedup (ca110ef, 5355113, 701535b, 693f133, c4ff092)

  • handleDocumentUpload was duplicated verbatim across documents.ts and projects.ts (-212 lines after consolidation).
  • Boolean→\"configured\" sentinel mapping was repeated four times across ChatInput, TRChatPanel, TabularReviewView, account/models (the source of the TabularReviewView openrouter regression). Extracted apiKeysFromProfile(profile) helper.
  • Three-fold provider-key encrypt/decrypt mapping in userSettings.ts and routes/user.ts collapsed into shared exports from apiKeys.ts (PROVIDER_KEY_COLUMNS, buildPlaintextUpgrades, encryptApiKeyInputs).
  • Three different b64url implementations (downloadTokens.ts, apiKeys.ts, mcp/oauth.ts) consolidated to Node's native Buffer.toString(\"base64url\") (-36 lines).
  • MCP credential at-rest encryption (the deferred-by-design item in migration 003's comment): headers, oauth_tokens, and oauth_code_verifier now go through the same AES-256-GCM enc:v1: envelope as LLM keys. Lazy-upgrade pattern: on first read, plaintext rows are decrypted as identity and a fire-and-forget UPDATE encrypts them. New helpers encryptJsonBlob / decryptJsonBlob / needsJsonBlobUpgrade in apiKeys.ts. Test suite went 11→17.

Verification

Smoke-tested the merged stack end-to-end on http://localhost:

  • All long-lived containers healthy; both init containers exited 0
  • /backend/health, /auth/v1/health, /backend/projects, /backend/chat, /backend/user/mcp-servers all 200
  • Gemini chat streamed PONG4 end-to-end
  • Existing plaintext Gemini API key opportunistically encrypted on first authenticated request (DB row went from 39-char AIzaSyDRmb... to 99-char enc:v1:...)
  • tsc --noEmit -p backend and tsc --noEmit -p frontend both clean
  • npm test 17/17 pass

Credit

Every cherry-picked commit preserves its original author in git log. The 13 fix-up commits in Layer 2 are attributed to us, not to the upstream authors - they're our reactions to a code review of the merged result, not changes to upstream's intent. Several of the Tier 1 / Tier 2 fixes (especially the MCP dev-secret fallback, user_mcp_servers lockdown gap, OpenRouter model-slug bugs, jsonb contains form in tabular) are good upstream PR candidates independently of this fork's docker stack.

Our analysis

Cherry-pick eight upstream PRs and harden the merged result — read the full analysis →

Think the analysis missed something the PR description covers?

Capture this PR into my fork

Download a Markdown prompt that tells Claude how to port every commit in this PR into your working tree. Run it via claude -p < capture-pull-2.md from inside the repo you want the changes in.

⬇ Download capture-pull-2.md