ecarjat/mike@74564426

feat: native PDF vision for scanned documents

Emmanuel Carjat · 2026-05-11 · 74564426

Add multimodal PDF processing so scanned PDFs (no text layer) can be
analysed by Gemini and Claude models instead of returning empty results.

- streamGeminiMultimodal: pass raw PDF bytes as inlineData to Gemini
- streamClaudeMultimodal: pass raw PDF bytes as document content block to Claude
- loadSourceTexts: store rawPdfBase64 when pdfjs extracts no text
- queryGeminiAllColumns: dispatch to vision path for Gemini/Claude;
  return a clear grey error cell for OpenAI (no native PDF support)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
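
The dispatch described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the commit's code: the names `LoadedSource`, `Plan`, `geminiParts`, `claudeContent` and `planQuery` are hypothetical, and only the `rawPdfBase64` field name comes from the commit message. The sketch assumes the request shapes publicly documented for each provider (an `inlineData` part for Gemini, a base64 `document` content block for Claude) and builds plain objects rather than calling any SDK.

```typescript
// Hypothetical types and names throughout; not taken from the commit's diff.
type Provider = "gemini" | "claude" | "openai";

interface LoadedSource {
  text: string;          // text extracted from the PDF's text layer
  rawPdfBase64?: string; // assumed to be kept only when extraction found no text
}

type Plan =
  | { kind: "text"; prompt: string }
  | { kind: "vision"; provider: "gemini" | "claude"; parts: unknown[] }
  | { kind: "error"; message: string };

// Gemini: PDF bytes as an inlineData part, followed by the text prompt.
function geminiParts(pdfBase64: string, prompt: string): unknown[] {
  return [
    { inlineData: { mimeType: "application/pdf", data: pdfBase64 } },
    { text: prompt },
  ];
}

// Claude: PDF bytes as a base64 document content block, then the text prompt.
function claudeContent(pdfBase64: string, prompt: string): unknown[] {
  return [
    {
      type: "document",
      source: { type: "base64", media_type: "application/pdf", data: pdfBase64 },
    },
    { type: "text", text: prompt },
  ];
}

// Decide how one source should be queried for one provider.
function planQuery(provider: Provider, source: LoadedSource, prompt: string): Plan {
  // Normal case: the PDF has a text layer, so no vision call is needed.
  if (source.text.trim().length > 0) {
    return { kind: "text", prompt: `${prompt}\n\n${source.text}` };
  }
  if (!source.rawPdfBase64) {
    return { kind: "error", message: "No text layer and no raw PDF bytes available" };
  }
  switch (provider) {
    case "gemini":
      return { kind: "vision", provider, parts: geminiParts(source.rawPdfBase64, prompt) };
    case "claude":
      return { kind: "vision", provider, parts: claudeContent(source.rawPdfBase64, prompt) };
    case "openai":
      // Mirrors the commit message: surfaced as an error cell, not an empty result.
      return { kind: "error", message: "Scanned PDF: provider has no native PDF support" };
  }
}
```

Returning a tagged plan instead of calling the provider directly keeps the routing decision testable without network access. The streaming calls themselves would sit behind the `vision` branch.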

Repository	ecarjat/mike
Author	Emmanuel Carjat <emmanuel.carjat@quanthouse.com>
Authored	2026-05-11T15:28:21+02:00
Parents	`be1665ab`
Stats	4 files changed, +180, -34
Part of	Native PDF vision for scanned documents (Gemini + Claude)

