feat: add redline-aware document extraction
Summary
- surface DOCX tracked changes and comment bubbles as inline markers for the assistant
- add optional PyMuPDF-based PDF redline extraction with pdfjs fallback
- teach chat and tabular prompts how to interpret insertions, deletions, moved text, and reviewer comments
- preserve Mammoth tabular DOCX extraction when no review markup is detected
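As a rough illustration of the first bullet, the sketch below turns WordprocessingML tracked changes into inline markers using only the Python standard library. It is not the PR's implementation: the `[INS: …]` / `[DEL: …]` marker syntax and the function names are hypothetical, and moved text (`w:moveFrom` / `w:moveTo`) and comment bubbles are left out for brevity. Only the element names `w:ins`, `w:del`, `w:r`, `w:t`, and `w:delText` come from the DOCX format itself.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _tag(name: str) -> str:
    """Return the namespace-qualified tag name ElementTree expects."""
    return f"{{{W}}}{name}"


def paragraph_with_markers(p: ET.Element) -> str:
    """Render one <w:p>, wrapping tracked changes in inline markers.

    The marker syntax here is an assumption made for this sketch.
    """
    parts = []
    for child in p:
        if child.tag == _tag("ins"):
            # Inserted runs keep their text in ordinary <w:t> elements.
            text = "".join(t.text or "" for t in child.iter(_tag("t")))
            if text:
                parts.append(f"[INS: {text}]")
        elif child.tag == _tag("del"):
            # Deleted runs store their text in <w:delText> instead.
            text = "".join(t.text or "" for t in child.iter(_tag("delText")))
            if text:
                parts.append(f"[DEL: {text}]")
        elif child.tag == _tag("r"):
            parts.append("".join(t.text or "" for t in child.iter(_tag("t"))))
    return "".join(parts)


def document_with_markers(document_xml: str) -> list[str]:
    """Return one marked-up string per paragraph in the document XML."""
    root = ET.fromstring(document_xml)
    return [paragraph_with_markers(p) for p in root.iter(_tag("p"))]


# A tiny synthetic document body with one deletion and one insertion.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:t xml:space="preserve">The fee is </w:t></w:r>'
    '<w:del w:id="1" w:author="A"><w:r><w:delText>100</w:delText></w:r></w:del>'
    '<w:ins w:id="2" w:author="A"><w:r><w:t>120</w:t></w:r></w:ins>'
    '<w:r><w:t xml:space="preserve"> dollars.</w:t></w:r>'
    '</w:p></w:body></w:document>'
)

if __name__ == "__main__":
    for line in document_with_markers(SAMPLE):
        print(line)
```

In a real DOCX the XML would come from the `word/document.xml` entry of the archive (readable with `zipfile`). A document with no `w:ins` or `w:del` elements yields plain text, which mirrors the idea of falling back to the existing extraction when no review markup is detected.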
Verification
- npm run build --prefix backend
- DOCX smoke test for insertion/deletion/comment markers using a synthetic DOCX
- PDF smoke test for red/blue/green text markers using a PyMuPDF-generated sample
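To make the PDF smoke test more concrete, here is a minimal sketch of how colored text could be mapped to inline markers. It assumes span dictionaries shaped like those PyMuPDF returns from `page.get_text("dict")`, where each span carries `text` and an integer sRGB `color`. The thresholds, the function names, and the `[RED: …]` marker syntax are hypothetical, and which color stands for an insertion, deletion, or move is not stated in the PR, so the sketch only reports the color name.

```python
from typing import Optional


def split_rgb(srgb: int) -> tuple[int, int, int]:
    """Split a packed 0xRRGGBB integer into its three channels."""
    return (srgb >> 16) & 0xFF, (srgb >> 8) & 0xFF, srgb & 0xFF


def classify_color(srgb: int, strong: int = 150, weak: int = 100) -> Optional[str]:
    """Return 'red', 'green', or 'blue' when one channel clearly dominates.

    The thresholds are illustrative assumptions, not values from the PR.
    """
    r, g, b = split_rgb(srgb)
    channels = {"red": r, "green": g, "blue": b}
    name = max(channels, key=channels.get)
    others = [v for k, v in channels.items() if k != name]
    if channels[name] >= strong and all(v <= weak for v in others):
        return name
    return None  # black, white, grey, or mixed colors are treated as plain text


def render_spans(spans: list[dict]) -> str:
    """Join spans into one string, wrapping colored spans in markers."""
    parts = []
    for span in spans:
        name = classify_color(span["color"])
        text = span["text"]
        parts.append(f"[{name.upper()}: {text}]" if name else text)
    return "".join(parts)


# Synthetic spans standing in for one line of a redlined PDF.
SPANS = [
    {"text": "Term is ", "color": 0x000000},
    {"text": "30", "color": 0xFF0000},
    {"text": "45", "color": 0x0000FF},
    {"text": " days", "color": 0x000000},
]

if __name__ == "__main__":
    print(render_spans(SPANS))
```

Keeping the classifier a pure function of the color integer means it can be checked without a PDF library installed, which suits the optional nature of the PyMuPDF path.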