Archibald312 makes spreadsheets first-class in GordonOSS

A finance-focused fork of Mike now ingests Excel and CSV files end-to-end, with the AI citing answers down to the exact cell.


Most legal-AI tools treat spreadsheets as an afterthought - flatten them to text, lose the structure, and hope the model figures it out. Archibald312 took the opposite path. GordonOSS now reads .xlsx, .xls, .xlsm, and .csv files directly, preserves formulas and merged ranges, and renders them in a dedicated viewer with sheet tabs, a formula bar, and column/row headers that behave the way a finance professional expects.
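As an illustration of the column headers mentioned above, here is a minimal sketch of how a viewer could turn a zero-based column index into spreadsheet letters. The function name `columnLetters` is hypothetical and is not taken from the fork's `XlsxView.tsx`; it only shows the standard bijective base-26 scheme that spreadsheet applications use.

```typescript
// Hypothetical helper (not from the GordonOSS source): converts a zero-based
// column index into spreadsheet-style letters using bijective base-26.
function columnLetters(index: number): string {
  if (!Number.isInteger(index) || index < 0) {
    throw new RangeError(`invalid column index: ${index}`);
  }
  let n = index + 1; // work 1-based, since there is no "zero" letter
  let letters = "";
  while (n > 0) {
    const rem = (n - 1) % 26; // 0 maps to A, 25 maps to Z
    letters = String.fromCharCode(65 + rem) + letters;
    n = Math.floor((n - 1) / 26);
  }
  return letters;
}

console.log(columnLetters(0));  // A
console.log(columnLetters(25)); // Z
console.log(columnLetters(26)); // AA
```

Plain base-26 does not work here because the sequence goes from Z straight to AA, which is why the sketch subtracts one before each division.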

The payoff is in the citations. When the AI references a number, it points to a specific cell on a specific sheet, and clicking the citation jumps you straight there with a brief yellow highlight. Export an answer to Excel and the citations come back as cell comments, closing the loop. Archibald312 also made Gemma 4 31B the default model for its more generous free-tier quota, and removed the upstream free-tier guard (freeTierGuard.ts), which had been blocking real documents from reaching free-tier Gemini.
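To make the cell-level citation concrete, the sketch below splits a `Sheet!Cell` string into its parts so a viewer could select the right tab and cell. The commit only states that `Sheet!Cell` strings are preserved in the citation's `page` field; the `parseCellCitation` name, the `CellRef` shape, and the handling of quoted sheet names and `$` anchors are assumptions made for illustration.

```typescript
// Hypothetical shape for a parsed spreadsheet citation (not the fork's type).
interface CellRef {
  sheet: string;   // sheet name with surrounding quotes removed
  column: string;  // column letters, e.g. "B"
  row: number;     // 1-based row number
  address: string; // normalized A1 address, e.g. "B7"
}

// Returns null when the string is not a Sheet!Cell reference, for example a
// plain page number such as "12".
function parseCellCitation(page: string): CellRef | null {
  const bang = page.lastIndexOf("!");
  if (bang <= 0) return null;

  let sheet = page.slice(0, bang);
  // Assumption: absolute markers ($B$7) and lowercase letters are tolerated.
  const address = page.slice(bang + 1).replace(/\$/g, "").toUpperCase();
  const match = /^([A-Z]{1,3})([1-9][0-9]*)$/.exec(address);
  if (!match) return null;

  // Excel wraps sheet names containing spaces in single quotes and doubles
  // any embedded quote: 'Q1 Summary'!B7
  if (sheet.length >= 2 && sheet.startsWith("'") && sheet.endsWith("'")) {
    sheet = sheet.slice(1, -1).replace(/''/g, "'");
  }
  return { sheet, column: match[1], row: Number(match[2]), address };
}

console.log(parseCellCitation("Revenue!B7"));
console.log(parseCellCitation("12")); // null
```

Returning `null` for anything that is not a cell reference lets ordinary page citations fall through to whatever page-based handling already exists.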

So what: Anyone using Mike for financial diligence, audit work, or numbers-heavy contract review should look at this fork. It's the first to treat spreadsheets as something more than flat text.


Commits in this thread

1 commit from Archibald312/GordonOSS, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
2ac696ce | Phase 5: Excel I/O - xlsx/xls/xlsm/csv ingestion + per-cell citations + XlsxView | Scott Rozen | 2026-05-15
commit body
Spreadsheets are now first-class documents in GordonOSS.

## Backend
- Born `backend/src/lib/extractors/` per CLAUDE.md deterministic-first rule:
  `xlsx.ts` (ExcelJS, numfmt-formatted values, formula preservation, merged
  ranges) and `csv.ts` (RFC 4180 handwritten parser) - both pure/side-effect-free
  with 10 new unit tests.
- `documents.ts`: accept xlsx/xls/xlsm/csv; xls→xlsx normalization via
  libreoffice-convert; spreadsheets skip PDF conversion; structure tree
  lists sheet names.
- `convert.ts`: `xlsToXlsx()` helper.
- `documentReading.ts`: xlsx/csv branch calls extractor+flattener; citation
  reminder appended with spreadsheet cell-address guidance when file_type is a
  spreadsheet.
- `chatTools.ts`: system prompt extended with spreadsheet citation form;
  `normalizeCitation` preserves `Sheet!Cell` strings in the `page` field.
- `models.ts`: Gemma 4 31B added as default (higher free-tier quota than
  Gemini Flash); `providerForModel` routes `gemma-*` through Gemini adapter.
- Removed `freeTierGuard.ts` and its test - guard was blocking real documents
  from free-tier Gemini. Data-privacy tier guard redesign deferred to CLAUDE.md
  "Future capabilities".
- Chat error routes now surface real `err.message` in dev instead of generic
  "Stream error".

## Frontend
- `XlsxView.tsx` (new): sheet tabs, sticky column-letter header + row-number
  gutter, read-only formula bar (cell address chip + formula/value), numfmt-
  formatted display, click-to-select, citation jump + 2.5s yellow highlight.
- `DocPanel.tsx`, `DocViewModal.tsx`, chat `page.tsx`: route xlsx/csv to
  XlsxView ahead of DocxView/DocView.
- `types.ts`: `CitationQuote.cellRef`; `expandCitationToEntries` routes
  Sheet!Cell strings; `formatCitationPage` shows cell ref verbatim.
- `exportToExcel.ts`: per-cell ExcelJS comments containing citation list.
- Upload `accept` extended to xlsx/xls/xlsm/csv in all five upload sites.
- `ModelToggle.tsx`: Gemma 4 31B added at top of Google group; set as default.
- `DocxView.tsx`: childNodes crash demoted to warn + inline fallback.
- `CLAUDE.md`: editable formula bar, generate_xlsx tool, data-privacy tier
  guard added as future capabilities.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
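
The commit body describes `csv.ts` as a handwritten, side-effect-free RFC 4180 parser. The fork's actual code is not shown here, so the following is only a minimal sketch of what such a parser can look like; the `parseCsv` name and the lenient handling of bare `\n` line breaks are assumptions.

```typescript
// Illustrative RFC 4180-style parser (not the fork's csv.ts). Pure function:
// takes the file text and returns rows of string fields.
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;
  let pending = false; // true once the current record has consumed any input

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // doubled quote inside a quoted field is a literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // commas and line breaks are literal inside quotes
      }
      continue;
    }
    if (ch === "\r" || ch === "\n") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
      pending = false;
      continue;
    }
    pending = true;
    if (ch === '"') inQuotes = true;
    else if (ch === ",") { row.push(field); field = ""; }
    else field += ch;
  }
  // Flush the last record when the file does not end with a line break.
  if (pending) { row.push(field); rows.push(row); }
  return rows;
}

console.log(parseCsv('name,amount\r\n"Smith, J.","1,200"\r\n'));
```

A character-by-character state machine is used instead of splitting on commas because quoted fields may contain commas, doubled quotes, and line breaks.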
