nwhitehouse tries a sturdier citation pipeline, then walks it back

An experiment to make AI citations more reliable on a small in-house model hit a wall - but the surrounding polish stuck.

searchchat-ui

nwhitehouse's fork runs on a tiny tuned model, and one persistent annoyance is that when the AI cites a source, the citation markers don't always render properly. The team tried a sturdier mechanism - instead of asking the model to spit out citations as free-form text, give it a dedicated "tool" it has to call for each citation, on the theory that small models follow structured tool calls more reliably than free-form formatting.

It didn't work. The model happily wrote the citation numbers in its answer but skipped the tool call, so nothing rendered. Forty minutes later the system prompt was reverted to the original approach; the tool itself stays defined, so re-enabling it later is only a prompt change. What survived: a nicer hover preview on citations (filename, page, a serif pull-quote), a fix for a small but irritating bug where clicking a second citation in an already-open document didn't scroll to the new spot, and some parser hardening that also makes the original citation format more tolerant of messy model output.
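The rescroll fix is easier to picture in code. The sketch below is a guess at its shape, not the fork's actual implementation: only upsertTab and initialScrollTop are named in the commit message, while the DocTab type, the scrollTarget field, and the merge strategy are assumptions made for illustration.

```typescript
// Hypothetical tab shape. Only `initialScrollTop` comes from the commit message.
interface DocTab {
  docId: string;
  mode: string;              // assumed values, e.g. "citation" or "browse"
  initialScrollTop?: number; // pixel offset applied when the viewer opens
  scrollTarget?: string;     // assumed: id of the citation to scroll to
}

// Insert a tab, or merge it into an already-open tab for the same document.
function upsertTab(tabs: DocTab[], next: DocTab): DocTab[] {
  const index = tabs.findIndex((tab) => tab.docId === next.docId);
  if (index === -1) {
    return [...tabs, next];
  }
  // A plain merge keeps the old initialScrollTop whenever `next` omits it,
  // which is one way the viewer could stay parked on citation #1.
  const merged: DocTab = { ...tabs[index], ...next };
  // When the incoming mode brings its own scroll target, drop the stale offset.
  if (next.scrollTarget !== undefined) {
    delete merged.initialScrollTop;
  }
  return tabs.map((tab, i) => (i === index ? merged : tab));
}
```

Under these assumptions, opening citation #2 on a tab that still carries the offset from citation #1 yields a tab with the new scrollTarget and no leftover initialScrollTop.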

So what: A useful negative result for anyone running legal AI on small in-house models - "just use tool calls" isn't a free reliability win; it depends on what the model was trained to do.


Commits in this thread

2 commits from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA | Subject | Author | Date
6321e28a | [feat-006] add_citation tool + hover popover + same-doc rescroll fix | Nick Whitehouse | 2026-05-04
Commit body:
Replaces the freeform <CITATIONS> JSON block with an explicit add_citation
tool the model invokes per [N] marker. Tool calls are far more reliable
on Olava than freeform output formats, mirroring the SLM-friendly pattern
established by feat-005's multi-pass orchestrator. Legacy block parsing
remains as a fallback so any model regression still surfaces citations.

Frontend: replaces the browser-native title= tooltip with a styled hover
popover (filename + page + serif quote). Fixes a same-doc rescroll bug
where clicking citation #2 on an already-open doc tab kept the viewer
on citation #1 - upsertTab now drops the prior initialScrollTop when
the new mode has its own scroll target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8731d95e | [feat-006] Park add_citation tool path; restore <CITATIONS> JSON prompt | Nick Whitehouse | 2026-05-04
Commit body:
Empirically the add_citation tool route was unreliable on Olava - model
wrote [N] markers but skipped the tool call, so no citations rendered.
Reverting the system prompt to the original <CITATIONS> JSON block format
restores the proven path. The add_citation tool stays defined and
dispatched (events still emit if called), and collectTurnCitations still
prefers tool-emitted citations when present - so re-engaging this path
later is just a prompt change.

Defensive parser improvements kept: page schema simplified to a string
(was oneOf, which not all tool-call parsers handle cleanly), and
normalizeCitation now coerces string-of-digits markers to integers.
Both make the legacy <CITATIONS> parser more robust to model output
variation as well.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
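
The fallback and coercion behaviour described in the second commit could look roughly like the sketch below. It is an illustration under stated assumptions rather than the fork's code: normalizeCitation, collectTurnCitations, and the <CITATIONS> block are named in the commit messages, but the function signatures, the Citation fields (marker, page, quote), and the parseCitationsBlock helper are hypothetical.

```typescript
// Hypothetical citation shape. The field names are assumptions.
interface Citation {
  marker: number; // the N in an [N] marker
  page: string;   // kept as a plain string, per the simplified page schema
  quote: string;
}

// Coerce loosely typed model output into a Citation, or reject it.
function normalizeCitation(raw: unknown): Citation | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  let marker = r.marker;
  // Small models sometimes emit "3" instead of 3, so accept digit strings.
  if (typeof marker === "string" && /^\d+$/.test(marker)) {
    marker = Number.parseInt(marker, 10);
  }
  if (typeof marker !== "number" || !Number.isInteger(marker)) return null;
  return {
    marker,
    page: r.page === undefined || r.page === null ? "" : String(r.page),
    quote: typeof r.quote === "string" ? r.quote : "",
  };
}

// Hypothetical helper: parse the legacy <CITATIONS> JSON block from answer text.
function parseCitationsBlock(text: string): Citation[] {
  const match = text.match(/<CITATIONS>([\s\S]*?)<\/CITATIONS>/);
  if (!match) return [];
  try {
    const parsed: unknown = JSON.parse(match[1]);
    if (!Array.isArray(parsed)) return [];
    return parsed
      .map((item) => normalizeCitation(item))
      .filter((c): c is Citation => c !== null);
  } catch {
    return []; // malformed JSON: no citations rather than a crash
  }
}

// Prefer tool-emitted citations when present, else fall back to the block.
function collectTurnCitations(toolCitations: unknown[], answerText: string): Citation[] {
  const fromTools = toolCitations
    .map((item) => normalizeCitation(item))
    .filter((c): c is Citation => c !== null);
  return fromTools.length > 0 ? fromTools : parseCitationsBlock(answerText);
}
```

With this ordering, a model that writes [N] markers but emits neither a tool call nor a <CITATIONS> block produces an empty list, which matches the "no citations rendered" symptom the commit describes.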
