Hallucination-probe scores wired into chat and tabular review

mglynnhenley added per-token hallucination scoring to both chat turns and tabular cells, routing completed assistant output through an external Modal-hosted probe service and streaming the scores back over the existing SSE channel. The whole integration sits behind a single env var, PROBE_API_URL, and degrades gracefully when it is unset.

chat-ui, contract-review

The probe client lives in backend/src/lib/probe/scorer.ts. When PROBE_API_URL is set, it takes the completed assistant message, appends it to the conversation as a prefilled assistant turn, and sends the result to an OpenAI-chat-completions-compatible endpoint with include_scores: true. The request is a single non-streaming call, and the response carries a top-level scores field (read as response.scores): a { probeName: number[] } object with one float per token. An earlier iteration streamed with max_tokens: 1 and read scores off each chunk; that was replaced with the single non-streaming call (which also dropped the -analyze model variant), and the timeout was raised from 60s to 240s. The scorer then fans the arrays out token by token through an onScore callback, so the UI animation still works.
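A minimal sketch of the flow described above, assuming a hypothetical injected postJson transport and a hypothetical /chat/completions path under PROBE_API_URL. Only PROBE_API_URL, include_scores, the top-level scores field, onScore, and the 240s timeout come from the writeup; every other name is illustrative and is not taken from scorer.ts.

```typescript
// Sketch only: names other than PROBE_API_URL, include_scores, scores and
// onScore are hypothetical and do not mirror the real scorer.ts.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type ProbeScores = { [probeName: string]: number[] };
type OnScore = (probeName: string, tokenIndex: number, score: number) => void;
// Injected transport, so the sketch does not assume a particular HTTP client.
type PostJson = (url: string, body: unknown, timeoutMs: number) => Promise<unknown>;

const PROBE_TIMEOUT_MS = 240000; // the 240s timeout mentioned in the writeup

// Append the finished assistant text as a prefilled assistant turn.
function buildProbeRequest(history: ChatMessage[], assistantText: string, model: string) {
  const prefilled: ChatMessage = { role: "assistant", content: assistantText };
  return {
    model,
    messages: history.concat([prefilled]),
    stream: false,
    include_scores: true, // custom field, not part of the standard OpenAI schema
  };
}

// Replay each per-probe array one token at a time through the callback.
function fanOutScores(scores: ProbeScores, onScore: OnScore): void {
  for (const probeName of Object.keys(scores)) {
    const values = scores[probeName] || [];
    values.forEach((score, tokenIndex) => onScore(probeName, tokenIndex, score));
  }
}

// Returns null when PROBE_API_URL is unset, so callers can skip scoring.
function createProbeScorer(env: { PROBE_API_URL?: string }, postJson: PostJson, model: string) {
  const baseUrl = env.PROBE_API_URL;
  if (!baseUrl) return null;
  return async function score(
    history: ChatMessage[],
    assistantText: string,
    onScore: OnScore
  ): Promise<ProbeScores | null> {
    const body = buildProbeRequest(history, assistantText, model);
    // The endpoint path is an assumption, not taken from the source.
    const raw = await postJson(baseUrl + "/chat/completions", body, PROBE_TIMEOUT_MS);
    const scores = (raw as { scores?: ProbeScores } | null)?.scores ?? null;
    if (scores) fanOutScores(scores, onScore);
    return scores;
  };
}
```

Returning null from the factory, rather than a no-op scorer, lets call sites skip the probe path with a single check when the env var is absent.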

On the storage side, one migration (001_probe_scores.sql) adds probe_scores jsonb and probe_status text to tabular_cells and probe_scores jsonb to chat_messages. The base schema (000_one_shot_schema.sql) stays vanilla: an early attempt to put the columns there was reverted after PR review, and a short-lived 002_chat_probe_scores.sql was folded into 001 and deleted.

A content_done SSE event was added to split the typing indicator from the probe animation. The browser drops the spinner as soon as Claude's text finishes; probe tints then fade in token by token as score events arrive. The UI switched from a separate heat-strip and badge (ProbeBadge.tsx, HighlightedSummary.tsx) to inline background tints over the assistant text, with a localStorage-persisted threshold slider (useProbeThreshold.ts, default 0.3) to hide low-confidence tokens. There is also a bundled /mock-probe route behind ENABLE_PROBE_MOCK=true so you can work locally without the Modal service.
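The event split can be pictured as a small client-side reducer. This is a sketch under assumptions: only content_done and the 0.3 default come from the writeup, while the content_delta and probe_score event names, the state shape, and the tinting rule (no tint below the threshold, opacity equal to the score above it) are hypothetical.

```typescript
// Event names other than content_done are hypothetical.
type StreamEvent =
  | { type: "content_delta"; text: string }
  | { type: "content_done" }
  | { type: "probe_score"; probeName: string; tokenIndex: number; score: number };

type MessageState = {
  text: string;
  typing: boolean; // drives the typing indicator
  scores: { [probeName: string]: number[] };
};

const initialState: MessageState = { text: "", typing: true, scores: {} };

function reduce(state: MessageState, event: StreamEvent): MessageState {
  switch (event.type) {
    case "content_delta":
      return { ...state, text: state.text + event.text };
    case "content_done":
      // Text is complete: drop the spinner, but keep accepting score events.
      return { ...state, typing: false };
    case "probe_score": {
      const next = (state.scores[event.probeName] || []).slice();
      next[event.tokenIndex] = event.score;
      return { ...state, scores: { ...state.scores, [event.probeName]: next } };
    }
    default:
      return state;
  }
}

const DEFAULT_THRESHOLD = 0.3; // default reported for useProbeThreshold.ts

// Assumed rule: tokens scoring below the threshold get no tint.
function tintAlpha(score: number | undefined, threshold: number = DEFAULT_THRESHOLD): number {
  if (score === undefined || score < threshold) return 0;
  return Math.min(1, Math.max(0, score));
}
```

Because content_done only flips the typing flag, score events that arrive afterwards update the tints without bringing the spinner back.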

One loose end: the scorer uses `as unknown as` type assertions at two call sites, because include_scores and the top-level scores field are not in the standard OpenAI schema. Anyone porting this needs to reproduce the same custom server contract; otherwise the casts will silently hide a mismatch.
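One way a port could surface a contract mismatch instead of hiding it is a runtime type guard on the response. The sketch below is not what scorer.ts does (it uses casts); the names isScoredCompletion and readScores are hypothetical, and the only assumption taken from the writeup is the { probeName: number[] } shape of the top-level scores field.

```typescript
// Hypothetical helper types; only the { probeName: number[] } shape is from the writeup.
type ProbeScores = { [probeName: string]: number[] };
type ScoredCompletion = { scores: ProbeScores };

function isNumberArray(value: unknown): value is number[] {
  return Array.isArray(value) && value.every((v) => typeof v === "number" && Number.isFinite(v));
}

// True only when the response has a top-level scores object whose values
// are all arrays of finite numbers.
function isScoredCompletion(value: unknown): value is ScoredCompletion {
  if (typeof value !== "object" || value === null) return false;
  const scores = (value as { scores?: unknown }).scores;
  if (typeof scores !== "object" || scores === null || Array.isArray(scores)) return false;
  const byProbe = scores as { [key: string]: unknown };
  return Object.keys(byProbe).every((name) => isNumberArray(byProbe[name]));
}

// Fail loudly when the server does not honour the custom contract.
function readScores(response: unknown): ProbeScores {
  if (!isScoredCompletion(response)) {
    throw new Error("Probe response has no valid top-level scores field");
  }
  return response.scores;
}
```

A server that ignores include_scores would then produce an explicit error at the call site rather than an undefined value flowing into the UI.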

So what

Worth a look if you're already running or evaluating a compatible probe service, or if you want a concrete reference for layering any post-hoc confidence signal on top of an existing SSE chat stream. The `content_done` event split and the env-gated null-returning client are both reusable patterns regardless of whether you want the probe feature itself. Skip if you have no probe endpoint - the core patterns are clean but the feature only runs against a specific custom server contract.


Commits in this thread

5 commits from mglynnhenley/mikehasprobes, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date

`e7549126`	Add hallucination-probe scoring across chat + tabular review	Matilda	2026-05-06
Wire Mike to a Modal-hosted, OpenAI-compatible probe service. After each Claude/Gemini response, send the completion as a prefilled assistant turn to the probe and stream per-token scores onto the existing SSE channel. Persist scores on `chat_messages.probe_scores` and `tabular_cells.probe_scores`. UI fades a heat-strip + risk badge under cells/messages as scores arrive.
Also: local mock probe at /mock-probe for development without the Modal service, and a "Think" toggle on the chat input so users can opt into adaptive thinking per turn (off by default - Sonnet 4.6 was rejecting the unconditional flag).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`ef3d78ce`	Keep 000 base schema clean; probe columns live in 001/002	Matilda	2026-05-06
Per review: the one-shot base schema should stay vanilla. Probe score columns are additive and belong only in 001_probe_scores.sql (tabular_cells) and 002_chat_probe_scores.sql (chat_messages), which already exist as incremental migrations.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`4dcbc056`	Consolidate probe migrations into single 001	Matilda	2026-05-06
Merge 002's chat_messages.probe_scores into 001 alongside the tabular_cells columns. One migration covers the entire probe schema extension; 002 deleted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`d0df46d4`	Merge pull request #1 from mglynnhenley/probe-scorer	Matilda	2026-05-06
Hallucination-probe scoring for chat + tabular review

`9206a2a0`	Add inline probe highlighting and threshold slider	Matilda	2026-05-07
- Render probe scores as inline background tints over the assistant text instead of a separate heat strip
- Add per-user highlight threshold slider (localStorage-persisted) so low-confidence tokens can be hidden from the chat view
- Switch probe scorer to non-streaming chat completions; reads scores from top-level response.scores, drops the -analyze model variant
- Emit content_done SSE event so the typing indicator drops as soon as Claude finishes, while probe scores keep streaming in
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
