mglynnhenley wires a hallucination probe into every AI answer
This fork scores each AI answer token by token for hallucination risk, then shows users where to squint.
After Claude or Gemini drafts a response, mglynnhenley's fork quietly ships the draft to a separate scoring service that rates how likely each word is to be made up. Those scores get stored alongside the answer and streamed back to the screen, so cells in a review grid and replies in the chat light up as the verdicts come in.
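The score-store-stream flow described above can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: `ScoredAnswer`, `fake_scorer`, and `stream_scores` are hypothetical names, the scorer is a local stub standing in for the separate scoring service, and the 0-to-1 score range is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class ScoredAnswer:
    """A drafted answer with one score slot per token (hypothetical shape)."""
    tokens: list
    # None means the scorer has not reported on that token yet.
    scores: list = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.scores:
            self.scores = [None] * len(self.tokens)


def fake_scorer(tokens: list) -> Iterator[tuple]:
    """Stub for the scoring service: pretends tokens with digits are risky."""
    for i, tok in enumerate(tokens):
        yield i, (0.9 if any(ch.isdigit() for ch in tok) else 0.1)


def stream_scores(answer: ScoredAnswer) -> Iterator[dict]:
    """Store each verdict alongside the answer, then emit an update for the UI."""
    for index, score in fake_scorer(answer.tokens):
        answer.scores[index] = score  # persisted next to the answer text
        yield {"index": index, "token": answer.tokens[index], "score": score}


def pending(answer: ScoredAnswer) -> Optional[int]:
    """Return the first token index still waiting on a score, if any."""
    for i, s in enumerate(answer.scores):
        if s is None:
            return i
    return None
```

A consumer would iterate over `stream_scores(answer)` and repaint the matching grid cell or chat reply on each event, which is what lets the display update as verdicts arrive rather than after the whole answer has been scored.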
The interface went through a quick iteration: an early version showed a separate risk badge and heat strip, but the team replaced that with subtle background tints painted directly over the AI's words, plus a slider so users can hide anything below a confidence threshold they choose. There's also a new opt-in 'Think' toggle that lets users ask the model to reason harder on a given turn.
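As a rough sketch of how tints and a threshold slider could interact, the snippet below renders scored tokens as HTML and leaves untinted any token whose score falls below the slider value. Everything here is an assumption made for illustration: the `ScoredToken` type, the `tint_html` function, the 0-to-1 score range, the reading of the slider as a cutoff on the score, and the amber color are not taken from the fork.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ScoredToken:
    text: str
    risk: float  # assumed range: 0.0 (likely grounded) to 1.0 (likely made up)


def tint_html(tokens: list, threshold: float) -> str:
    """Render tokens as HTML, tinting only those at or above the threshold."""
    parts = []
    for tok in tokens:
        safe = escape(tok.text)
        if tok.risk >= threshold:
            # Clamp the score, then scale opacity so riskier words stand out more.
            alpha = round(min(max(tok.risk, 0.0), 1.0) * 0.5, 2)
            parts.append(
                f'<span style="background: rgba(255, 170, 0, {alpha})">{safe}</span>'
            )
        else:
            # Below the slider value: plain text, no tint.
            parts.append(safe)
    return "".join(parts)
```

Calling `tint_html(tokens, 0.5)` would tint only tokens scored 0.5 or higher, and dragging the slider to a higher value simply re-renders with a larger `threshold`, so fewer words stay highlighted.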