Custos turns tabular review cells into an auditable, multi-model bake-off
Every cell in a tabular review now remembers what produced it, and lets you re-run the same question against a different model on the spot.
On Custos's fork, the tabular review grid is no longer a one-shot overwrite. Each cell keeps a history of every value it has held, along with the model and prompts that produced it. Reviewers can re-run a selected batch of rows rather than regenerating the whole table, and open a per-cell history panel to see exactly how an answer evolved.
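To make the append-only idea concrete, here is a minimal Python sketch of a cell that keeps its history and of a batch re-run over selected rows. It is not the fork's code: the names `CellRun`, `Cell`, `rerun_rows` and the `generate` callback are hypothetical, chosen only to illustrate the behaviour described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class CellRun:
    """One generated value plus the provenance needed to audit it (hypothetical shape)."""
    value: str
    model: str
    column_prompt: str
    system_prompt: str = ""
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class Cell:
    """A review cell that appends runs instead of overwriting its value."""
    history: List[CellRun] = field(default_factory=list)

    def record(self, run: CellRun) -> None:
        self.history.append(run)

    @property
    def current(self) -> Optional[CellRun]:
        # The displayed value is simply the most recent run, if any.
        return self.history[-1] if self.history else None


# Assumed layout: table[row_id][column_name] -> Cell
Table = Dict[str, Dict[str, Cell]]
# Assumed callback: (row_id, column_prompt, model) -> generated text
GenerateFn = Callable[[str, str, str], str]


def rerun_rows(
    table: Table,
    row_ids: Iterable[str],
    column: str,
    generate: GenerateFn,
    model: str,
    column_prompt: str,
    system_prompt: str = "",
) -> None:
    """Regenerate one column for the selected rows only; other rows are untouched."""
    for row_id in row_ids:
        value = generate(row_id, column_prompt, model)
        table[row_id][column].record(
            CellRun(value=value, model=model,
                    column_prompt=column_prompt, system_prompt=system_prompt)
        )
```

In this sketch a history panel would just render `cell.history` in order, and nothing is ever deleted, which is what makes the grid auditable.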
The more interesting move is the per-cell model playground. From inside a single cell, a reviewer can fire the same question at Claude, Gemini or Grok and stack the answers side by side in the history, which is useful when one model is hedging and you want a second opinion without rebuilding the whole review.
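A rough sketch of what such a playground might do under the hood: send one question to several models and append each answer to the cell's history so they can be compared. The `run_playground` function, the `ask` callback and the dictionary shape of a history entry are all assumptions made for illustration; the real provider calls are deliberately left out and replaced by a stub.

```python
from typing import Callable, Dict, List, Sequence

# Assumed callback: (model_name, question) -> answer text.
# A real implementation would call each provider's API here.
AskFn = Callable[[str, str], str]
HistoryEntry = Dict[str, str]


def run_playground(
    history: List[HistoryEntry],
    question: str,
    models: Sequence[str],
    ask: AskFn,
) -> List[HistoryEntry]:
    """Ask every model the same question and append the answers to the cell history.

    Returns only the entries added by this call, in the order the models were given.
    """
    new_entries = [
        {"model": model, "question": question, "answer": ask(model, question)}
        for model in models
    ]
    history.extend(new_entries)
    return new_entries


if __name__ == "__main__":
    # Stub in place of real model calls, so the sketch runs offline.
    def fake_ask(model: str, question: str) -> str:
        return f"[{model}] answer to: {question}"

    cell_history: List[HistoryEntry] = []
    for entry in run_playground(
        cell_history, "Is there a change-of-control clause?",
        ["claude", "gemini", "grok"], fake_ask,
    ):
        print(entry["model"], "->", entry["answer"])
```

Because earlier entries stay in the history, a second opinion from another model sits next to the original answer instead of replacing it.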