nwhitehouse puts the AI's inner monologue on a diet
The fork now lets the underlying model think hard, think lightly, or skip thinking entirely - and hides the messy reasoning by default.
Reasoning models like the one nwhitehouse uses can spend thousands of words "thinking out loud" before answering. That's useful for a hard legal question, wasteful when the system is just rewriting a five-word search query in the background. This update adds three dials: a global thinking mode (off, low, or standard), tighter token budgets, and a per-task override so small internal calls - like query rewrites and triage - skip the reasoning step entirely.
On the user side, the long reasoning readout that used to dominate each assistant message is now tucked into a collapsed card. Click to expand if you want to audit how the model got there; otherwise the answer leads.
Spotted something wrong? Or know the PR text has fresher detail than the writeup above?