Olava follow-ups: regen budget, tool gating, main-chat option

Nick Whitehouse · 2026-04-30 · 3a2f397c

- completeOlavaText takes max(caller's limit, OLAVA_MAX_TOKENS) so
  callers tuned for non-reasoning models (e.g. tabular regen passing
  2048) don't undershoot the reasoning budget.
- Strip tools from the request body by default - vLLM rejects
  requests that carry tools with HTTP 400 unless launched with
  --enable-auto-tool-choice. Set OLAVA_ENABLE_TOOLS=true to pass
  tools through when the server is configured for them.
- Olava is now offered in the main chat model dropdown alongside
  Anthropic and Google. ApiKeyMissingModal shows a server-config
  message for Olava (env vars) instead of pointing at account
  settings.
- Per-iteration log now dumps the response text (truncated) to make
  diagnosing short / refusal responses straightforward.
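
The first two items above can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code from this commit: only
OLAVA_MAX_TOKENS and OLAVA_ENABLE_TOOLS come from the message. The helper
names (resolveMaxTokens, toolsEnabled, gateTools), the ChatRequest shape,
the exact "true" string comparison, the removal of tool_choice alongside
tools, and the 8192 value used in the usage note are all hypothetical.

```typescript
// Hypothetical request shape, loosely modelled on an OpenAI-style chat body.
type ChatRequest = {
  model: string;
  messages: { role: string; content: string }[];
  max_tokens: number;
  tools?: unknown[];
  tool_choice?: unknown;
};

// Take the larger of the caller's limit and the configured budget, so a
// caller tuned for a non-reasoning model cannot shrink the reasoning budget.
function resolveMaxTokens(
  callerMaxTokens: number | undefined,
  olavaMaxTokens: number,
): number {
  return Math.max(callerMaxTokens ?? 0, olavaMaxTokens);
}

// Assumption: tools are enabled only when the variable is exactly "true".
function toolsEnabled(env: Record<string, string | undefined>): boolean {
  return env.OLAVA_ENABLE_TOOLS === "true";
}

// Drop tool fields unless explicitly enabled. Removing tool_choice as well
// is an assumption; the commit message only mentions stripping tools.
function gateTools(body: ChatRequest, enableTools: boolean): ChatRequest {
  if (enableTools) return body;
  const { tools, tool_choice, ...rest } = body;
  return rest;
}
```

With an assumed OLAVA_MAX_TOKENS of 8192, a caller passing 2048 would end up
with resolveMaxTokens(2048, 8192) === 8192, while a caller asking for more
than the configured budget keeps its own value.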

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Repository nwhitehouse/mike
Author Nick Whitehouse <nick.whitehouse@mccarthyfinch.com>
Authored
Parents b04c4213
Stats 3 files changed, +37, -6
Part of Olava (vLLM/Qwen) provider integration
