OpenAI-compatible local inference: Ollama and Qwen via env-configured base URL

punyaslokdutta makes Mike's OpenAI adapter configurable so it can point at any OpenAI-compatible endpoint: Ollama, vLLM, or the hosted service. Set `OPENAI_BASE_URL`, a model name, and an endpoint mode flag, and the same adapter talks to whichever chat-completions-compatible runtime you choose. A local runtime with no API key is treated as a valid provider, which unblocks fully offline use.

infrastructure, security

The change is driven by three environment variables: a base URL, a model name, and an endpoint mode. When those are configured and no hosted OpenAI key is present, the local runtime counts as an available provider - the provider-availability check is adjusted to recognize this. That's the piece that makes offline mode actually work rather than just silently failing to load models.
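A minimal sketch of what such an availability check could look like. Only `OPENAI_BASE_URL` is named in the writeup; `OPENAI_API_KEY`, `OPENAI_MODEL`, `OPENAI_ENDPOINT_MODE`, the `chat_completions` mode value, and the function and type names are assumptions made for illustration, not the PR's actual identifiers.

```typescript
// Hypothetical sketch: names other than OPENAI_BASE_URL are assumptions.
type Env = Record<string, string | undefined>;

interface ProviderConfig {
  kind: "keyed" | "keyless"; // keyless = local runtime with no API key
  baseUrl: string;
  model?: string;
  mode?: string;
  apiKey?: string;
}

const HOSTED_BASE_URL = "https://api.openai.com/v1";

// Treat empty or whitespace-only values as unset.
function clean(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Returns a config when the OpenAI-style provider is usable, otherwise null.
function resolveOpenAIProvider(env: Env): ProviderConfig | null {
  const apiKey = clean(env.OPENAI_API_KEY);
  const baseUrl = clean(env.OPENAI_BASE_URL)?.replace(/\/+$/, "");
  const model = clean(env.OPENAI_MODEL);
  const mode = clean(env.OPENAI_ENDPOINT_MODE);

  // A key makes the provider available, with or without a custom base URL.
  if (apiKey) {
    return { kind: "keyed", baseUrl: baseUrl ?? HOSTED_BASE_URL, model, mode, apiKey };
  }
  // No key: the provider still counts as available when all three
  // local-runtime settings are present.
  if (baseUrl && model && mode) {
    return { kind: "keyless", baseUrl, model, mode };
  }
  return null;
}
```

With Ollama's OpenAI-compatible endpoint at its default address, `resolveOpenAIProvider({ OPENAI_BASE_URL: "http://localhost:11434/v1", OPENAI_MODEL: "qwen3:8b", OPENAI_ENDPOINT_MODE: "chat_completions" })` would return a keyless config, while an empty environment returns `null`. The model tag here is an assumed Ollama tag, not one taken from the PR.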

Two Qwen 3 sizes (8B and 14B) are added to the model picker as concrete local options. The streaming path for the OpenAI-compatible route is also changed to deliver output incrementally. Without streaming, Mike would render a blank response while Ollama spent seconds generating; punyaslokdutta explicitly calls this out as a UX fix rather than a background optimization.
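To illustrate incremental delivery, here is a sketch of parsing an OpenAI-style chat-completions stream, where each server-sent event line carries `data: {json}` and the stream ends with `data: [DONE]`. The function name `contentDeltas` is hypothetical and this is not the PR's implementation; it only shows how text can be surfaced as soon as each chunk arrives rather than after generation finishes.

```typescript
// Hypothetical helper: yields content deltas from an OpenAI-style SSE stream.
// `chunks` is any async source of decoded text, split at arbitrary boundaries.
async function* contentDeltas(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += chunk;
    let newline: number;
    // Process only complete lines; a partial line stays buffered
    // until the next chunk completes it.
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line.startsWith("data:")) continue; // skip blank separators and comments
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") return; // end-of-stream sentinel
      const delta = JSON.parse(payload)?.choices?.[0]?.delta?.content;
      if (typeof delta === "string" && delta.length > 0) yield delta;
    }
  }
}
```

A caller would feed this the decoded response body of a request sent with `stream: true` and append each yielded string to the UI as it arrives, so the user sees tokens while the local model is still generating.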

The branch documentation covers a Docker-plus-native-Ollama workflow for running the full stack locally end to end. Validation was done against a backend build and a manual rebuild of the local Docker stack with Supabase keys wired through the env files. The scope is deliberately narrow: local inference enablement only, with benchmark work left for a follow-on.

The PR is open against willchen96/mike and has not landed. The diff snapshot in this fork primarily shows the storage backend swap commit rather than the inference changes, so the local inference code may be on a separate branch from what's visible here.

So what

Worth a look if you want to run Mike against a local Ollama instance or any OpenAI-compatible endpoint without touching the rest of the adapter code. The streaming fix for the OpenAI-compatible path is worth pulling regardless of whether you need local inference. Skip it if you're hosting only with Claude or Gemini and have no interest in self-hosted models.


Commits in this thread

2 commits from punyaslokdutta/HarveyOss, oldest first. Source extracted verbatim from the harvested git log.

| SHA | Subject | Author | Date |
| --- | --- | --- | --- |
| `39cdf1ca` | chore: local setup - swap R2 for Supabase Storage, install frontend deps | Punyaslok Dutta | 2026-05-09 |
| `4e73a45d` | Merge remote-tracking branch 'origin/main' | Punyaslok Dutta | 2026-05-09 |

Commit body for `39cdf1ca`:

- storage.ts: replaced @aws-sdk/client-s3 + R2 with Supabase Storage
  (upload, download, delete, signed URLs all via @supabase/supabase-js)
- .env.example: removed R2 vars, added STORAGE_BUCKET=mike
- frontend/package-lock.json: updated after npm install --legacy-peer-deps

Local setup recap:
- Backend :3001, frontend :3000
- Supabase project: gbdfkvaigunfvrgurkwk (ap-northeast-1 Tokyo)
- Storage bucket: mike (private, Supabase Storage)
- DB schema applied via 000_one_shot_schema.sql
- AI provider: Gemini
- Secrets in .env / .env.local - gitignored, not committed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
