jpbreda adds vLLM as a third LLM provider alongside Claude and Gemini

This fork wires a self-hosted vLLM endpoint into Mike's provider system, letting operators swap the model layer to a local inference server instead of routing all requests through Anthropic or Google. The PR was closed upstream without merging on May 10.

infrastructure, security

The implementation uses the openai npm package (v4.87.3) to talk to any OpenAI-compatible endpoint - vLLM exposes this interface by default. Four env vars control the integration: `VLLM_BASE_URL`, `VLLM_API_KEY`, and two model name vars (`VLLM_MAIN_MODEL`, `VLLM_LIGHT_MODEL`) for the main and lightweight task variants. Adding a new model to the picker is then a config change, not a code change.
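As a rough illustration of how those env vars could map onto an OpenAI-compatible request, here is a minimal sketch. It is not the fork's code: the fork uses the openai SDK, while this version only builds the request shape by hand so it stays dependency-free. The names `loadVllmConfig`, `buildChatRequest`, and `VllmConfig` are hypothetical, the `"EMPTY"` key placeholder is an assumption, and the base URL is assumed to already include the `/v1` prefix.

```typescript
// Sketch only: the four env var names come from the writeup, everything else is hypothetical.
type Env = Record<string, string | undefined>;

interface VllmConfig {
  baseUrl: string;
  apiKey: string;
  mainModel: string;
  lightModel: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Returns null when the endpoint or either model name is missing.
function loadVllmConfig(env: Env): VllmConfig | null {
  const baseUrl = env["VLLM_BASE_URL"]?.trim();
  const mainModel = env["VLLM_MAIN_MODEL"]?.trim();
  const lightModel = env["VLLM_LIGHT_MODEL"]?.trim();
  if (!baseUrl || !mainModel || !lightModel) return null;
  return {
    // Drop trailing slashes so path joining below is predictable.
    baseUrl: baseUrl.replace(/\/+$/, ""),
    // Assumption: a placeholder key is acceptable when the server enforces none.
    apiKey: env["VLLM_API_KEY"]?.trim() || "EMPTY",
    mainModel,
    lightModel,
  };
}

// Builds (but does not send) a chat completion request for the chosen tier.
function buildChatRequest(
  cfg: VllmConfig,
  tier: "main" | "light",
  prompt: string
): ChatRequest {
  return {
    url: `${cfg.baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${cfg.apiKey}`,
      },
      body: JSON.stringify({
        model: tier === "main" ? cfg.mainModel : cfg.lightModel,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In a Node process you would pass `process.env` to `loadVllmConfig` and hand the resulting `url` and `init` to `fetch`. Keeping the model names in config means the request builder never hardcodes a model identifier.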

The frontend model selector gains a "LocalLLM" group with main and lite options. The availability check for this provider returns true unconditionally - the server-side config determines whether it actually works, not a per-user credential. That's a reasonable design for a self-hosted deployment, but it means the LocalLLM options appear in the picker even on instances where VLLM_BASE_URL is unset. Users on those instances will see the options and get backend failures if they select them. You'd want to guard on the env var at startup or return availability false when the base URL is missing.
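One way the suggested guard might look is sketched below. This is an assumption about shape, not the fork's actual availability check: `isLocalLlmAvailable`, `visibleModels`, and the `ModelOption` type are hypothetical names, and only the `VLLM_BASE_URL` variable and the "LocalLLM" group label come from the writeup.

```typescript
// Sketch only: a hypothetical replacement for the always-true availability check.
type Env = Record<string, string | undefined>;

interface ModelOption {
  id: string; // hypothetical identifier
  group: string; // picker group label, e.g. "LocalLLM"
}

// Available only when VLLM_BASE_URL is set to something that looks like an http(s) URL.
function isLocalLlmAvailable(env: Env): boolean {
  const raw = env["VLLM_BASE_URL"]?.trim();
  if (!raw) return false;
  return /^https?:\/\/\S+$/i.test(raw);
}

// Hides the LocalLLM group from the picker on unconfigured instances.
function visibleModels(all: ModelOption[], env: Env): ModelOption[] {
  if (isLocalLlmAvailable(env)) return all;
  return all.filter((m) => m.group !== "LocalLLM");
}
```

This only checks that the variable is present and well-formed, not that the endpoint is reachable, so a misconfigured URL would still surface at request time.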

jpbreda reports testing against a personal vLLM endpoint, with document generation verified end-to-end. The diff is large (+498/-52 across 16 files), but most of that is lockfile entries for the openai SDK's transitive dependencies.

So what: Worth a look if you want to point Mike at a self-hosted model. The OpenAI-compatible adapter pattern is clean and the env-var config is straightforward to follow. Before adopting, fix the always-true availability check - guard on `VLLM_BASE_URL` being set so unconfigured instances don't surface broken options. The PR was closed upstream without merging, so you'd be carrying this as a fork-only patch.


Commits in this thread

3 commits from jpbreda/mike, oldest first. Source extracted verbatim from the harvested git log.

| SHA | Subject | Author | Date |
|---|---|---|---|
| `0c84ef49` | feat: Add LocalLLM (vLLM) provider support | Joseph Breda | 2026-05-02 |
| `cc951105` | feat: Replace unsloth model with placeholder in .env.example | Joseph Breda | 2026-05-04 |
| `fe3fd823` | docs: Add LLM configuration options to README | Joseph Breda | 2026-05-04 |

Commit body of `0c84ef49`:

- Add OpenAI-compatible LLM provider for local vLLM endpoints
- Support for configurable model names via environment variables
- Add LocalLLM Main and LocalLLM Lite as default models
- Update model selector to include LocalLLM options
- Fix generate_docx title fallback for missing parameters
- Add LibreOffice dependency note for document conversion
