vLLM/LocalLLM provider via OpenAI-compatible client

nforum adds support for self-hosted vLLM endpoints alongside cloud providers. The merge that reconciled this work with the OpenAI PR is the most consequential piece: it rewires defaults in a way that deserves a second look before import.

infrastructure, personas

The initial commit (0c84ef4) introduces localllm-main and localllm-lite model ids backed by four env vars: VLLM_BASE_URL, VLLM_API_KEY, VLLM_MAIN_MODEL, and VLLM_LIGHT_MODEL. No per-user key - it's a server-configured endpoint. modelAvailability treats LocalLLM as always-available when VLLM_BASE_URL is set.
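As a rough illustration of the server-side configuration described above, the four env vars could be read and turned into an availability check along these lines. This is a minimal sketch, not the code from the commit: `VllmConfig`, `readVllmConfig`, and `isLocalLlmAvailable` are hypothetical names, and the empty-string defaults for unset values are an assumption.

```typescript
// Hypothetical config shape; the actual module in the repo may differ.
interface VllmConfig {
  baseURL: string;
  apiKey: string;
  mainModel: string;  // model name served for the localllm-main id
  lightModel: string; // model name served for the localllm-lite id
}

type Env = Record<string, string | undefined>;

// Returns null when VLLM_BASE_URL is unset or blank,
// meaning no LocalLLM endpoint is configured on the server.
function readVllmConfig(env: Env): VllmConfig | null {
  const baseURL = env.VLLM_BASE_URL?.trim();
  if (!baseURL) return null;
  return {
    baseURL,
    // Assumption: missing values fall back to empty strings.
    apiKey: env.VLLM_API_KEY ?? "",
    mainModel: env.VLLM_MAIN_MODEL ?? "",
    lightModel: env.VLLM_LIGHT_MODEL ?? "",
  };
}

// Mirrors the described rule: LocalLLM counts as available
// whenever VLLM_BASE_URL is set, with no per-user key involved.
function isLocalLlmAvailable(env: Env): boolean {
  return readVllmConfig(env) !== null;
}
```

In a Node process this would be called as `isLocalLlmAvailable(process.env)`; passing the environment in as a parameter keeps the check easy to test.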

A follow-up commit scrubbed a specific model name (unsloth/gemma-4-E2B-it-GGUF:Q5_K_S) from .env.example, replacing it with a generic placeholder. Someone had left their own model identifier in.

The reconciling merge (86dab80) is what matters. It unifies the LocalLLM and OpenAI PRs into a single openai.ts with a getClient(model) factory: gpt-* ids get the OpenAI cloud client; localllm-* ids get a vLLM client with a different baseURL and VLLM_API_KEY fallback. Tool-call handling upgrades to a streaming-delta accumulator with full tool_calls round-trip. The post-merge defaults flip DEFAULT_MAIN_MODEL to "localllm-main" and resolveTitleModel silently prefers localllm-lite whenever VLLM_BASE_URL is present.
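The routing rule in that factory can be sketched as a pure function that picks client settings from the model id prefix. This is an illustration under stated assumptions, not the contents of openai.ts: `getClientSettings` and `ClientSettings` are hypothetical names, the function returns plain settings rather than constructing an SDK client, and the exact key precedence (per-user key for `gpt-*`, `VLLM_API_KEY` for `localllm-*`) is a guess at what the merge does.

```typescript
// Hypothetical settings object; the real factory presumably
// passes something similar to an OpenAI-compatible client.
interface ClientSettings {
  provider: "openai" | "vllm";
  baseURL?: string; // undefined means the SDK's default cloud endpoint
  apiKey: string;
}

function getClientSettings(
  model: string,
  env: Record<string, string | undefined>,
  userOpenAiKey?: string, // assumption: per-user key loaded from the DB
): ClientSettings {
  if (model.startsWith("localllm-")) {
    const baseURL = env.VLLM_BASE_URL;
    if (!baseURL) {
      throw new Error("VLLM_BASE_URL is not set; cannot serve " + model);
    }
    // Assumption: vLLM requests use the server-level VLLM_API_KEY.
    return { provider: "vllm", baseURL, apiKey: env.VLLM_API_KEY ?? "" };
  }
  if (model.startsWith("gpt-")) {
    if (!userOpenAiKey) {
      throw new Error("No OpenAI API key available for " + model);
    }
    return { provider: "openai", apiKey: userOpenAiKey };
  }
  throw new Error("Unsupported model id: " + model);
}
```

Keeping the prefix check in one place means a new provider only touches this function, which is the main appeal of the shared-factory design.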

That default flip is a policy decision encoded in the code. Anyone deploying this with a VLLM_BASE_URL set will route all title generation through vLLM without any UI indication.

So what

Worth a look if on-prem or private inference is a requirement. The shared `getClient` factory is a clean design. But pull selectively: the default-model flip and the silent vLLM title-model preference should be stripped or made configurable before deploying to real users.
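One way to make the title-model preference explicit rather than implicit is sketched below. Everything here is hypothetical: `resolveTitleModelExplicit`, the `preferLocal` flag, and the `titleModelOverride` setting do not appear in the commits, and the model id in the usage line is a placeholder. The point is only that the presence of VLLM_BASE_URL alone no longer changes routing.

```typescript
// Hypothetical options; none of these names come from the repo.
interface TitleModelOptions {
  vllmBaseUrl?: string;        // value of VLLM_BASE_URL, if any
  preferLocal?: boolean;       // explicit opt-in to local title generation
  titleModelOverride?: string; // operator-chosen model id, wins if set
  fallbackModel: string;       // model id used when nothing else applies
}

function resolveTitleModelExplicit(opts: TitleModelOptions): string {
  // An explicit override always takes precedence.
  if (opts.titleModelOverride) return opts.titleModelOverride;
  // Local routing requires both an endpoint and an opt-in flag.
  if (opts.preferLocal && opts.vllmBaseUrl) return "localllm-lite";
  return opts.fallbackModel;
}
```

For example, `resolveTitleModelExplicit({ vllmBaseUrl: "http://localhost:8000/v1", fallbackModel: "gpt-placeholder" })` returns the fallback, because no opt-in was given.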


Commits in this thread

4 commits from nforum/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`0c84ef49`	feat: Add LocalLLM (vLLM) provider support	Joseph Breda	2026-05-02
`cc951105`	feat: Replace unsloth model with placeholder in .env.example	Joseph Breda	2026-05-04
`fe3fd823`	docs: Add LLM configuration options to README	Joseph Breda	2026-05-04
`86dab800`	merge: resolve PR #20 (vLLM/LocalLLM) conflicts with PR #16 (OpenAI)	Bojan Plese	2026-05-07

Commit body, `0c84ef49`:
- Add OpenAI-compatible LLM provider for local vLLM endpoints
- Support for configurable model names via environment variables
- Add LocalLLM Main and LocalLLM Lite as default models
- Update model selector to include LocalLLM options
- Fix generate_docx title fallback for missing parameters
- Add LibreOffice dependency note for document conversion

Commit body, `86dab800`:
Unified LLM provider architecture:
- openai.ts: dual client factory (OpenAI cloud + vLLM local) via baseURL
- models.ts: all 4 provider groups (LocalLLM, Anthropic, Google, OpenAI)
- userSettings.ts: DB openai key with VLLM_API_KEY env fallback
- ModelToggle.tsx: 4-group type union and GROUP_ORDER
- modelAvailability.ts: LocalLLM always available (server-configured)
- All frontend apiKeys: use profile.openaiApiKey from DB

