nforum wires Mike up to a self-hosted AI backend
The fork can now run on a private inference server instead of (or alongside) the big cloud model vendors.
nforum has added support for vLLM, an open-source engine for hosting large language models on your own hardware. With a couple of environment variables, the fork will route prompts to a private endpoint rather than Anthropic, OpenAI, or Google. Two new model options appear in the picker, and a single provider module quietly handles both cloud and local backends through the same OpenAI-compatible plumbing.
The architecture is clean - one code path, two destinations - and it slots in alongside the existing cloud providers rather than replacing them. There is one wrinkle worth flagging for anyone borrowing the work: once a local server is configured, the fork silently makes it the default for everything, including the small model that names chat threads. That is a policy choice, not a technical one, and any downstream fork should decide it deliberately.
Spotted something wrong? Or know the PR text has fresher detail than the writeup above?