Use OpenAI SDK for Responses API streaming

✅ merged · #1 · mwcyu/Mike-fork ← mwcyu/Mike-fork · opened 8d ago by mwcyu · merged 8d ago by mwcyu · self · +3,383 / -311 across 24 files

From the PR description

Replace the hand-rolled fetch + SSE parsing in the OpenAI adapter with the official openai package. client.responses.create({ stream: true }) returns a typed async-iterable of events, eliminating the manual TextDecoder/buffer and extractSseJson machinery. Behavior, exports, and the tool-call loop are unchanged.
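
As a rough illustration of what consuming a typed async-iterable of events looks like, here is a minimal sketch. It is not the adapter's actual code: `StreamEvent`, `collectText`, and `fakeStream` are hypothetical names, and the event shape is a simplified stand-in for the SDK's event union. Only the `response.output_text.delta` event type and the `client.responses.create({ stream: true })` call are taken from the SDK.

```typescript
// Simplified stand-in for the SDK's stream event union (assumption: the real
// union is much larger; only the text delta event matters for this sketch).
interface StreamEvent {
  type: string;
  delta?: string;
}

// Hypothetical helper: accumulate text deltas from any async-iterable of events.
// No TextDecoder, buffering, or SSE line parsing is needed at this layer.
async function collectText(
  events: AsyncIterable<StreamEvent>,
  onDelta?: (text: string) => void,
): Promise<string> {
  let fullText = "";
  for await (const event of events) {
    if (event.type === "response.output_text.delta" && typeof event.delta === "string") {
      fullText += event.delta;
      onDelta?.(event.delta);
    }
  }
  return fullText;
}

// With the official package, the iterable would come from roughly:
//   const client = new OpenAI();
//   const stream = await client.responses.create({ model, input, stream: true });
//   const text = await collectText(stream);

// Fake stream so the sketch runs without network access.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: "response.created" };
  yield { type: "response.output_text.delta", delta: "Hel" };
  yield { type: "response.output_text.delta", delta: "lo" };
  yield { type: "response.completed" };
}
```

Because the helper only depends on `AsyncIterable`, it can be exercised in tests with a generator like `fakeStream()` instead of a live API call.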

Our analysis

Vitest + supertest route integration suite

Native provider web-search in the main chat

Commits in this PR (4)

SHA · Subject · Author · Date
e688ccbc · Use OpenAI SDK for Responses API streaming · Claude · 2026-05-31
Replace the hand-rolled fetch + SSE parsing in the OpenAI adapter with the
official openai package. client.responses.create({ stream: true }) returns a
typed async-iterable of events, eliminating the manual TextDecoder/buffer and
extractSseJson machinery. Behavior, exports, and the tool-call loop are
unchanged.
9dfb6fd5 · Extract shared streaming loop into a provider-agnostic driver · Claude · 2026-05-31
The Claude, OpenAI, and Gemini adapters each re-implemented the same
agentic streaming loop (iterate to maxIterations, stream a turn,
accumulate fullText, run tools, feed results back). That triplication
let the loop logic drift between providers.

Introduce backend/src/lib/llm/driver.ts owning the loop, break
conditions, the runTools call, and the single fullText accumulation.
Each provider becomes a thin session factory (create{Claude,OpenAI,
Gemini}Session) that owns its SDK call, event parsing, follow-up
message state, and all callback firing. Public stream* signatures are
unchanged, so callers stay untouched.

Behavior is preserved: per-provider callback ordering, the OpenAI
pre-tool preamble drop, Claude's stop_reason hard-stop, Gemini's
verbatim thoughtSignature replay, and OpenAI instructions-on-iter-0.
Chat history persistence (route-level, fed by callbacks + fullText) is
unaffected. Typecheck passes.
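
The split between a shared driver and thin per-provider sessions can be sketched as follows. This is an assumption-laden illustration, not the contents of backend/src/lib/llm/driver.ts: `ProviderSession`, `TurnResult`, `runStreamLoop`, and `createFakeSession` are hypothetical names, and the real sessions also fire callbacks and keep provider-specific message state that is omitted here.

```typescript
interface ToolCall { name: string; args: unknown }
interface ToolResult { name: string; output: string }

// What one streamed turn reports back to the driver (hypothetical shape).
interface TurnResult {
  text: string;          // text the session wants added to fullText
  toolCalls: ToolCall[]; // function calls requested by the model
  stop: boolean;         // provider-specific hard stop (e.g. a stop reason)
}

// Each provider would implement this; the driver never sees SDK types.
interface ProviderSession {
  streamTurn(iteration: number): Promise<TurnResult>;
  addToolResults(results: ToolResult[]): void;
}

// The driver owns the loop, the break conditions, the runTools call,
// and the single fullText accumulation.
async function runStreamLoop(
  session: ProviderSession,
  runTools: (calls: ToolCall[]) => Promise<ToolResult[]>,
  maxIterations: number,
): Promise<string> {
  let fullText = "";
  for (let i = 0; i < maxIterations; i++) {
    const turn = await session.streamTurn(i);
    fullText += turn.text;
    if (turn.stop || turn.toolCalls.length === 0) break;
    session.addToolResults(await runTools(turn.toolCalls));
  }
  return fullText;
}

// Fake session for illustration: asks for one tool on turn 0, then answers.
function createFakeSession(): ProviderSession & { received: ToolResult[] } {
  const received: ToolResult[] = [];
  return {
    received,
    async streamTurn(iteration) {
      return iteration === 0
        ? { text: "Checking. ", toolCalls: [{ name: "lookup", args: {} }], stop: false }
        : { text: "Done.", toolCalls: [], stop: false };
    },
    addToolResults(results) {
      received.push(...results);
    },
  };
}
```

Keeping the accumulation in one place is what prevents the drift described above: a provider can decide what text a turn contributes, but only the driver appends it.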
0ad1b344 · Add native web-search tool to all three LLM adapters · Claude · 2026-05-31
Enable each provider's built-in, server-executed web search so the main
chat can browse the internet:
- Claude: native web_search_20250305 server tool
- OpenAI Responses: web_search built-in tool
- Gemini: googleSearch grounding tool

These run server-side (the provider performs the search and folds
results into its answer), so they bypass the runTools function-call
loop. A new enableWebSearch flag on StreamChatParams gates them; only
the interactive main chat (runLLMStream) opts in, leaving tabular
review and bulk extraction untouched.
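
A minimal sketch of how such a flag might gate the per-provider tool entry is shown below. The tool identifiers come from the list above; `Provider`, `WEB_SEARCH_TOOL`, and `webSearchTools` are hypothetical names, and the exact objects the adapters pass to each SDK may carry more fields than shown.

```typescript
type Provider = "claude" | "openai" | "gemini";

// One server-executed search tool entry per provider (shapes are a sketch).
const WEB_SEARCH_TOOL: Record<Provider, Record<string, unknown>> = {
  claude: { type: "web_search_20250305", name: "web_search" },
  openai: { type: "web_search" },
  gemini: { googleSearch: {} },
};

// Returns the extra tool entries to append to a request. When the flag is
// off, nothing is added, so callers that never opt in are unaffected.
function webSearchTools(
  provider: Provider,
  enableWebSearch: boolean,
): Array<Record<string, unknown>> {
  return enableWebSearch ? [WEB_SEARCH_TOOL[provider]] : [];
}
```

Since these tools execute on the provider's side, the entries are only appended to the request; no matching handler is needed in the function-call loop.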

Surface searches to the UI via a new onWebSearch callback that each
adapter fires when a search starts - Claude on the server_tool_use
contentBlock, OpenAI on the web_search_call output item, Gemini on
groundingMetadata.webSearchQueries. runLLMStream streams a web_search
event and persists it to the assistant message (unlike the transient
tool_call_start), so reloaded chats still show "Searched the web for
...". Added the web_search variant to the backend and frontend
AssistantEvent unions, a stream handler, and a WebSearchBlock renderer.

Backend and frontend typecheck clean.
a101800d · test(backend): add Vitest route integration tests for all 8 routers · Claude · 2026-06-01
Introduce a supertest-based integration suite covering user, chat,
projectChat, projects, documents, tabular, workflows and downloads
routes (77 tests). Supabase, auth, storage and the LLM/chat tooling are
mocked; streaming SSE endpoints and the signed-download token round-trip
are exercised end to end.

Refactor src/index.ts to export createApp() and only bind a port when run
directly, and skip rate limiters under NODE_ENV=test so the suite is
deterministic. Add vitest + supertest dev deps and test scripts.
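
The factory pattern described here can be illustrated with a framework-free sketch. It is a stand-in rather than the real src/index.ts, which is not shown on this page: `App`, `AppResponse`, the `maxRequests` parameter, and the in-memory counter are all hypothetical, and the environment is passed in explicitly instead of being read from the process.

```typescript
interface AppResponse { status: number }
interface App { handle(clientId: string): AppResponse }

// Building the app has no side effects: no port is bound here, so a test
// harness can construct as many instances as it needs. Binding a port would
// be left to the entrypoint when the file is run directly.
function createApp(env: { NODE_ENV?: string }, maxRequests = 2): App {
  const rateLimitEnabled = env.NODE_ENV !== "test";
  const counts = new Map<string, number>();

  return {
    handle(clientId) {
      if (rateLimitEnabled) {
        const seen = (counts.get(clientId) ?? 0) + 1;
        counts.set(clientId, seen);
        // Toy limiter: reject once a client exceeds maxRequests.
        if (seen > maxRequests) return { status: 429 };
      }
      return { status: 200 };
    },
  };
}
```

With the limiter skipped under `NODE_ENV=test`, a suite that hits the same route many times gets the same status every time, regardless of how many requests earlier tests made.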
