nwhitehouse rebuilds legal research as a five-stage pipeline

Instead of one lookup and an answer, nwhitehouse's fork now runs a structured research loop across legal and web sources, automatically.

search workflow

When a user turns on legal or web sources, the fork stops doing a single search and instead kicks off a five-pass process behind the scenes. It rewrites the question into three to six sharper sub-queries, searches several public legal databases in parallel (court opinions, federal regulations, the Federal Register and the like) alongside general web results, ranks what comes back, then writes a short tailored summary of each top result before composing a final answer. The interesting bet is that fourth pass: the model digests each source into a clean two-or-three-sentence extract before synthesising, trading more model calls for higher-quality inputs and, in theory, fewer errors. To keep that from running away, nwhitehouse added hard limits on a single research run: at most 25 model calls and 45 seconds of wall-clock time. If a run hits either ceiling, it returns what it has rather than failing. Users can also expand a panel to watch the model's reasoning.
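The "return what it has" behaviour can be illustrated with a small budget guard. This is a minimal sketch, not the fork's budget.ts: the class name `ResearchBudget`, its methods, and the `extractWithinBudget` helper are all hypothetical, and the clock is injectable only so the example can be exercised without waiting.

```typescript
// Hypothetical budget guard; names and shape are illustrative, not the fork's budget.ts.
class ResearchBudget {
  private calls = 0;
  private readonly startedAt: number;
  private readonly maxCalls: number;
  private readonly maxWallMs: number;
  private readonly now: () => number;

  constructor(maxCalls: number, maxWallMs: number, now: () => number = () => Date.now()) {
    this.maxCalls = maxCalls;
    this.maxWallMs = maxWallMs;
    this.now = now;
    this.startedAt = now();
  }

  /** True while both the call count and the elapsed wall-clock time are within their caps. */
  hasRoom(): boolean {
    const elapsedMs = this.now() - this.startedAt;
    return this.calls < this.maxCalls && elapsedMs <= this.maxWallMs;
  }

  /** Reserve one model call; returns false instead of throwing once the budget is spent. */
  tryConsume(): boolean {
    if (!this.hasRoom()) return false;
    this.calls += 1;
    return true;
  }

  /** Number of calls reserved so far. */
  callsUsed(): number {
    return this.calls;
  }
}

// Summarise sources one by one and stop early (keeping partial results) when the budget runs out.
function extractWithinBudget(
  sources: string[],
  budget: ResearchBudget,
  summarise: (source: string) => string,
): string[] {
  const extracts: string[] = [];
  for (const source of sources) {
    if (!budget.tryConsume()) break; // ceiling hit: return what we have
    extracts.push(summarise(source));
  }
  return extracts;
}
```

With the limits quoted in the commit, such a guard would be constructed as `new ResearchBudget(25, 45000)`. Returning a boolean rather than throwing is one way to let the caller degrade gracefully; the fork may signal exhaustion differently.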

So what: Worth a look if deep, multi-source legal research is your use case; less so if you want fast, simple single-shot answers.

Commits in this thread

1 commit from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`a4adcbf3`	[feat-005] Multi-pass research orchestrator + UI integration	Nick Whitehouse	2026-05-04

commit body

Trigger: chat.ts auto-routes to the orchestrator when the user has any
research source selected (sources.legal non-empty OR sources.web=true).
Five-pass pipeline with budget enforcement (≤25 Olava calls, ≤45s wall),
designed fresh for Olava-001 (Qwen3.6 + LoRA) per the SLM-cost-advantage
strategy. work___'s services/{orchestrator,sub_agent,loop_controller}.py
served as architectural reference, not a port.
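
The trigger condition described above (sources.legal non-empty OR sources.web=true) can be sketched as a small predicate. The `SourceSelection` interface and the `isResearchMode` name are assumptions made for illustration; the commit does not show the real types in chat.ts, and `legal` is assumed here to be an array of source identifiers.

```typescript
// Hypothetical shape of the user's source selection; field names follow the commit text.
interface SourceSelection {
  legal?: string[]; // assumed: list of selected legal source identifiers
  web?: boolean;
}

// Research mode is on when any legal source is selected or web search is enabled.
function isResearchMode(sources: SourceSelection | undefined): boolean {
  const hasLegal = (sources?.legal?.length ?? 0) > 0;
  const hasWeb = sources?.web === true;
  return hasLegal || hasWeb;
}
```

A route handler could then branch on `isResearchMode(body.sources)` to choose between the single-pass path and the orchestrator.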

Backend (new)
- backend/src/lib/research/types.ts - shared event/result types
- backend/src/lib/research/budget.ts - call counter + wall-clock cap
- backend/src/lib/research/queryExpander.ts - pass 1: 1 Olava call → 3-6 specialised queries
- backend/src/lib/research/searchFanOut.ts - pass 2: parallel legal/web searches, dedupe by URL
- backend/src/lib/research/triage.ts - pass 3: 1 Olava call → top-N most relevant
- backend/src/lib/research/extractor.ts - pass 4: N parallel Olava calls → tailored extracts
- backend/src/lib/research/synthesizer.ts - pass 5: streaming Olava call → markdown answer
- backend/src/lib/research/orchestrator.ts - pipeline coordinator + SSE event emission
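
Pass 2 in the list above (parallel searches, dedupe by URL) might look roughly like the following. This is a sketch under stated assumptions: `SearchHit`, `SearchFn`, `dedupeByUrl` and `fanOut` are hypothetical names, and treating a failed backend as "no hits" is a design choice of this example rather than something the commit confirms.

```typescript
// Hypothetical result and backend types; the fork's types.ts is not shown in the commit.
interface SearchHit {
  url: string;
  title: string;
}
type SearchFn = (query: string) => Promise<SearchHit[]>;

// Keep the first hit seen for each URL, preserving arrival order.
function dedupeByUrl(hits: SearchHit[]): SearchHit[] {
  const seen = new Set<string>();
  const unique: SearchHit[] = [];
  for (const hit of hits) {
    if (seen.has(hit.url)) continue;
    seen.add(hit.url);
    unique.push(hit);
  }
  return unique;
}

// Run every query against every backend concurrently, then merge and de-duplicate.
async function fanOut(queries: string[], backends: SearchFn[]): Promise<SearchHit[]> {
  const tasks: Promise<SearchHit[]>[] = [];
  for (const query of queries) {
    for (const search of backends) {
      // Assumption: a failing backend contributes no hits instead of rejecting the whole pass.
      tasks.push(search(query).catch((): SearchHit[] => []));
    }
  }
  const batches = await Promise.all(tasks);
  const merged: SearchHit[] = [];
  for (const batch of batches) merged.push(...batch);
  return dedupeByUrl(merged);
}
```

De-duplicating after the merge matters because the expanded queries from pass 1 are likely to surface overlapping results across backends.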

Backend (modified)
- routes/chat.ts: auto-detect research mode and route to runResearchOrchestrator
- lib/llm/olava.ts: forward delta.reasoning(_content) via onReasoningDelta
so the UI shows a live "Thinking..." indicator during long Olava think-times
(rather than dead air). Persisted as part of chat_messages - same scope as
the response itself.
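
The reasoning forwarding described above can be sketched as a per-chunk dispatcher. Only `onReasoningDelta` and the `reasoning` / `reasoning_content` field names come from the commit; the `StreamDelta` and `StreamHandlers` interfaces, the `onContentDelta` callback, and the `forwardDelta` function are hypothetical.

```typescript
// Hypothetical shape of one streamed delta; only the reasoning field names come from the commit.
interface StreamDelta {
  content?: string | null;
  reasoning?: string | null;
  reasoning_content?: string | null;
}

interface StreamHandlers {
  onContentDelta: (text: string) => void; // hypothetical name
  onReasoningDelta?: (text: string) => void; // name taken from the commit message
}

// Route reasoning text and answer text to separate callbacks.
function forwardDelta(delta: StreamDelta, handlers: StreamHandlers): void {
  // Accept either field name, since the commit mentions both spellings.
  const reasoning = delta.reasoning || delta.reasoning_content;
  if (reasoning) handlers.onReasoningDelta?.(reasoning);
  if (delta.content) handlers.onContentDelta(delta.content);
}
```

Keeping the reasoning callback optional means callers that do not render a "Thinking..." indicator can ignore it.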

Frontend (new)
- src/components/chat/onit-status-icon.tsx - Drift Grid 3×3 ripple loader
(ported from work___ UI System UPGRADE/loaders.jsx) cross-fades to the
Onit O logo (#00112c) when streaming stops. Replaces MikeIcon in the
assistant ResponseStatus.
- globals.css: gridPulse keyframe.

Frontend (modified)
- AssistantMessage.tsx
  - ResearchStepBlock: dot + label + meta detail (e.g. "top 5 selected"),
    rendered inline inside the existing PreResponseWrapper alongside other
    tool chatter. Wrapper auto-collapses to "Completed in N steps" once
    synthesis content arrives.
  - ReferenceBlock: numbered (1. 2. 3.) and indented under "Ranked results",
    with a continuous parent-x vertical line drawing through them so they
    visually nest under the parent step. No per-reference dots or in-between
    connectors. Click opens URL in new tab.
  - bug-003 obsoleted: reference_added events now flow through the wrapper
    again (single-pass mode no longer fires search tools - that's all
    research-mode now where synthesis is reliably non-empty).
  - Empty/whitespace-only content events no longer split wrappers (Olava
    sometimes emits "\n\n" between reasoning blocks; without this guard the
    UI breaks into two separate "Completed in N steps" cards).
- shared/types.ts: AssistantEvent extended with research_step + sources.web.
- hooks/useAssistantChat.ts: research_step SSE handler dedupes by key in
the events array (so reload doesn't show running+done duplicates of the
same step). Transient research.* events are kept out of persistence;
research_step drives the UI.
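
Two of the frontend behaviours listed above, de-duplicating research_step events by key and ignoring whitespace-only content, can be sketched as a single pure update function. The event shapes and the `applyEvent` name are assumptions for illustration; the real `AssistantEvent` union in shared/types.ts is not shown in the commit, and dropping whitespace-only content outright is only one way to keep it from splitting wrappers.

```typescript
// Hypothetical, simplified event shapes; the real AssistantEvent union is not shown in the commit.
interface ResearchStepEvent {
  type: "research_step";
  key: string;
  status: "running" | "done";
  label: string;
}
interface ContentEvent {
  type: "content";
  text: string;
}
type AssistantEvent = ResearchStepEvent | ContentEvent;

// Return the next events array after applying one incoming event.
function applyEvent(events: AssistantEvent[], incoming: AssistantEvent): AssistantEvent[] {
  if (incoming.type === "content") {
    // Assumption: whitespace-only content (e.g. "\n\n") is dropped so it cannot split a wrapper.
    if (incoming.text.trim() === "") return events;
    return [...events, incoming];
  }
  // A research_step with a known key replaces the earlier entry (running -> done).
  const key = incoming.key;
  const index = events.findIndex((e) => e.type === "research_step" && e.key === key);
  if (index === -1) return [...events, incoming];
  const next = events.slice();
  next[index] = incoming;
  return next;
}
```

Because the step is replaced in place rather than appended, a reload that replays both the "running" and "done" events for the same key ends up with one entry.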

Process
- backlog.md: full feat-005 design + feat-006 (citation reliability via
add_citation tool - defer per testing notes 2026-05-04) appended.

Smoke-tested end-to-end against real Olava + Brave + CourtListener with
"What is the latest court case where a lawyer had misused AI in court?":
~30s wall, 16 events, 2KB synthesis with inline [Title](URL) citations to
the April 2026 Sullivan & Cromwell case (vs single-pass Olava emitting
just "\n\n" with the same prompt).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
