nwhitehouse rebuilds legal research as a five-stage pipeline

Instead of one lookup and an answer, nwhitehouse's fork now runs a structured research loop across legal and web sources, automatically.

When a user turns on legal or web sources, the fork stops doing a single search and instead runs a multi-step pipeline behind the scenes. It rewrites the question into three to six sharper sub-queries, searches several public legal databases in parallel (court opinions, federal regulations, the Federal Register and the like) alongside general web results, ranks what comes back, then writes a short tailored summary of each top result before composing a final answer. The interesting bet is that extra extraction step: the model digests each source into a clean two-or-three-sentence extract before synthesising, trading more model calls for higher-quality inputs and, in theory, fewer errors. To keep that from running away, nwhitehouse added hard limits on both calls and time (per the commit, at most 25 model calls and 45 seconds of wall-clock time per research run); if a run hits the ceiling, it returns what it has rather than failing. Users can also expand a panel to watch the model's reasoning.
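The call-and-time ceiling described above can be illustrated with a small TypeScript sketch. This is not the code in the fork's budget.ts, which the writeup does not show; the class name `ResearchBudget`, the helper `extractWithinBudget`, and the injectable `now` clock are all hypothetical, chosen only to show the "stop and return what you have" behaviour.

```typescript
// Hypothetical budget tracker. Names and shape are illustrative only;
// they are NOT taken from the fork's budget.ts.
class ResearchBudget {
  private calls = 0;
  private readonly startedAt: number;
  private readonly maxCalls: number;
  private readonly maxWallMs: number;
  private readonly now: () => number;

  constructor(maxCalls: number, maxWallMs: number, now: () => number = () => Date.now()) {
    this.maxCalls = maxCalls;
    this.maxWallMs = maxWallMs;
    this.now = now; // injectable clock so the cap can be tested without sleeping
    this.startedAt = now();
  }

  /** True once either the call cap or the wall-clock cap has been reached. */
  exhausted(): boolean {
    return this.calls >= this.maxCalls || this.now() - this.startedAt >= this.maxWallMs;
  }

  /** Reserve one model call. Returns false (rather than throwing) when the budget is spent. */
  trySpend(): boolean {
    if (this.exhausted()) return false;
    this.calls += 1;
    return true;
  }

  /** Number of calls reserved so far. */
  get used(): number {
    return this.calls;
  }
}

/**
 * Run `extract` over items until the budget runs out, then return the partial
 * results instead of failing. Sequential for simplicity; the commit describes
 * the real extraction pass as parallel.
 */
async function extractWithinBudget<T, R>(
  items: T[],
  budget: ResearchBudget,
  extract: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    if (!budget.trySpend()) break; // ceiling hit: keep what we already have
    results.push(await extract(item));
  }
  return results;
}
```

With the figures quoted in the commit, a run would be created as `new ResearchBudget(25, 45000)`. Returning `false` from `trySpend` instead of throwing is one way to get graceful degradation: callers decide how to finish with partial data.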

So what? Worth a look if deep, multi-source legal research is your use case; less so if you want fast, simple single-shot answers.

Commits in this thread

1 commit from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA: a4adcbf3
Subject: [feat-005] Multi-pass research orchestrator + UI integration
Author: Nick Whitehouse
Date: 2026-05-04
commit body
Trigger: chat.ts auto-routes to the orchestrator when the user has any
research source selected (sources.legal non-empty OR sources.web=true).
Five-pass pipeline with budget enforcement (≤25 Olava calls, ≤45s wall),
designed fresh for Olava-001 (Qwen3.6 + LoRA) per the SLM-cost-advantage
strategy. work___'s services/{orchestrator,sub_agent,loop_controller}.py
served as architectural reference, not a port.

Backend (new)
- backend/src/lib/research/types.ts        - shared event/result types
- backend/src/lib/research/budget.ts       - call counter + wall-clock cap
- backend/src/lib/research/queryExpander.ts - pass 1: 1 Olava call → 3-6 specialised queries
- backend/src/lib/research/searchFanOut.ts - pass 2: parallel legal/web searches, dedupe by URL
- backend/src/lib/research/triage.ts        - pass 3: 1 Olava call → top-N most relevant
- backend/src/lib/research/extractor.ts     - pass 4: N parallel Olava calls → tailored extracts
- backend/src/lib/research/synthesizer.ts   - pass 5: streaming Olava call → markdown answer
- backend/src/lib/research/orchestrator.ts  - pipeline coordinator + SSE event emission
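
Editorial illustration, not part of the commit: pass 2 in the list above ("parallel legal/web searches, dedupe by URL") could be sketched in TypeScript roughly as follows. The `SearchHit` and `SearchFn` types and both function bodies are hypothetical, since the contents of searchFanOut.ts are not shown here. Two assumptions are made for the sketch: a failing provider simply contributes no hits, and URLs are compared as exact strings with no normalisation.

```typescript
// Hypothetical types; the real shapes in types.ts are not shown in the commit.
interface SearchHit {
  url: string;
  title: string;
  source: string; // free-form label for the provider that returned the hit
}
type SearchFn = (query: string) => Promise<SearchHit[]>;

/** Keep the first hit seen for each URL, preserving the original order. */
function dedupeByUrl(hits: SearchHit[]): SearchHit[] {
  const seen = new Set<string>();
  const unique: SearchHit[] = [];
  for (const hit of hits) {
    if (seen.has(hit.url)) continue; // exact string match only (assumption)
    seen.add(hit.url);
    unique.push(hit);
  }
  return unique;
}

/** Run every (query, provider) pair concurrently, then merge and dedupe. */
async function searchFanOut(queries: string[], providers: SearchFn[]): Promise<SearchHit[]> {
  const tasks: Promise<SearchHit[]>[] = [];
  for (const query of queries) {
    for (const search of providers) {
      // Assumption: one failing provider should not sink the whole run.
      tasks.push(search(query).catch(() => []));
    }
  }
  const batches = await Promise.all(tasks);
  const merged: SearchHit[] = [];
  for (const batch of batches) merged.push(...batch);
  return dedupeByUrl(merged);
}
```

Swallowing provider errors fits the "return what it has" behaviour of the budget cap, but a real implementation would probably also log or surface the failure.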

Backend (modified)
- routes/chat.ts: auto-detect research mode and route to runResearchOrchestrator
- lib/llm/olava.ts: forward delta.reasoning(_content) via onReasoningDelta
  so the UI shows a live "Thinking..." indicator during long Olava think-times
  (rather than dead air). Persisted as part of chat_messages - same scope as
  the response itself.

Frontend (new)
- src/components/chat/onit-status-icon.tsx - Drift Grid 3×3 ripple loader
  (ported from work___ UI System UPGRADE/loaders.jsx) cross-fades to the
  Onit O logo (#00112c) when streaming stops. Replaces MikeIcon in the
  assistant ResponseStatus.
- globals.css: gridPulse keyframe.

Frontend (modified)
- AssistantMessage.tsx
  - ResearchStepBlock: dot + label + meta detail (e.g. "top 5 selected"),
    rendered inline inside the existing PreResponseWrapper alongside other
    tool chatter. Wrapper auto-collapses to "Completed in N steps" once
    synthesis content arrives.
  - ReferenceBlock: numbered (1. 2. 3.) and indented under "Ranked results",
    with a continuous parent-x vertical line drawing through them so they
    visually nest under the parent step. No per-reference dots or in-between
    connectors. Click opens URL in new tab.
  - bug-003 obsoleted: reference_added events now flow through the wrapper
    again (single-pass mode no longer fires search tools - that's all
    research-mode now where synthesis is reliably non-empty).
  - Empty/whitespace-only content events no longer split wrappers (Olava
    sometimes emits "\n\n" between reasoning blocks; without this guard the
    UI breaks into two separate "Completed in N steps" cards).
- shared/types.ts: AssistantEvent extended with research_step + sources.web.
- hooks/useAssistantChat.ts: research_step SSE handler dedupes by key in
  the events array (so reload doesn't show running+done duplicates of the
  same step). Transient research.* events are kept out of persistence;
  research_step drives the UI.
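
Editorial illustration, not part of the commit: the key-based dedupe described for hooks/useAssistantChat.ts can be sketched in TypeScript as an "upsert" into the events array. The `ResearchStepEvent` shape and the `upsertResearchStep` helper are hypothetical; the real `AssistantEvent` type in shared/types.ts is not shown here.

```typescript
// Hypothetical event shape, for illustration only.
interface ResearchStepEvent {
  type: "research_step";
  key: string; // stable identifier for the step (assumed field name)
  status: "running" | "done";
  label: string;
}

/**
 * Replace an existing step that has the same key, keeping its position;
 * otherwise append. Returns a new array so it is safe to use as React state.
 */
function upsertResearchStep(
  events: ResearchStepEvent[],
  incoming: ResearchStepEvent,
): ResearchStepEvent[] {
  const index = events.findIndex((e) => e.key === incoming.key);
  if (index === -1) return [...events, incoming];
  const next = events.slice();
  next[index] = incoming;
  return next;
}
```

Because the "done" event overwrites the "running" event for the same key, replaying a stored event list yields one entry per step, which matches the commit's stated goal of avoiding running+done duplicates after a reload.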

Process
- backlog.md: full feat-005 design + feat-006 (citation reliability via
  add_citation tool - defer per testing notes 2026-05-04) appended.

Smoke-tested end-to-end against real Olava + Brave + CourtListener with
"What is the latest court case where a lawyer had misused AI in court?":
~30s wall, 16 events, 2KB synthesis with inline [Title](URL) citations to
the April 2026 Sullivan & Cromwell case (vs single-pass Olava emitting
just "\n\n" with the same prompt).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
