Minnesota med-mal records platform: five phases from research validation to chronology view

In a single day (May 11, 2026), rmerk/mike shipped a vertical slice that turns Mike into a Minnesota-jurisdiction-aware med-mal records pipeline. The work spans research validation, project templates, a ~6,900-LOC extraction pipeline, and a citation-anchored chronology view, all grounded in specific MN statutes and Supreme Court authority.


Phase 0 (research validation) produced twenty substantive deltas across two plan documents before a line of code was written. All five of the original red-flag rules targeted breach only; there was no causation rule, no § 145.64 peer-review hard-refuse policy, and no Plutshack supports_element tagging. The subfolder taxonomy needed four more folders. The extraction plan also inherited two factual errors about MN law. It assumed that § 145.682(4)'s 180-day expert-affidavit clock runs from summons service, when it runs from Rule 26.04(a) discovery commencement, and it treated the five-element informed-consent test as the prima facie negligence test, when the MN test is the three-element Plutshack/Smith v. Knowles formulation. Phase 0 caught all of that before Phase 2's multi-day build.

Phase 1 adds one nullable column (projects.template_id), a code-only template registry, and a med-mal-case template that scaffolds twelve folders plus a nested imaging folder on project create. The POST /projects handler validates the optional template_id against the registry (400 on unknown) and inserts subfolders in two passes: the top-level UUIDs returned by the first batch resolve the array-index parent references in the second. On any error the project row is deleted and its folders are removed via CASCADE, so the operation is atomic from the user's perspective. The frontend gets a template picker and an eleven-schema recommended-reviews strip covering chronology, bills-and-EOBs, MAR, transfusion log, vitals, labs, imaging index, red-flag scan, provider-defendant map, causation-chain, and expert-opinions (§ 145.682). Migration 0001 was applied directly to prod because Supabase's free plan returns PaymentRequiredException on branch creation; that constraint is documented in CLAUDE.md and applies to any fork on a free-tier org.
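The two-pass parent resolution can be sketched as a pair of pure planning functions. This is an illustration only, not the code in backend/src/routes/projects.ts: the TemplateFolder and NewFolderRow shapes, the function names, and the nested folder name are all hypothetical, and the sketch assumes the batch insert returns ids in the same order as the rows it was given.

```typescript
// Hypothetical shapes; the real registry lives in builtinProjectTemplates.ts.
interface TemplateFolder {
  name: string;
  parentIndex?: number; // index of the parent in the template array; absent for top-level
}

interface NewFolderRow {
  name: string;
  parentId: string | null;
}

// Pass A: build the rows for top-level folders and remember which
// template index each row came from.
function planTopLevel(template: TemplateFolder[]): { rows: NewFolderRow[]; indexes: number[] } {
  const rows: NewFolderRow[] = [];
  const indexes: number[] = [];
  template.forEach((folder, index) => {
    if (folder.parentIndex === undefined) {
      rows.push({ name: folder.name, parentId: null });
      indexes.push(index);
    }
  });
  return { rows, indexes };
}

// Pass B: once the database has returned ids for the top-level rows
// (assumed to be in insertion order), swap array-index refs for real ids.
function planNested(
  template: TemplateFolder[],
  topLevelIndexes: number[],
  topLevelIds: string[],
): NewFolderRow[] {
  const idByIndex = new Map<number, string>();
  topLevelIndexes.forEach((templateIndex, i) => idByIndex.set(templateIndex, topLevelIds[i]));

  const rows: NewFolderRow[] = [];
  for (const folder of template) {
    if (folder.parentIndex === undefined) continue;
    const parentId = idByIndex.get(folder.parentIndex);
    if (parentId === undefined) {
      // Only one level of nesting is supported in this sketch.
      throw new Error(`Folder "${folder.name}" references a parent that is not top-level`);
    }
    rows.push({ name: folder.name, parentId });
  }
  return rows;
}

// Example: "imaging" is index 1, so the nested folder points at it by index.
const exampleTemplate: TemplateFolder[] = [
  { name: "bills-and-eobs" },
  { name: "imaging" },
  { name: "imaging-studies", parentIndex: 1 }, // hypothetical nested folder name
];
```

A caller would insert the planTopLevel rows, pass the returned ids to planNested, insert that second batch, and delete the project row if either insert throws so the CASCADE cleans up any folders already written.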

Phase 2 is the main extraction pipeline: migrations 0002-0005, per-page Claude JSON extraction with raster/vision fallback for empty text layers, and a § 145.64 compliance gate in peerReviewVisionPrescan.ts that halts the run before any event extraction if peer-review markers appear on scanned pages. There is no bypass env var, and the author is explicit about that. Six red-flag rules ship in redFlags.ts, each tagged to a Plutshack supports_element (duty / breach / causation / damages). An optional EXTRACTION_ASYNC_MODE=queue setting supports serverless deployments. Vitest and Supertest cover the 403/404/409 and malformed-PDF paths.
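The shape of a halt-before-extraction gate can be sketched as follows. This is not the contents of peerReviewVisionPrescan.ts: the marker phrases, type names, and function names are hypothetical, and the sketch is synchronous and string-based, whereas the commit describes a vision pass over rasterized pages. The point it illustrates is that a flagged run never reaches the extractor and that no flag can switch the check off.

```typescript
// Text recovered from a page, e.g. from a vision pass over the raster (assumed input).
interface PageScan {
  pageNumber: number;
  visibleText: string;
}

type PrescanResult =
  | { status: "clear" }
  | { status: "halted"; reason: "peer_review_145_64"; pages: number[] };

// Hypothetical marker phrases; the real list is not given in the writeup.
const PEER_REVIEW_MARKERS: string[] = ["peer review", "review organization"];

function prescanForPeerReview(
  pages: PageScan[],
  markers: string[] = PEER_REVIEW_MARKERS,
): PrescanResult {
  const flagged = pages
    .filter((page) => {
      const text = page.visibleText.toLowerCase();
      return markers.some((marker) => text.includes(marker.toLowerCase()));
    })
    .map((page) => page.pageNumber);

  return flagged.length > 0
    ? { status: "halted", reason: "peer_review_145_64", pages: flagged }
    : { status: "clear" };
}

// The gate runs first; a halted prescan returns before the extractor is invoked,
// so no events are produced for the document. There is deliberately no bypass option.
function runExtraction<T>(
  pages: PageScan[],
  extractEvents: (pages: PageScan[]) => T,
): { prescan: PrescanResult; events: T | null } {
  const prescan = prescanForPeerReview(pages);
  if (prescan.status === "halted") {
    return { prescan, events: null };
  }
  return { prescan, events: extractEvents(pages) };
}
```

Keeping the gate inside the only entry point, rather than as an optional middleware step, is one way to make the "no events written" guarantee hard to route around.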

Phase 3 adds /projects/[id]/timeline/[docId] reading the Phase 2 event log via the existing events endpoint. Zero backend changes, zero LLM calls. The earlier Rail B plan - populate tabular cells from event SQL - was ruled out: the tabular_cells table is keyed one-per-doc, so per-encounter data can't be represented without collapsing it into a single cell and losing sort, filter, and per-row inspection. MAR, vitals, and labs are deferred to Phase 3.5 as separate Timeline-pattern surfaces.
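The row-cardinality problem behind dropping Rail B can be shown with a small sketch. The ChronologyEvent shape and both function names are hypothetical (the real event type is MedMalDocumentEvent, with more fields), and dates are assumed to be ISO strings so they sort lexicographically.

```typescript
// Hypothetical, trimmed-down event shape for illustration.
interface ChronologyEvent {
  documentId: string;
  eventDate: string; // ISO date, e.g. "2026-01-05"
  page: number;      // citation anchor back into the PDF
  narrative: string;
}

// Timeline pattern: one row per event, so the view can sort, filter,
// and inspect each row individually.
function toTimelineRows(events: ChronologyEvent[], documentId: string): ChronologyEvent[] {
  return events
    .filter((event) => event.documentId === documentId)
    .sort((a, b) => a.eventDate.localeCompare(b.eventDate) || a.page - b.page);
}

// One-cell-per-document pattern: the same events have to be flattened into a
// single string, which keeps the text but loses per-row sort and filter.
function toSingleCell(events: ChronologyEvent[], documentId: string): string {
  return toTimelineRows(events, documentId)
    .map((event) => `${event.eventDate} (p. ${event.page}): ${event.narrative}`)
    .join("\n");
}
```

With three encounters in one document, toTimelineRows yields three addressable rows while toSingleCell yields one opaque blob, which is the mismatch the Phase 3 plan describes.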

Import calculus. The full stack is MN med-mal specific; red-flag rules and extraction defenses cite statutes by number. Porting to another jurisdiction means rewriting the rule library and column prompts. But three pieces port cleanly: the Phase 1 templates infrastructure (project templates + recommended-tabular-review registry) is domain-agnostic; the Phase 2 raster/vision fallback for scanned pages is reusable for any structured-extraction pipeline; the Phase 3 Timeline view works for any event-log source. Migration 0001 ran outside the normal branch flow and should be re-run cleanly in a fresh environment. Migration 0005 needs re-applying per environment with an unindexed-FK advisor check.
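To make the jurisdiction coupling concrete, here is a minimal sketch of what a deterministic rule tagged with a supports_element and cited authority might look like. It is not taken from redFlags.ts: the interfaces, the event kinds, the one-hour window, and the exact citation strings are assumptions. Only the rule id temporal_anchor_causation, the supports_element tagging, and the awaits_expert_affidavit flag come from the commit messages.

```typescript
type SupportsElement = "duty" | "breach" | "causation" | "damages";

// Hypothetical minimal event shape for rule evaluation.
interface RuleEvent {
  id: string;
  kind: "intervention" | "adverse_event" | "other";
  at: string; // ISO 8601 timestamp
}

interface RedFlag {
  ruleId: string;
  supportsElement: SupportsElement;
  eventIds: string[];
  awaitsExpertAffidavit: boolean;
  note: string;
}

interface RedFlagRule {
  id: string;
  supportsElement: SupportsElement;
  citedAuthority: string[]; // the jurisdiction-specific part a port would rewrite
  evaluate(events: RuleEvent[]): RedFlag[];
}

const ANCHOR_WINDOW_MS = 60 * 60 * 1000; // assumed one-hour window

const temporalAnchorCausation: RedFlagRule = {
  id: "temporal_anchor_causation",
  supportsElement: "causation",
  citedAuthority: ["Plutshack", "Minn. Stat. § 145.682"],
  evaluate(events: RuleEvent[]): RedFlag[] {
    const interventions = events.filter((e) => e.kind === "intervention");
    const adverse = events.filter((e) => e.kind === "adverse_event");
    const flags: RedFlag[] = [];
    for (const intervention of interventions) {
      for (const outcome of adverse) {
        const gapMs = Date.parse(outcome.at) - Date.parse(intervention.at);
        if (gapMs >= 0 && gapMs <= ANCHOR_WINDOW_MS) {
          flags.push({
            ruleId: "temporal_anchor_causation",
            supportsElement: "causation",
            eventIds: [intervention.id, outcome.id],
            awaitsExpertAffidavit: true,
            // Surfaces proximity only; causation itself is left to expert testimony.
            note: "Tight temporal anchor; does not assert causation.",
          });
        }
      }
    }
    return flags;
  },
};
```

In a port to another jurisdiction, the evaluate logic might survive largely intact while citedAuthority, the element taxonomy, and the affidavit flag would need to be re-derived from that jurisdiction's law.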

So what. The fork is worth a look for the Phase 1 templates machinery (generic across verticals) or the Phase 2 scanned-page raster/vision extraction pattern. Importing the full stack makes sense only if MN-jurisdiction med-mal is a core use case: the legal coupling is deep enough that jurisdiction-neutral adoption would require substantial rewrites of the rule library and column-prompt set.

Commits in this thread

8 commits from rmerk/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`48994259`	docs: add med-mal project templates plan	Ryan Choi	2026-05-11
commit body Plan for a "Medical Malpractice Case" project template that scaffolds the standard subfolder structure on project creation and registers reusable tabular-review column schemas (chronology, bills, MAR, transfusion log, vitals, labs, imaging index, red-flag scan) the user can instantiate once documents are uploaded.
`12479217`	docs: add MN med-mal law research and update templates plan	Ryan Choi	2026-05-11
commit body Three new docs and a substantive revision to the templates plan, all captured in `docs/`:

- `RESEARCH_mn_med_mal_law.md/.pdf` - validates the med-mal-case template against Minnesota primary sources (statutes, MN Supreme Court opinions, Rules of Civil Procedure). Corrects two inherited errors: § 145.682(4)'s 180-day clock runs from Rule 26.04(a) discovery commencement (not summons service), and the MN prima facie negligence test is the 3-element Plutshack/Smith v. Knowles formulation (not the 5-element informed-consent test). Ends with a 10-item "Plan deltas" appendix.
- `RESEARCH_legal_rag.md` - surveys legal RAG approaches across case-law, contracts, e-discovery, and med-mal lanes; recommends structured extraction with page+bbox citations over vector RAG for med-mal records.
- `PLAN_med_mal_extraction_pipeline.md` - Mike-specific build plan derived from the RAG research: page-by-page multimodal extraction → JSON event log in Postgres → deterministic red-flag rules → page-window retrieval for chat. Out-of-scope: embeddings, GraphRAG, cross-case search.
- `PLAN_med_mal_templates.md` - applies all ten plan deltas: subfolder taxonomy expanded 8→12+nested (adds expert-affidavits-145682, hipaa-authorizations, discovery, collections-liens-subrogation, pre-retainer-investigation; renames bills→bills-and-eobs); three new tabular schemas (provider-defendant-map, causation-chain, expert-opinions-145682); bills columns split per § 548.251; red-flag scan gains § 145.682 affidavit link + temporal-proximity anchor; new "Cited authority" section consolidates statutes/rules/cases for future revision audits. v1 schema change stays at single `template_id` column; v1.1 follow-up note flags `key_dates jsonb null` for deadline tracking.
`3a973152`	docs: validate extraction-pipeline plan against MN-law research	Ryan Choi	2026-05-11
commit body Applies the same MN-law validation pass to PLAN_med_mal_extraction_pipeline.md that just landed on PLAN_med_mal_templates.md, surfacing ten substantive deltas plus an architectural tension between the event log and tabular review schemas. Net effect: the multi-day Phase 2 build no longer inherits the breach-only red-flag bias or the missing MN-specific privacy/work-product defenses that the original plan carried.

Schema changes (folded into the inline SQL):

- document_events gains provider_role (Plutshack role-specific SOC), episode_of_care (chronology clustering), privacy_class (§ 144.293 / § 145.64 / 42 CFR Part 2 segregation), and key_date_role (feeds Phase 4 key_dates jsonb deadline-tracking widget). medications jsonb shape expanded with ordered_at / administered_at / allergy_conflict_flag / weight_based_dose_check_passed for the Mulder rule (Reinhardt).
- document_red_flags gains supports_element (Plutshack/Smith 4-cut) and awaits_expert_affidavit (closes loop with § 145.682(4)(a) checklist).
- RLS clause added to filter peer_review_145_64 rows from default queries.

Red-flag library rebalanced from 5 breach-only rules to 6 rules tagged across the duty/breach/causation framework, including a new temporal_anchor_causation rule that surfaces tight temporal anchors without asserting causation (Plutshack/Smith still require expert testimony).

§Defenses expanded with four MN-specific policies:

- § 145.64 peer-review hard-refuse: extraction halts entirely on peer-review-marked documents; no events written. This is the strictest policy in the codebase.
- § 144.293 mental-health redaction-by-default until project-level toggle confirms the heightened authorization is on file.
- Rule 408 settlement-comms caveat: extracted but narrative prefixed with [Rule 408 - inadmissible to prove liability].
- Rule 26(b)(3) work-product opt-out: consulting-expert notes NOT extracted by default; project-level opt-in only.

§Out of scope grew with two permanent architectural separations:

- Causation-chain reasoning reserved for builtin-causation-chain tabular review (Plutshack/Smith expert testimony requirement; Dickhoff loss-of-chance fact-intensity).
- Provider-defendant entity resolution reserved for builtin-provider-defendant-map (Popovich two-factor test requires outside-the-record evidence - Rock v. Abdullah reliance-element limit).

Adds a Plan deltas appendix and Cited authority section mirroring the research doc's format so future revisions can re-check against the same source set. PDF re-rendered via reconstructed /tmp/jenn-mike/md_to_pdf.py (markdown + weasyprint).
`73faac09`	feat(templates): med-mal-case project template	Ryan Choi	2026-05-11
commit body Phase 1 of the templates → extraction-pipeline path. Selecting a template when creating a project scaffolds the 12-folder med-mal taxonomy + nested imaging folder, and exposes a "Recommended tabular reviews" strip on the project page with one-click buttons for the 11 schemas validated against Minnesota primary sources (Plutshack/Smith, § 145.682, § 548.251, § 144.293, Popovich, Dickhoff, Flom, Reinhardt, Cornfeldt).

Backend:

- New table column: projects.template_id (text null) - applied directly to prod (`qkfcrsrtualqdmqqexpf`) via Supabase MCP apply_migration after branch creation returned PaymentRequiredException on the free plan. Folded into schema.sql + backend/migrations/0001_projects_template_id.sql.
- backend/src/lib/templateIds.ts - shared string-literal unions for ProjectTemplateId and TabularSchemaId.
- backend/src/lib/builtinProjectTemplates.ts - registry exporting the med-mal-case template with its 12 subfolders and 11 recommended schema ids.
- backend/src/routes/projects.ts - POST /projects extended to parse optional template_id, validate against registry (400 on unknown), and batch-create subfolders in two passes (top-level UUIDs returned by Batch A resolve the array-index parent refs in Batch B). On any error the project row is deleted via CASCADE so the operation is atomic from the user's perspective.

Frontend:

- frontend/src/app/components/tabular/templateIds.ts - mirror of backend template/schema IDs.
- frontend/src/app/components/tabular/builtinTabularSchemas.ts - the only place column specs for built-in schemas live (11 schemas, ~120 columns total) with prompts tuned for Mayo/Epic ebook formatting. Each schema's description names the prima facie element or filing requirement it supports.
- frontend/src/app/components/projects/builtinProjectTemplates.ts - frontend mirror of the template registry (id, name, description, recommendedSchemaIds only; backend owns subfolders).
- frontend/src/app/lib/mikeApi.ts - createProject signature gains optional template_id arg.
- frontend/src/app/components/shared/types.ts - MikeProject gains template_id and subfolders fields.
- frontend/src/app/components/projects/NewProjectModal.tsx - adds a Template dropdown between the CM number field and the attribute pills, using the same Check-icon-on-select pattern as AddNewTRModal's workflow picker.
- frontend/src/app/components/projects/ProjectPage.tsx - on the reviews tab, renders a recommended-reviews strip when project.template_id is set. Each pill opens AddNewTRModal with the schema's title and full columns_config pre-populated.
- frontend/src/app/components/tabular/AddNewTRModal.tsx - accepts new optional initialTitle and initialColumnsConfig props that the recommended strip uses to seed the modal.

CLAUDE.md: documents the migrations workflow + flags the Supabase branching limitation on the free plan so future migrations don't reach for create_branch first.
`3d89162e`	docs: add 5-phase med-mal records platform roadmap	Ryan Choi	2026-05-11
commit body Closes the documentation gap surfaced after Phase 1 shipped: the per-phase plans (PLAN_med_mal_templates.md, PLAN_med_mal_extraction_pipeline.md) were already committed, but the meta-orchestration that sequences them lived only in a local Claude planning file. Future contributors (and resumed sessions) now see the whole arc in one repo-resident document.

The roadmap captures:

- Why Phase 0 (research-validation pass) precedes Phase 2 - cheap insurance that produced 10 deltas to the extraction-pipeline plan including a missing causation rule, missing § 145.64 peer-review hard-refuse, and missing Plutshack/Smith 4-cut tagging.
- Why Phase 1 (templates) precedes Phase 2 (extraction) - exercises the Supabase-branch + schema-migration toolchain on a 1-column reversible change before the 3-table change.
- Four hidden dependencies surfaced from codebase exploration that the per-phase plans don't address: missing backend/migrations/ directory, text-only LLM provider abstraction, no bbox-extraction primitive, free-plan Supabase branching limit.
- Phase status (0 + 1 shipped at 3a97315 and 73faac0; 2/3/4 pending).
- Concrete Phase 2 first steps + the multimodal/bbox sub-tasks hidden inside the multi-day estimate.
- Phase 3 architectural decision (event log stays narrow; per-schema extractors consult it rather than expanding the log).
- Phase 4 v1.1 deadline-tracking surface tied to Phase 2's document_events.key_date_role column.
- Cross-phase risk register, out-of-scope list with permanent architectural separations (causation-chain reasoning, provider-defendant entity resolution).

PDF rendered via reconstructed /tmp/jenn-mike/md_to_pdf.py.
`58f57660`	feat(extraction): Phase 2 med-mal extraction pipeline (#2)	Ryan Choi	2026-05-11
commit body Postgres (`0002`-`0005`), `patch_document_extraction_run` with GRANTs, per-page Claude JSON extraction, raster + vision for empty text layers, §145.64 vision-page peer-review prescan (compliance gate), optional queue mode (`EXTRACTION_ASYNC_MODE=queue`), REST + UI + chat tools, Vitest and Supertest coverage (403/404/409), backend CI workflow. Closes the §145.64 compliance gap: scanned pages with peer-review marker phrases visible only in the raster are now detected and halt the run before any event-extraction call. Follow-up: apply `0005_extraction_async_jobs_document_index.sql` on each Supabase environment and verify advisor `unindexed_foreign_keys` clears.
`ddee1416`	feat(timeline): med-mal chronology view backed by event log (#3)	Ryan Choi	2026-05-11
commit body Phase 3 of the med-mal records platform. Surfaces the Phase 2 document_events log as a fast, citation-anchored chronology view - satisfying the roadmap §Phase 3 verification gate of "+ Medical Chronology on extracted doc loads in <2s, no LLM call."

Changes:

- New route /projects/[id]/timeline/[docId] rendering events as a sortable table with row-click bbox sync against the embedded PDF preview. Reads the existing GET /extraction/:documentId/events; zero backend changes.
- Widen MedMalDocumentEvent type to expose provider, provider_role, event_time, episode_of_care, key_date_role, event_date_text.
- "+ Medical Chronology" on ProjectPage routes to Timeline when ≥1 PDF is ready: single PDF goes straight, multiple opens a picker, zero PDFs falls back to the existing tabular-create modal.
- docs/PLAN_med_mal_phase3_integration.md documents the row-cardinality finding that ruled out the originally-planned Rail B (cells-from-event-log for MAR / vitals / etc.) - every event-log-backed schema is per-encounter, which doesn't fit the 1-row-per-doc tabular_cells model. Future MAR / vitals / labs views are deferred to Phase 3.5 as their own Timeline-style surfaces.

Verification: builds clean on backend + frontend; lint clean for new file. Browser-clicked e2e is left to the user - the gate ("<2s, no LLM traffic") needs a running stack + an extracted doc to exercise, neither of which the agent had access to during this PR.
`e5cf4ca6`	chore(extraction): prototype MedGemma 27B normalization pass	Ryan Choi	2026-05-11
commit body Standalone tsx script piping synthetic page-text through MedGemma 27B-IT via LM Studio's OpenAI-compatible endpoint, mirroring the medMalExtractor prompt shape. Validates clinical-field quality and JSON/shape stability ahead of deciding whether to wire it in as a stage-2 normalization model.
