Gadoes builds a scraper framework for jurisdictions the APIs forgot

When there's no off-the-shelf data feed for a body of law, scrape it yourself, and make the process repeatable.

integration, knowledge-management

Most legal-AI tools lean on tidy commercial APIs to pull in case law. That works for the big English-language jurisdictions and falls apart everywhere else. Gadoes has built a small framework inside the dispumike fork that treats web-scraping as a first-class source: a base scraper contract, a freshness tracker that knows when corpora go stale, and an adapter that makes the scraped output look identical to a proper data feed downstream.
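A minimal Python sketch of how those three pieces could fit together. The class names BaseScraper and FreshnessManager come from the commit subjects in this thread. Everything else here (the Document shape, the fetch/search method names, the FeedAdapter class, the max-age policy) is an assumption made for illustration, not the fork's actual code.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Document:
    """One scraped item, normalised to the shape a data feed would return (assumed)."""
    source: str
    doc_id: str
    title: str
    text: str


class BaseScraper(ABC):
    """Contract every scraper fulfils. The method name `fetch` is an assumption."""
    name: str

    @abstractmethod
    def fetch(self) -> Iterable[Document]:
        """Scrape the source and yield normalised documents."""


class FreshnessManager:
    """Remembers when each corpus was last refreshed and flags stale ones."""

    def __init__(self, max_age: timedelta) -> None:
        self.max_age = max_age
        self._last_run: dict[str, datetime] = {}

    def mark_refreshed(self, source: str, when: datetime) -> None:
        self._last_run[source] = when

    def is_stale(self, source: str, now: datetime) -> bool:
        last = self._last_run.get(source)
        # A corpus that has never been scraped counts as stale.
        return last is None or now - last > self.max_age


class FeedAdapter:
    """Hypothetical adapter: callers query it as they would a feed client."""

    def __init__(self, scraper: BaseScraper, freshness: FreshnessManager) -> None:
        self._scraper = scraper
        self._freshness = freshness
        self._cache: list[Document] = []

    def search(self, query: str, now: Optional[datetime] = None) -> list[Document]:
        now = now or datetime.now(timezone.utc)
        # Re-scrape only when the freshness tracker says the corpus is stale.
        if self._freshness.is_stale(self._scraper.name, now):
            self._cache = list(self._scraper.fetch())
            self._freshness.mark_refreshed(self._scraper.name, now)
        q = query.lower()
        return [d for d in self._cache if q in d.title.lower() or q in d.text.lower()]
```

In this sketch the caller only ever sees search(), so swapping a scraper for a real feed later would not change downstream code. The re-scrape decision sits in the freshness tracker and not in each scraper.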

To prove the pattern works, the fork ships two concrete implementations: italaw, the investment-treaty arbitration database, and ICSID, the international investment-arbitration awards database. Neither has a friendly API. Both now plug into the rest of the system as if they did. The framework is deliberately separable from the scrapers themselves, so other forks can adopt the plumbing without inheriting italaw or ICSID.
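The commit subjects for both implementations mention fixtures and a test, which suggests each scraper is exercised against saved HTML instead of the live site. A rough illustration of that idea using only the Python standard library follows. The markup, the `case` CSS class, the case names, and the parse_listing helper are all made up for this example and do not reflect either site's real pages or the fork's code.

```python
from html.parser import HTMLParser

# Hypothetical fixture: a saved listing page with invented case names.
FIXTURE = """
<ul>
  <li><a class="case" href="/cases/1">Alpha v. Ruritania</a></li>
  <li><a class="case" href="/cases/2">Beta v. Freedonia</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""


class CaseLinkParser(HTMLParser):
    """Collects the URL and title of every <a class="case"> link."""

    def __init__(self) -> None:
        super().__init__()
        self.cases: list[dict[str, str]] = []
        self._href = None  # set while inside a matching <a>
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == "case":
            self._href = attr_map.get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.cases.append({"url": self._href, "title": title})
            self._href = None


def parse_listing(html: str) -> list[dict[str, str]]:
    """Parse a listing page into plain dicts a scraper could normalise further."""
    parser = CaseLinkParser()
    parser.feed(html)
    parser.close()
    return parser.cases
```

Keeping the parsing step a pure function of the HTML string means the test can run offline against the fixture. The network fetch stays in a thin layer that is easier to stub.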

So what: worth a look for any legal-tech team trying to cover jurisdictions or tribunals where the source material lives on a website and nowhere else.



Commits in this thread

6 commits from Gadoes/dispumike, oldest first. Source extracted verbatim from the harvested git log.

| SHA | Subject | Author | Date |
| --- | --- | --- | --- |
| 5512f4f2 | Chunk 14: Portable DB Framework (BaseScraper, PortableMcpServer, FreshnessManager) | Gadoes | 2026-05-02 |
| 610e68de | Merge Chunk 14: Portable DB Framework | Gadoes | 2026-05-02 |
| 6500aec7 | Chunk 15: italaw Portable DB (scraper, server, fixtures, test) | Gadoes | 2026-05-02 |
| a2e9ca95 | Merge Chunk 15: italaw Portable DB | Gadoes | 2026-05-02 |
| a37fcd0c | Chunk 16: ICSID Portable DB (scraper, server, fixtures, test) | Gadoes | 2026-05-02 |
| aa2d8c12 | Merge Chunk 16: ICSID Portable DB | Gadoes | 2026-05-02 |

Capture this thread into my fork

Download a single Markdown prompt that tells Claude how to port every commit above into your working tree, adapting paths and structure to match your repo. Run it with claude -p < capture-thread-28.md from inside the repo where you want the changes applied.
