nwhitehouse rebuilds Mike's bulk-document engine for scale
Tabular review - the bit that runs the same question across thousands of documents - now survives tab closes, backend restarts, and the kind of proxy timeouts that used to kill a four-hour run.
The old setup held a single live connection open between browser and server for the entire run. If you closed the tab, lost your wifi, restarted the server, or hit a proxy timeout, the whole job died and you started over. Fine for 50 documents; useless at the 5,000-10,000 document scale nwhitehouse is targeting.
The rewrite turns each run into a durable job stored in the database, with a pool of workers picking up documents one at a time and a lease mechanism so nothing gets dropped or double-processed if a worker dies mid-task. The frontend now polls for progress and reconnects automatically. One tradeoff the author flags openly: you no longer see cells fill in word-by-word - each document's row appears as a block when its worker finishes.
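To make the lease idea concrete, here is a minimal sketch in Python, not the PR's actual code. It uses an in-memory SQLite table as a stand-in for the real database, and every name in it (tasks, claim_next, complete, progress, the 60-second lease) is a hypothetical chosen for illustration. A worker claims one document by writing its name and an expiry time onto the row. If the worker dies, the lease runs out and another worker can claim the same row. A late write from the dead worker is rejected because it no longer owns the lease.

```python
import sqlite3

# Hypothetical schema: one row per document in a run.
SCHEMA = """
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    run_id TEXT NOT NULL,
    doc TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',  -- pending | leased | done
    lease_owner TEXT,
    lease_expires REAL,
    result TEXT
)
"""

def open_db():
    # isolation_level=None puts sqlite3 in autocommit mode so that
    # transactions can be opened explicitly with BEGIN IMMEDIATE.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(SCHEMA)
    return conn

def enqueue(conn, run_id, docs):
    conn.executemany(
        "INSERT INTO tasks (run_id, doc) VALUES (?, ?)",
        [(run_id, d) for d in docs],
    )

def claim_next(conn, worker, now, lease_seconds=60.0):
    """Lease the oldest pending or lease-expired task; return (id, doc) or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, doc FROM tasks "
            "WHERE status = 'pending' "
            "   OR (status = 'leased' AND lease_expires <= ?) "
            "ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE tasks SET status = 'leased', lease_owner = ?, "
                "lease_expires = ? WHERE id = ?",
                (worker, now + lease_seconds, row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise

def complete(conn, task_id, worker, result):
    """Store the result only if this worker still owns the lease."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'done', result = ?, "
        "lease_owner = NULL, lease_expires = NULL "
        "WHERE id = ? AND status = 'leased' AND lease_owner = ?",
        (result, task_id, worker),
    )
    return cur.rowcount == 1

def progress(conn, run_id):
    """What a polling frontend might ask for: (done, total)."""
    return conn.execute(
        "SELECT COALESCE(SUM(status = 'done'), 0), COUNT(*) "
        "FROM tasks WHERE run_id = ?",
        (run_id,),
    ).fetchone()
```

Timestamps are passed in as plain numbers so the behaviour is easy to follow: a task claimed at time 0 with a 60-second lease cannot be claimed by anyone else at time 2, but can be at time 100. Because progress lives in the table and not in an open connection, a client can disappear and later call progress again to pick up where the run stands.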