amal66 is teaching Mike's AI calls to fail gracefully

Two changes stop a flaky AI provider from freezing the app or surfacing errors that didn't need to surface.

infrastructure, integration

When a legal-AI tool sends a question to an outside AI service, two things can go wrong: the service stalls and never answers, or it hiccups with a temporary error. amal66's work handles both. The first change puts a three-minute timeout on Mike's streamed provider responses, so a stalled connection can't hang open and tie up the app indefinitely.
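To make the time boundary concrete, here is a minimal Python sketch of a bounded stream. It is not taken from the fork: the names `bounded_stream` and `StreamTimeoutError` are hypothetical, and the sketch assumes the three-minute limit covers the whole stream rather than the gap between chunks, which the commit message does not specify.

```python
import asyncio
import time
from typing import AsyncIterator

STREAM_TIMEOUT_SECONDS = 180.0  # three minutes, the figure given in the commit message


class StreamTimeoutError(Exception):
    """Raised when a provider stream exceeds its time budget (hypothetical name)."""


async def bounded_stream(
    source: AsyncIterator[str], timeout: float = STREAM_TIMEOUT_SECONDS
) -> AsyncIterator[str]:
    """Yield chunks from `source`, giving up once `timeout` seconds have passed in total."""
    deadline = time.monotonic() + timeout
    iterator = source.__aiter__()
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise StreamTimeoutError(f"stream exceeded {timeout} seconds")
        try:
            # Wait for the next chunk, but never longer than the time left.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=remaining)
        except StopAsyncIteration:
            return  # the provider finished normally
        except asyncio.TimeoutError:
            raise StreamTimeoutError(f"stream exceeded {timeout} seconds") from None
        yield chunk
```

A caller would wrap the provider's chunk iterator, for example `async for chunk in bounded_stream(provider_chunks): ...`, and treat `StreamTimeoutError` as a signal to close the connection and report the failure.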

The second is smarter. Instead of blindly retrying every failure, it classifies them first: temporary ones (rate limits, brief outages, timeouts) are retried automatically with exponential backoff, while genuine errors like a bad request or a login problem are left alone. The result is fewer spurious failures landing in front of users, without papering over real problems that need attention.
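The classify-then-retry idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the fork's code: `ProviderError`, `is_transient`, `call_with_retry`, the set of retryable HTTP status codes, and the attempt and delay defaults are all hypothetical choices.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed classification: HTTP statuses that usually signal a temporary problem.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


class ProviderError(Exception):
    """Hypothetical error type carrying the provider's HTTP status."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"provider returned {status}")
        self.status = status


def is_transient(error: Exception) -> bool:
    """Timeouts and retryable statuses are transient; everything else is not."""
    if isinstance(error, TimeoutError):
        return True
    return isinstance(error, ProviderError) and error.status in RETRYABLE_STATUS


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying transient failures with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as error:
            # Bad requests, auth failures, and the final attempt propagate unchanged.
            if not is_transient(error) or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Under these assumptions a 429 or 503 is retried up to the attempt limit, while a 400 or 401 reaches the caller on the first failure. The `sleep` parameter is injectable only so the backoff can be exercised in tests without real waiting.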

So what? Anyone running Mike in production should care: this is the difference between an AI feature that quietly rides out a provider's bad minute and one that breaks in front of a client.



Commits in this thread

2 commits from amal66/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA Subject Author Date
54dcdf77 fix(chapter-13): time-limit stalled LLM streams Amal 2026-05-24
commit body
Chapter: 13 - Bounded external calls.

Plain-English map:
Add a three-minute timeout around LLM server-sent-event streams so a stalled
provider call cannot hold a connection open forever.

Why it matters:
External services can hang. Without a timeout, one stuck call can consume
server resources until something outside the app kills it.

Principle:
Every external dependency call should have a clear time boundary.

Precedent borrowed:
Upstream PR #112.

Upstream base: willchen96/mike@d39f580.
Original local commit: 8992d98.
b7ef398b feat(chapter-29): retry transient LLM provider failures Amal 2026-05-24
commit body
Chapter: 29 - Provider resilience.

Plain-English map:
Wrap LLM provider calls with exponential-backoff retries for temporary errors
like rate limits, timeouts, and provider outages.

Why it matters:
LLM providers occasionally fail for reasons the user cannot fix. A short,
careful retry can turn a temporary outage into a normal response.

Principle:
Classify external failures before retrying. Retry transient problems, not bad
requests or authentication errors.

Precedent borrowed:
The fork report's alternative-provider cluster and provider-abstraction work
across active forks.

Upstream base: willchen96/mike@d39f580.
Original local commit: 1417930.
