LoopController: escalate on step count, repeated calls, or wall-clock timeout

nwhitehouse added a 138-line controller that wraps the tool-dispatch callback in `runLLMStream` and escalates when the agent loop runs too long, repeats the same call three times, or passes 60 seconds of wall-clock time.

infrastructure, workflow

lib/loopController.ts tracks three independent triggers, and the first one to fire wins: MAX_STEPS_EXCEEDED (total tool dispatches in the turn reach 12 by default, env OLAVA_MAX_STEPS), REPEATED_TOOL_CALL (the same name and arguments called three times in a row by default, env OLAVA_MAX_REPEATED_CALLS), and WALL_CLOCK_EXCEEDED (more than 60 seconds since the turn started by default, env OLAVA_WALL_CLOCK_MS). Steps are counted per individual dispatch, not per batch, so a single iteration with four tool calls counts as four steps.
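The three triggers can be sketched roughly as follows. This is a minimal illustration, not the code in lib/loopController.ts: the class name `LoopControllerSketch`, the `record` method, the `ToolCall` and `LoopLimits` shapes, the injectable clock, and the order in which the triggers are checked are all assumptions. Only the trigger names, the default values, and the `currentStep` name come from the writeup.

```typescript
type EscalationReason =
  | "MAX_STEPS_EXCEEDED"
  | "REPEATED_TOOL_CALL"
  | "WALL_CLOCK_EXCEEDED";

// Hypothetical shapes; the real types are not shown in the source.
interface ToolCall {
  name: string;
  args: unknown;
}

interface LoopLimits {
  maxSteps: number;
  maxRepeatedCalls: number;
  wallClockMs: number;
}

class LoopControllerSketch {
  private readonly limits: LoopLimits;
  private readonly now: () => number;
  private readonly startedAt: number;
  private steps = 0;
  private lastKey: string | null = null;
  private repeatCount = 0;

  // The clock is injectable so the wall-clock trigger can be tested without waiting.
  constructor(limits: LoopLimits, now: () => number = () => Date.now()) {
    this.limits = limits;
    this.now = now;
    this.startedAt = now();
  }

  get currentStep(): number {
    return this.steps;
  }

  // Call once per individual tool dispatch (not once per batch).
  // Returns the first trigger that fires, or null to keep going.
  record(call: ToolCall): EscalationReason | null {
    this.steps += 1;

    // Assumption: calls are "the same" when name and JSON-serialised args match.
    // JSON.stringify is sensitive to object key order.
    const key = `${call.name}:${JSON.stringify(call.args)}`;
    if (key === this.lastKey) {
      this.repeatCount += 1;
    } else {
      this.lastKey = key;
      this.repeatCount = 1;
    }

    // Assumption: this check order is illustrative, not taken from the source.
    if (this.steps >= this.limits.maxSteps) return "MAX_STEPS_EXCEEDED";
    if (this.repeatCount >= this.limits.maxRepeatedCalls) return "REPEATED_TOOL_CALL";
    if (this.now() - this.startedAt > this.limits.wallClockMs) return "WALL_CLOCK_EXCEEDED";
    return null;
  }
}

// One iteration that dispatches four different tools counts as four steps.
const controller = new LoopControllerSketch({
  maxSteps: 12,
  maxRepeatedCalls: 3,
  wallClockMs: 60000,
});
const batch: ToolCall[] = [
  { name: "search", args: { q: "lease" } },
  { name: "fetch", args: { id: 1 } },
  { name: "fetch", args: { id: 2 } },
  { name: "summarise", args: {} },
];
const reasons = batch.map((call) => controller.record(call));
```

Serialising the arguments is only one way to compare calls. A real implementation might normalise key order first, so that `{a: 1, b: 2}` and `{b: 2, a: 1}` count as the same call.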

On escalation, the controller appends an escalationNote to every tool result in the current batch: a plain-text instruction asking the model to stop calling tools and synthesize the best answer from what it already has. The model still receives the data it just fetched. The controller also emits a loop.escalated row to the feat-015 audit log, so the reason a turn behaved oddly can be answered later from SQL. The chatTools.ts integration wraps the existing runTools callback.
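The "append to every tool result" step might look something like the sketch below. The `ToolResult` shape, both function names, and the exact wording of the note are hypothetical. The source says only that the note is plain text and is added to each result in the batch without dropping the fetched data.

```typescript
// Hypothetical result shape; the real one in chatTools.ts is not shown in the source.
interface ToolResult {
  toolCallId: string;
  content: string;
}

// Assumed wording, paraphrasing the commit message.
function buildEscalationNote(reason: string): string {
  return (
    `[loop controller: ${reason}] Stop calling tools and synthesise ` +
    `the best answer you can from the results you already have.`
  );
}

// Returns a new batch: each result keeps its original content and gains the note.
function appendEscalationNote(
  results: readonly ToolResult[],
  reason: string,
): ToolResult[] {
  const note = buildEscalationNote(reason);
  return results.map((result) => ({
    ...result,
    content: `${result.content}\n\n${note}`,
  }));
}

const fetched: ToolResult[] = [
  { toolCallId: "call_1", content: "Clause 4.2: tenant may sublet with consent." },
  { toolCallId: "call_2", content: "Clause 9.1: notice period is 30 days." },
];
const annotated = appendEscalationNote(fetched, "MAX_STEPS_EXCEEDED");
```

Because the note is appended rather than substituted, the content of each result still begins with the original tool output. Any downstream code that parses result content as structured data would need to tolerate the trailing text.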

The three defaults feel calibrated for a lightweight legal assistant: 1-3 tool calls is typical, 12 is generous, and the 3-repeat trigger specifically targets the "model retrying a failing tool" failure mode. All three are overridable per-deploy via env vars without a code change.
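A per-deploy override of the thresholds could be read along these lines. The env variable names and defaults are taken from the writeup. The helper names and the decision to fall back to the default on a missing or invalid value are assumptions, since the source does not say how bad values are handled.

```typescript
type Env = Record<string, string | undefined>;

// Parses a positive integer from the environment, falling back to the default
// when the variable is unset, empty, or not a positive whole number (assumed behaviour).
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function limitsFromEnv(env: Env): {
  maxSteps: number;
  maxRepeatedCalls: number;
  wallClockMs: number;
} {
  return {
    maxSteps: readPositiveInt(env, "OLAVA_MAX_STEPS", 12),
    maxRepeatedCalls: readPositiveInt(env, "OLAVA_MAX_REPEATED_CALLS", 3),
    wallClockMs: readPositiveInt(env, "OLAVA_WALL_CLOCK_MS", 60000),
  };
}

// A deployment that runs longer tool chains raises only the step budget.
const tuned = limitsFromEnv({ OLAVA_MAX_STEPS: "30" });
```

In a Node.js process the argument would typically be `process.env`. Passing the environment in explicitly keeps the helper easy to unit-test.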

tests/loopController.test.ts (72 LOC) adds 7 unit tests covering each trigger, the negative case where args differ, currentStep accounting, and escalation note formatting. All 16 backend tests pass (7 new plus 9 existing security regressions). The class is independent of the chat code, which is what makes it testable in isolation.

This is stacked on feat-015 (uses recordEvent). The chatId parameter it depends on was introduced in feat-017.

So what

Worth importing as a defensive layer on any agent loop - the logic is pure and easy to lift. The "append to tool results" approach is less disruptive than killing the turn outright, though it does subtly change the model's input, which could surprise downstream prompt logic that keys on result shape. The three default thresholds are reasonable for a small reasoning model; tune via env if your deployment runs longer tool chains.


Commits in this thread

1 commit from nwhitehouse/mike, oldest first. Source extracted verbatim from the harvested git log.

SHA	Subject	Author	Date
`22ab8a76`	[feat-014] LoopController: bound tool dispatches per turn	Nick Whitehouse	2026-05-07

commit body

Wraps the runTools callback in runLLMStream with a small controller that
escalates on three triggers (first one wins):
  - MAX_STEPS_EXCEEDED   - total tool dispatches >= 12 (env: OLAVA_MAX_STEPS)
  - REPEATED_TOOL_CALL   - same name+args 3× (env: OLAVA_MAX_REPEATED_CALLS)
  - WALL_CLOCK_EXCEEDED  - > 60s since turn start (env: OLAVA_WALL_CLOCK_MS)

On escalation the controller appends a "stop calling tools and synthesise
the best answer you can" note to every tool result in the batch. The model
still receives the data it just fetched - we only ask it to stop reaching
for more. Combined with the existing maxIterations: 10 in streamChatWithTools
this is belt-and-suspenders against runaway loops.

Also emits a loop.escalated event to the feat-015 audit log so post-hoc
"why did this turn behave weirdly?" questions are answerable from SQL.

Class is independent of chat code - 7 unit tests in loopController.test.ts
cover each trigger + the negative case where args differ + currentStep
accounting + escalation note formatting. All 16 backend tests pass
(7 new + 9 existing security regressions).

Stacked on feat-015 (uses recordEvent for the audit row). Will rebase
cleanly onto main after feat-017 + feat-015 land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
