fix(chapter-24): spotlight untrusted content for prompt-injection defense

↗ view on GitHub · Amal · 2026-05-24 · 761f6129

Chapter: 24 - LLM threat modeling.

Plain-English map:
Fence document text, filenames, and other untrusted content with nonce-marked
spotlighting so the model can better separate data from instructions.

Why it matters:
Legal documents can contain malicious or simply confusing text. The model
should be told which text came from the user, which came from a document, and
which instructions are trusted.

Principle:
An LLM is not a security boundary. Prompts should preserve provenance and make
untrusted content explicit.

Precedent borrowed:
Upstream PR #158 and the threat model documented in `docs/SECURITY-MODEL.md`.

Upstream base: willchen96/mike@d39f580.
Original local commit: bededdd.
Repository amal66/mike
Author Amal <mamalanand3@gmail.com>
Authored
Parents 41ede550
Stats 5 files changed , +148 , -14
Part of Prompt-injection defense via spotlighting

Capture this commit into my fork

Download a Markdown prompt that tells Claude how to port this exact commit into your working tree. Run it via claude -p < capture-commit-761f6129.md from inside the repo you want the change in.

⬇ Download capture-commit-761f6129.md