amal66/mike@761f6129

fix(chapter-24): spotlight untrusted content for prompt-injection defense

↗ view on GitHub · Amal · 2026-05-24 · 761f6129

Chapter: 24 - LLM threat modeling.

Plain-English map:
Fence document text, filenames, and other untrusted content with nonce-marked
spotlighting so the model can better separate data from instructions.

Why it matters:
Legal documents can contain malicious or simply confusing text. The model
should be told which text came from the user, which came from a document, and
which instructions are trusted.

Principle:
An LLM is not a security boundary. Prompts should preserve provenance and make
untrusted content explicit.

Precedent borrowed:
Upstream PR #158 and the threat model documented in `docs/SECURITY-MODEL.md`.

Upstream base: willchen96/mike@d39f580.
Original local commit: bededdd.

Repository	amal66/mike
Author	Amal <mamalanand3@gmail.com>
Authored	2026-05-24T13:12:18-07:00
Parents	`41ede550`
Stats	5 files changed , +148 , -14
Part of	Prompt-injection defense via spotlighting

Repository

amal66/mike

Author

Amal <mamalanand3@gmail.com>

Authored

2026-05-24T13:12:18-07:00

Parents

41ede550

Stats

5 files changed , +148 , -14

Part of

Prompt-injection defense via spotlighting

Capture this commit into my fork

Download a Markdown prompt that tells Claude how to port this exact commit into your working tree. Run it via claude -p < capture-commit-761f6129.md from inside the repo you want the change in.

⬇ Download capture-commit-761f6129.md