fix(chapter-24): spotlight untrusted content for prompt-injection defense
Chapter: 24 - LLM threat modeling. Plain-English map: Fence document text, filenames, and other untrusted content with nonce-marked spotlighting so the model can better separate data from instructions. Why it matters: Legal documents can contain malicious or simply confusing text. The model should be told which text came from the user, which came from a document, and which instructions are trusted. Principle: An LLM is not a security boundary. Prompts should preserve provenance and make untrusted content explicit. Precedent borrowed: Upstream PR #158 and the threat model documented in `docs/SECURITY-MODEL.md`. Upstream base: willchen96/mike@d39f580. Original local commit: bededdd.
| Repository | amal66/mike |
|---|---|
| Author | Amal <mamalanand3@gmail.com> |
| Authored | |
| Parents | 41ede550 |
| Stats | 5 files changed , +148 , -14 |
| Part of | Prompt-injection defense via spotlighting |
Capture this commit into my fork
Download a Markdown prompt that tells Claude how to port this
exact commit into your working tree. Run it via
claude -p < capture-commit-761f6129.md
from inside the repo you want the change in.