In plain words
This page explains one common AI risk in plain terms and shows a safer default you can apply quickly.
What this risk looks like
Prompt injection happens when untrusted input changes model behavior in ways your application did not intend.
What can go wrong
- Data theft through model outputs
- Policy bypass in tool-using agents
- Unsafe actions triggered by hidden instructions
Safer patterns
- Treat model outputs as untrusted suggestions.
- Apply strict allowlists for tool actions.
- Separate system policies from user-controlled context.
Minimum control set
- Output validation and policy checks
- Sensitive-action confirmations
- Red-team prompts in CI and staging
AI builder reminder: Model output is not policy. Every sensitive action needs explicit guardrails.