Glossary
What is prompt injection?
Prompt injection is an attack where untrusted input manipulates an LLM into ignoring its original instructions or revealing protected data.
Definition
Prompt injection — short definition
Prompt injection: Prompt injection is an attack where untrusted input manipulates an LLM into ignoring its original instructions or revealing protected data.
Why it matters
Why this matters
Any AI feature that mixes user input with system instructions is at risk. Prompt injection can leak system prompts, escalate privileges, or exfiltrate context.
How it works
How it works
Step 1
Untrusted input arrives
User input or third-party content reaches the prompt context.
Step 2
Instructions are overridden
Crafted text persuades the model to ignore prior instructions.
Step 3
Defense layers run
Prompt security at the gateway plus PII masking limit blast radius.
Implementation
Learn how this works in Privian
From definition to implementation, docs and architecture — the same idea at different layers.
FAQ
Frequently asked questions
- How is prompt injection different from SQL injection?
- Both abuse the lack of a hard boundary between code and data. In prompt injection the 'code' is natural-language instructions, which makes detection harder.
- Can prompt injection be fully prevented?
- No single technique fully prevents it. The practical answer is layered defense: input handling, prompt security at the gateway, and minimizing what sensitive data is ever in scope.
- Does masking PII help against prompt injection?
- Yes — even if an attacker bypasses instruction guarding, masked values mean the model never had access to the raw data in the first place.