Glossary

What is prompt injection?

Prompt injection is an attack where untrusted input manipulates an LLM into ignoring its original instructions or revealing protected data.

Definition

Prompt injection — short definition

Prompt injection: Prompt injection is an attack where untrusted input manipulates an LLM into ignoring its original instructions or revealing protected data.

Why it matters

Why this matters

Any AI feature that mixes user input with system instructions is at risk. Prompt injection can leak system prompts, escalate privileges, or exfiltrate context.

How it works

How it works

  1. Step 1

    Untrusted input arrives

    User input or third-party content reaches the prompt context.

  2. Step 2

    Instructions are overridden

    Crafted text persuades the model to ignore prior instructions.

  3. Step 3

    Defense layers run

    Prompt security at the gateway plus PII masking limit blast radius.

Implementation

Learn how this works in Privian

From definition to implementation, docs and architecture — the same idea at different layers.

FAQ

Frequently asked questions

How is prompt injection different from SQL injection?
Both abuse the lack of a hard boundary between code and data. In prompt injection the 'code' is natural-language instructions, which makes detection harder.
Can prompt injection be fully prevented?
No single technique fully prevents it. The practical answer is layered defense: input handling, prompt security at the gateway, and minimizing what sensitive data is ever in scope.
Does masking PII help against prompt injection?
Yes — even if an attacker bypasses instruction guarding, masked values mean the model never had access to the raw data in the first place.