Pillar · EU privacy
Reducing prompt-level sensitive-data exposure when teams use GPT, Claude and other managed models.
An educational reference for engineering, security and platform teams in the EU. Privian helps reduce one specific risk — sensitive values entering prompts that reach third-party providers. It does not provide legal advice or compliance guarantees.
Definition
GDPR does not ban the use of large language models. It does require that any personal data processed through an AI workflow has a lawful basis, an appropriate data-processing arrangement with the provider, and reasonable technical and organizational measures to protect it.
For an LLM-powered application, that usually translates into a handful of concrete questions: what personal data ends up in the prompt, where that prompt goes, who retains what, and how quickly the data flow can be changed if something goes wrong.
This page is educational. It is not legal advice. It describes patterns teams use in practice and where a privacy-first LLM gateway like Privian fits in.
Why it matters
The prompt is the new data-export surface. Anything in a prompt sent to a managed model is, by definition, sent to a third party. For most teams this is the first place where personal data crosses an organizational boundary without going through the usual review.
Prompts also have unusual properties: they are often built dynamically from records, they frequently include free-text written by end users, and they are shaped by individual developers under deadline. A field that nobody intended to expose can end up in a prompt simply because it was on the same object as something that was needed.
Exposure surface
Names, emails, phone numbers, addresses — frequently included so the model can write a personalized reply.
Customer IDs, order numbers and internal references that map back to a person on the receiving end of a support workflow.
Support tickets, form submissions and chat transcripts that may contain anything from health details to payment context.
Employee names, internal hostnames, project codenames and other organizational data that leak business context.
API keys, JWTs and tokens that end up in debugging or code-review prompts without anyone noticing.
Whole documents pasted into a prompt to summarize, classify or extract, often containing more than the user realizes.
Organizational reality
Policy and training matter. They set expectations and create the shared vocabulary a team needs. In practice, teams that handle sensitive data also add technical controls — not because their colleagues are careless, but because people inevitably paste information into tools that help them move faster.
The same pattern shows up across other data-protection disciplines: secret scanning in source control, DLP in email, masking in analytics pipelines. Each one assumes that a written rule is necessary but not sufficient, and that a technical check is what catches the long tail.
Prompt-level controls follow the same logic. They do not replace policy or governance — they sit underneath them and handle the cases where intent and behavior diverge.
Controls
There is no single answer. Most privacy-sensitive AI stacks combine several of the following, weighted to the team's constraints:
Self-hosted or on-premise models
What it is: Run an open-weight model on infrastructure the team controls.
Tradeoff: Strongest data-residency story; significant operational cost, narrower model selection and slower access to frontier capabilities. Suitable for teams with the engineering bandwidth and a strict residency requirement.
Provider-side zero-retention / enterprise terms
What it is: Use provider features and contractual terms that limit or disable provider-side retention.
Tradeoff: Reduces provider-side persistence and may satisfy procurement, but does not affect what enters the prompt in the first place.
Prompt-level masking / redaction
What it is: Detect supported personal and sensitive values before the prompt leaves the application, and replace them with deterministic placeholders.
Tradeoff: Reduces the data sent to the provider for supported entity types. Does not catch what the detector does not recognize and does not address downstream provider behavior.
BYOK and provider isolation
What it is: Route requests using the organization's own provider API key, keeping the provider contract and billing inside the org.
Tradeoff: Improves trust boundaries and key rotation; does not prevent sensitive values from entering prompts on its own.
Model and tool restrictions
What it is: Allow only specific models, allow-list which services can call the gateway, and limit which features can post free-text from end users.
Tradeoff: Effective but requires inventory work and ongoing maintenance.
Policies, training and AI governance
What it is: Acceptable-use policies, employee training, incident processes, and an AI governance forum that reviews new use cases.
Tradeoff: Necessary foundation; insufficient on its own because individual prompts are not reviewed before they are sent.
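As an illustration of the model and tool restrictions described above, the following is a minimal sketch of an allow-list check a gateway might run before forwarding a request. The model names, service names and the `GatewayRequest` shape are hypothetical and do not describe Privian's actual configuration.

```python
from dataclasses import dataclass

# Hypothetical policy values; a real deployment would load these from config.
ALLOWED_MODELS = {"example-model-a", "example-model-b"}
ALLOWED_SERVICES = {"support-bot", "code-review"}
# Services permitted to forward free text written by end users.
FREE_TEXT_SERVICES = {"support-bot"}


@dataclass(frozen=True)
class GatewayRequest:
    service: str              # which internal service is calling the gateway
    model: str                # which model the caller asked for
    has_user_free_text: bool  # whether the prompt embeds end-user free text


def check_request(req: GatewayRequest) -> list[str]:
    """Return policy violations; an empty list means the request may proceed."""
    violations = []
    if req.service not in ALLOWED_SERVICES:
        violations.append(f"service not allow-listed: {req.service}")
    if req.model not in ALLOWED_MODELS:
        violations.append(f"model not allowed: {req.model}")
    if req.has_user_free_text and req.service not in FREE_TEXT_SERVICES:
        violations.append(f"free text not permitted for service: {req.service}")
    return violations
```

Returning every violation rather than failing on the first one makes the inventory work easier: a rejected caller sees the full list of what needs to change.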
Privian is designed for teams that want to keep using managed models — GPT, Claude, Gemini and others — while reducing sensitive-data exposure in the prompts they send. It is one layer in a broader stack, not a replacement for governance, self-hosting or provider-side controls.
Concept
Prompt-level data protection is a narrow idea: apply controls to the prompt itself, at the moment it is built or sent, rather than relying on what happens after it reaches the provider.
The mechanics are simple:
None of this is a substitute for data minimization upstream of the prompt. The most reliable way to keep something out of a provider's hands is not to put it in the prompt at all.
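One way to apply that upstream principle is to build prompts from an explicit field allow-list instead of serializing whole records. The sketch below assumes a hypothetical support-ticket record; the field names and the `build_prompt` helper are illustrative only.

```python
# Hypothetical allow-list: the only record fields this prompt template may use.
PROMPT_FIELDS = ("product", "issue_summary", "order_status")


def build_prompt(record: dict) -> str:
    """Build a support prompt from allow-listed fields only."""
    selected = {key: record[key] for key in PROMPT_FIELDS if key in record}
    lines = [f"{key}: {value}" for key, value in selected.items()]
    return "Draft a reply for this support case.\n" + "\n".join(lines)


record = {
    "product": "Router X",
    "issue_summary": "Device reboots every hour",
    "order_status": "delivered",
    "email": "jane@example.com",         # not allow-listed, so never sent
    "home_address": "1 Example Street",  # not allow-listed, so never sent
}
prompt = build_prompt(record)
```

A field that happens to sit on the same object is then excluded by default. Free-text fields such as `issue_summary` can still contain personal data, which is where prompt-level masking remains useful.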
Scope
Names, emails, phone numbers, addresses, account identifiers, payment data and developer secrets — replaced with deterministic placeholders before the prompt is forwarded.
Privian forwards requests using your provider API key. Your contract, your billing, your provider-side terms.
Privian persists structural metrics for observability — model, latency, masked entity counts — not raw prompt or response bodies.
All AI traffic flows through one endpoint, so masking and routing policy do not depend on every client doing the right thing.
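The masking and metrics ideas above can be sketched in a few lines. This is an assumption-laden illustration, not Privian's implementation: the two regular expressions, the `<LABEL_n>` placeholder format and the `metrics_record` shape are all hypothetical, and real detectors need far broader coverage.

```python
import re
from collections import Counter

# Illustrative patterns only; they will miss many real-world values.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "JWT": re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def mask_prompt(prompt: str) -> tuple[str, Counter]:
    """Replace detected values with placeholders.

    Within one prompt, the same value always maps to the same placeholder.
    The returned Counter holds the number of distinct values per entity type.
    """
    seen: dict[tuple[str, str], str] = {}
    counts: Counter = Counter()

    for label, pattern in PATTERNS.items():
        def replace(match: re.Match, label: str = label) -> str:
            key = (label, match.group(0))
            if key not in seen:
                counts[label] += 1
                seen[key] = f"<{label}_{counts[label]}>"
            return seen[key]

        prompt = pattern.sub(replace, prompt)
    return prompt, counts


def metrics_record(model: str, latency_ms: float, counts: Counter) -> dict:
    """Structural metrics only: no prompt or response text is included."""
    return {"model": model, "latency_ms": latency_ms, "masked_entities": dict(counts)}
```

For example, `mask_prompt("Mail a@b.com and c@d.org, then a@b.com again")` yields the text `Mail <EMAIL_1> and <EMAIL_2>, then <EMAIL_1> again` with an `EMAIL` count of 2. Because repeated values share a placeholder, the model can still tell that two mentions refer to the same entity without seeing the value itself.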
Honest scope
Trust matters more than claims. Privian explicitly does not claim any of the following:
See the LLM security pillar for the broader picture, or the AI Security Layer category for the architectural framing.