Why redaction alone is not enough for AI privacy
Redaction is one tool. AI privacy is a layered problem: redaction, masking, minimization, retention controls, provider boundaries and governance each do something different — and none is a silver bullet.
"Redaction" is often used as shorthand for AI privacy, especially in vendor materials. It is a useful tool. It is not a complete answer. AI privacy is a layered problem and treating one layer as the whole picture is how organizations end up with confident marketing claims and quiet exposure incidents.
This article walks through the layers honestly — what each one does, what it does not do, and how they combine. Privian is one layer. Saying so plainly is the point.
Layer 1 — Redaction
Redaction removes sensitive content from a payload. A name becomes [REDACTED]; an email becomes ***. It is easy to reason about and hard to misuse: whatever was redacted is gone.
What redaction does well: forensic logs, exports, and after-the-fact scrubbing of stored payloads. What it does poorly: live LLM prompts. If "the customer" and "the agent" both become [REDACTED], the model can no longer tell them apart, and the original values cannot be restored in the response on the way back. Redaction treats data as something to destroy, not something to preserve and protect.
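A minimal sketch of that failure mode, assuming a hypothetical two-pattern detector (one email regex and a hard-coded name list, neither taken from any real product). It shows three different values collapsing into one token, so the customer and the agent become indistinguishable:

```python
import re

# Hypothetical detectors for illustration only; real detection needs far more.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
KNOWN_NAMES = ("Dana Reyes", "Sam Ortiz")  # assumed name list


def redact(text: str) -> str:
    """Replace every detected value with the same opaque token."""
    text = EMAIL_RE.sub("[REDACTED]", text)
    for name in KNOWN_NAMES:
        text = text.replace(name, "[REDACTED]")
    return text


prompt = "Dana Reyes (dana@example.com) asked agent Sam Ortiz for a refund."
redacted = redact(prompt)
print(redacted)
# [REDACTED] ([REDACTED]) asked agent [REDACTED] for a refund.
```

Nothing in the output records which token was the customer, so there is no mapping to reverse.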
Layer 2 — Masking
Masking is the LLM-aware cousin of redaction. Sensitive values are replaced with deterministic, type-aware placeholders — PERSON_1, EMAIL_2, CREDIT_CARD_1 — so the prompt keeps its structure and the response can be rehydrated. The model sees enough to reason; the provider never sees the underlying identifiers.
Masking only covers values that match supported patterns. It does not detect unstructured descriptions of sensitive content, and it does not change what a user chooses to write in a prompt.
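The masking and rehydration round trip can be sketched as follows. This is an illustration under stated assumptions, not Privian's implementation: the two regexes, the `mask` and `rehydrate` function names, and the per-request mapping are all hypothetical. Within one request, the same value always maps to the same placeholder.

```python
import re
from collections import defaultdict

# Hypothetical detectors; a real gateway would support many more types.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CREDIT_CARD": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}
PLACEHOLDER_RE = re.compile(r"\b[A-Z_]+_\d+\b")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Swap detected values for typed, numbered placeholders."""
    mapping: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}     # original value -> placeholder
    counters: dict[str, int] = defaultdict(int)

    for kind, pattern in PATTERNS.items():
        def repl(match: re.Match, kind: str = kind) -> str:
            value = match.group(0)
            if value not in seen:
                counters[kind] += 1
                seen[value] = f"{kind}_{counters[kind]}"
                mapping[seen[value]] = value
            return seen[value]

        text = pattern.sub(repl, text)
    return text, mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Restore original values in a single pass; unknown tokens are left alone."""
    return PLACEHOLDER_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


prompt = "Email dana@example.com and cc dana@example.com; card 4111 1111 1111 1111."
masked, mapping = mask(prompt)
print(masked)
# Email EMAIL_1 and cc EMAIL_1; card CREDIT_CARD_1.

response = "I have emailed EMAIL_1 about the card CREDIT_CARD_1."
print(rehydrate(response, mapping))
# I have emailed dana@example.com about the card 4111 1111 1111 1111.
```

The mapping lives only for the duration of the request in this sketch. A sentence such as "the customer who lives next to the harbor" would pass through untouched, which is the pattern limitation described above.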
Layer 3 — Minimization
Minimization is upstream of both. It is the design discipline of asking, for every prompt template, "does the model need this field?" Most prompts include more than they need. Stripping a customer record down to the fields that actually drive the answer is the single highest-leverage AI-privacy move most teams have not made.
Minimization reduces the work redaction and masking have to do. No downstream layer can match the privacy gain of not sending a field at all.
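One way to make that question concrete is a per-template field allowlist. The sketch below assumes a hypothetical `refund_reply` template and an invented customer record; the field names are placeholders for whatever your own schema contains.

```python
# Hypothetical allowlist: which fields each prompt template is permitted to use.
ALLOWED_FIELDS = {
    "refund_reply": {"order_status", "refund_amount", "currency"},
}


def minimize(record: dict, template_id: str) -> dict:
    """Keep only the fields the named template is allowed to send."""
    allowed = ALLOWED_FIELDS[template_id]
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "full_name": "Dana Reyes",
    "email": "dana@example.com",
    "date_of_birth": "1990-01-01",
    "order_status": "delivered",
    "refund_amount": "49.00",
    "currency": "EUR",
}

prompt_fields = minimize(record, "refund_reply")
print(prompt_fields)
# {'order_status': 'delivered', 'refund_amount': '49.00', 'currency': 'EUR'}
```

An allowlist fails closed: a field added to the record later is not sent until someone decides the template needs it.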
Layer 4 — Retention controls
Retention governs what happens to data after it lands somewhere. Provider retention windows, gateway logging policies, your own application's storage of prompts and responses — these are retention controls. They are necessary because some data does legitimately need to land somewhere; they do not affect what the provider sees in the first place.
Privian's own retention posture is zero raw retention: prompts and responses are not stored. Sanitized observability events and rollups are. See the security model.
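As an illustration of what a sanitized observability event might contain, the sketch below builds a log record from metadata only. The event shape, field names and `sanitized_event` helper are assumptions made for this example; they are not Privian's actual schema.

```python
import json
import time
from collections import Counter


def sanitized_event(request_id: str, mapping: dict[str, str], masked_prompt: str) -> dict:
    """Build a log event from counts and sizes; no prompt text or raw values."""
    # "CREDIT_CARD_1" -> "CREDIT_CARD": strip the trailing index to get the type.
    kinds = Counter(placeholder.rsplit("_", 1)[0] for placeholder in mapping)
    return {
        "request_id": request_id,
        "ts": int(time.time()),
        "entity_counts": dict(kinds),
        "prompt_chars": len(masked_prompt),
    }


# Hypothetical per-request mapping produced by a masking step.
mapping = {
    "EMAIL_1": "dana@example.com",
    "EMAIL_2": "sam@example.com",
    "CREDIT_CARD_1": "4111 1111 1111 1111",
}
event = sanitized_event("req-001", mapping, "Contact EMAIL_1 or EMAIL_2 about CREDIT_CARD_1.")
line = json.dumps(event, sort_keys=True)
print(line)
```

The event answers "how many emails were masked in this request?" without storing any of them.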
Layer 5 — Provider boundaries
Different provider tiers and tenants offer different defaults for training opt-outs, retention windows and subprocessor disclosures. A redaction layer in your application means nothing if the prompt is then sent to a consumer endpoint that retains and trains on it. Provider boundaries — the contractual and configuration layer — decide what the provider is allowed to do with what you sent.
BYOK is the operational expression of this layer: your relationship, your terms, your key. See BYOK for privacy-sensitive AI.
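Contract terms cannot be detected from code, but they can be recorded and enforced before a request leaves. The sketch below assumes a hypothetical `ProviderPolicy` record filled in by whoever reviewed the provider agreement, and an assumed internal limit of 30 days of retention; both are illustrative.

```python
from dataclasses import dataclass

MAX_RETENTION_DAYS = 30  # assumed internal threshold, not a provider default


@dataclass(frozen=True)
class ProviderPolicy:
    """Terms recorded from contract review; values here are examples."""
    name: str
    training_opt_out: bool
    retention_days: int


def violations(policy: ProviderPolicy) -> list[str]:
    """Return the reasons this destination should not receive prompts."""
    problems = []
    if not policy.training_opt_out:
        problems.append("training opt-out not confirmed")
    if policy.retention_days > MAX_RETENTION_DAYS:
        problems.append(
            f"retention {policy.retention_days}d exceeds {MAX_RETENTION_DAYS}d"
        )
    return problems


enterprise = ProviderPolicy("enterprise-tenant", training_opt_out=True, retention_days=30)
consumer = ProviderPolicy("consumer-endpoint", training_opt_out=False, retention_days=365)

print(violations(enterprise))  # []
print(violations(consumer))
# ['training opt-out not confirmed', 'retention 365d exceeds 30d']
```

A gateway could refuse to forward any request whose destination returns a non-empty list.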
Layer 6 — Governance
Governance covers acceptable-use policy, access controls, training, incident response and vendor reviews. It is the unglamorous layer that decides whether any of the technical layers are configured correctly in the first place. Governance is not a control; it is the system that produces controls.
How the layers fail in isolation
- Redaction only: the model loses the ability to do useful work, and responses cannot be restored.
- Masking only: the prompt is cleaner, but the provider may still retain or train on what you did send.
- Minimization only: templates drift over time and accumulate fields that nobody re-audits.
- Retention only: nothing changes about what reaches the model.
- Provider boundaries only: contracts do not enforce themselves when a developer pastes a customer record into a prompt at 4pm.
- Governance only: a document is not a control.
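The template-drift failure in the list above is one that a small automated check can catch. This sketch uses Python's standard `string.Formatter` to list the fields a template references and compares them with an approved set; the template text and the `APPROVED` table are hypothetical.

```python
from string import Formatter

# Hypothetical record of the fields each template was approved to use.
APPROVED = {
    "refund_reply": {"order_status", "refund_amount"},
}


def template_fields(template: str) -> set[str]:
    """Return the replacement field names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def unapproved_fields(template_id: str, template: str) -> set[str]:
    """Fields the template uses that nobody has signed off on."""
    return template_fields(template) - APPROVED[template_id]


# A template that has quietly gained a field since its last review.
template = "Order is {order_status}. Refund {refund_amount} to {customer_email}."
drift = unapproved_fields("refund_reply", template)
print(drift)
# {'customer_email'}
```

Run in CI, a non-empty result turns silent drift into a review request.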
How they combine
A reasonable contemporary AI-privacy posture stacks several layers deliberately:
- Minimize fields at the prompt-template level.
- Mask remaining supported values at the gateway, with rehydration on the response.
- Send only to enterprise provider tenants with training opt-outs and short retention.
- Persist no raw payloads on the gateway path; keep sanitized observability for incident response.
- Wrap all of it in policy, access controls and periodic review.
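The first, second and fourth steps in this list can be sketched as one request path. Everything here is an assumption made for illustration: the field names, the single email detector, and `stub_model`, which stands in for a real provider call. Provider selection and policy sit outside the code.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PLACEHOLDER_RE = re.compile(r"\bEMAIL_\d+\b")
ALLOWED = {"order_status", "contact_email"}  # hypothetical template allowlist
TEMPLATE = "Order status: {order_status}. Contact: {contact_email}."


def minimize(record: dict) -> dict:
    return {key: value for key, value in record.items() if key in ALLOWED}


def mask(text: str) -> tuple[str, dict[str, str]]:
    mapping: dict[str, str] = {}  # placeholder -> original value
    seen: dict[str, str] = {}     # original value -> placeholder

    def repl(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            seen[value] = f"EMAIL_{len(seen) + 1}"
            mapping[seen[value]] = value
        return seen[value]

    return EMAIL_RE.sub(repl, text), mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    return PLACEHOLDER_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


def stub_model(prompt: str) -> str:
    """Stand-in for a provider call; it only ever sees the masked prompt."""
    match = PLACEHOLDER_RE.search(prompt)
    who = match.group(0) if match else "the customer"
    return f"Draft sent to {who}."


def handle(record: dict) -> tuple[str, str]:
    fields = minimize(record)                 # 1. send fewer fields
    prompt = TEMPLATE.format(**fields)
    masked, mapping = mask(prompt)            # 2. mask what remains
    reply = stub_model(masked)                # 3. provider sees placeholders
    return masked, rehydrate(reply, mapping)  # 4. restore; mapping is discarded


record = {
    "order_status": "shipped",
    "contact_email": "dana@example.com",
    "date_of_birth": "1990-01-01",
    "home_address": "12 Hill Rd",
}
sent, final = handle(record)
print(sent)   # Order status: shipped. Contact: EMAIL_1.
print(final)  # Draft sent to dana@example.com.
```

The date of birth and address never enter the prompt, the email never reaches the model, and nothing raw is written to storage along the way.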
Where Privian fits
Privian covers layers 2 and 4: prompt-level masking with rehydration, and zero raw retention on the gateway path. Combined with BYOK, it shapes what leaves your environment and where the trust boundary sits. It does not replace minimization, governance, or provider-level configuration; those are choices you keep.
The honest framing matters because every "single silver bullet" narrative quietly increases risk by encouraging the layers it does not cover to atrophy. Privacy is layered. Privian is one layer.
Written under our editorial principles: implementation-grounded, honest about limitations, educational first.
Frequently asked questions
- What is the difference between redaction and masking?
- Redaction removes or blacks out sensitive values, leaving a placeholder like [REDACTED] that loses identity. Masking replaces them with deterministic, type-aware placeholders such as PERSON_1 and EMAIL_1, which keep structure and reference so the model can still reason about the prompt — and the response can be restored.
- Is data minimization the same as masking?
- No. Minimization is a design choice about what to put into the prompt in the first place. Masking is a runtime control applied to whatever ended up in the prompt. Strong minimization makes masking easier; masking does not replace minimization.
- Why not just rely on the provider's retention controls?
- Provider controls govern what the provider does once it has the data. They do not change what data the provider receives. If a prompt contains a customer's full name and account number, no provider control un-sends those values.
- So what is the silver bullet?
- There isn't one. AI privacy is layered: redaction, masking, minimization, retention controls, provider boundaries (with BYOK as their operational form) and governance each address different parts of the problem. Mature programs use several at once.