PII redaction vs. PII masking

Redaction destroys data. Masking preserves structure. The choice changes what the model can do — and whether the response is usable.

6 min read · Updated May 20, 2026

Two words that mean different things

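Both techniques remove the sensitive value from view, but they differ in what survives. Redaction discards the value along with anything that distinguished it from other values. Masking swaps it for a placeholder that keeps its type and identity. The downstream consequences are very different.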

Redaction

Redaction replaces a sensitive value with a generic marker:

Original:
  "Contact Jane Doe at jane@example.com about her IBAN."

Redacted:
  "Contact [REDACTED] at [REDACTED] about her IBAN."

The data is gone. There is no way to recover it from the output. This is correct behavior for legal disclosure, public archives, or any scenario where the original value should not exist anymore.

For an LLM workload, redaction has a problem: the model loses the ability to distinguish entities. Two different customers in the same prompt both become [REDACTED]. The response is less useful, sometimes useless.
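The collapse is easy to reproduce. Below is a minimal Python sketch of redaction, not Privian's implementation. The detector is an assumption made for illustration: a simple email regex plus a hard-coded list of names (KNOWN_NAMES), standing in for whatever PII detection you actually run.

```python
import re

# Hypothetical detector for illustration only: a basic email pattern
# and a fixed list of names. Real PII detection is far more involved.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
KNOWN_NAMES = ["Jane Doe", "John Roe"]


def redact(text: str) -> str:
    """Replace every detected value with the same generic marker."""
    text = EMAIL_RE.sub("[REDACTED]", text)
    for name in KNOWN_NAMES:
        text = text.replace(name, "[REDACTED]")
    return text


prompt = (
    "Jane Doe (jane@example.com) disputes a charge "
    "from John Roe (john@example.com)."
)
redacted = redact(prompt)
print(redacted)
```

Both customers and both addresses end up as the same marker, so nothing in the redacted prompt tells the model who is disputing the charge and who issued it.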

Masking

Masking replaces a sensitive value with a typed placeholder that preserves identity:

Original:
  "Contact Jane Doe at jane@example.com about her IBAN."

Masked:
  "Contact PERSON_1 at EMAIL_1 about her IBAN."

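The model can still reason about PERSON_1's email, and a different customer in the same prompt gets a different placeholder. The gateway, which sits between your application and the model provider, holds the placeholder-to-value mapping in memory and uses it to rehydrate the response: it swaps each placeholder back to its original value, so your application gets a usable answer.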
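A minimal Python sketch of typed placeholders follows. It is an illustration under stated assumptions, not Privian's code: the combined pattern PII_RE and the two hard-coded names are hypothetical stand-ins for a real detector, and the function name mask is made up for this example.

```python
import re

# Hypothetical detector: each named group is one entity type.
# The hard-coded names stand in for real name detection.
PII_RE = re.compile(
    r"(?P<EMAIL>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
    r"|(?P<PERSON>Jane Doe|John Roe)"
)


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Return the masked text and a placeholder -> original mapping."""
    mapping: dict[str, str] = {}   # placeholder -> original value
    seen: dict[str, str] = {}      # original value -> placeholder
    counters: dict[str, int] = {}  # next index per entity type

    def to_placeholder(match: re.Match) -> str:
        kind, value = match.lastgroup, match.group(0)
        if value not in seen:
            counters[kind] = counters.get(kind, 0) + 1
            seen[value] = f"{kind}_{counters[kind]}"
            mapping[seen[value]] = value
        # A repeated value reuses its placeholder, preserving identity.
        return seen[value]

    return PII_RE.sub(to_placeholder, text), mapping


prompt = (
    "Contact Jane Doe at jane@example.com, then copy John Roe "
    "at john@example.com. Jane Doe prefers email."
)
masked, mapping = mask(prompt)
print(masked)
```

The second mention of Jane Doe maps back to PERSON_1 rather than getting a new index, which is what lets the model treat both mentions as the same person.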

When each one wins

  • Use redaction when the original value should not be recoverable — e.g. publishing a document, sharing a log with a partner, archiving a transcript.
  • Use masking when you need the response to be useful — e.g. drafting a reply, summarizing a thread, answering a question about a real customer.

What "reversible" actually means here

Masking is reversible within a single request because the gateway holds the mapping in memory. It is not reversible across requests — the mapping is discarded after each call. This is a deliberate choice. A persistent mapping would mean a persistent index of who was masked to what, which is exactly the kind of data store we want to avoid.
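One way to picture the request scope is to keep the mapping as a local variable that goes away when the request handler returns. The Python sketch below assumes exactly that. The names handle_request and fake_model are hypothetical, only emails are detected to keep it short, and the model call is a stub rather than a real provider.

```python
import re
from typing import Callable

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PLACEHOLDER_RE = re.compile(r"\bEMAIL_\d+\b")


def handle_request(prompt: str, call_model: Callable[[str], str]) -> str:
    """Mask, call the model, rehydrate. The mapping never outlives the call."""
    mapping: dict[str, str] = {}  # placeholder -> original, local to this call
    seen: dict[str, str] = {}     # original -> placeholder

    def to_placeholder(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            seen[value] = f"EMAIL_{len(seen) + 1}"
            mapping[seen[value]] = value
        return seen[value]

    masked = EMAIL_RE.sub(to_placeholder, prompt)
    reply = call_model(masked)
    # Rehydrate: swap known placeholders back; leave unknown ones untouched.
    return PLACEHOLDER_RE.sub(
        lambda m: mapping.get(m.group(0), m.group(0)), reply
    )


sent: list[str] = []  # what the stubbed provider actually received


def fake_model(masked_prompt: str) -> str:
    sent.append(masked_prompt)
    return "Draft: I will write to EMAIL_1 today."


first = handle_request("Reply to jane@example.com about the refund.", fake_model)
second = handle_request("Reply to john@example.com about the invoice.", fake_model)
print(first)
print(second)
```

EMAIL_1 resolves to a different address in each call, and once a call returns there is nothing left that links the placeholder to a person.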

How Privian fits

Privian implements masking with deterministic per-request placeholders: within a request, the same value always maps to the same placeholder. The data leaves your perimeter as placeholders, the provider responds in placeholders, and the gateway rehydrates the response before handing it back. See Rehydration for the return-path detail.

Frequently asked questions

Which one should I use for an LLM workload?
Masking with deterministic placeholders, in almost every case. Redaction is appropriate when you need to publish or share data and will never need the original back.

Is masking reversible?
Within a single request, yes: the gateway holds the mapping in memory and uses it to rehydrate the response. Across requests, no. The mapping is discarded after each call.

Does redaction help with compliance?
It helps with disclosure scenarios where you genuinely want the data gone. For active workloads where you need the response to mean something, redaction strips the distinctions the model relies on, and the output is often too vague to use.