PII redaction vs. PII masking

Two words that mean different things

Redaction destroys data. Masking preserves structure. Both remove the sensitive value from view, but the downstream consequences are very different.

Redaction

Redaction replaces a sensitive value with a generic marker:

Original:
  "Contact Jane Doe at jane@example.com about her IBAN."

Redacted:
  "Contact [REDACTED] at [REDACTED] about her [REDACTED]."

The data is gone. There is no way to recover it from the output. This is correct behavior for legal disclosure, public archives, or any scenario where the original value should not exist anymore.

For an LLM workload, redaction has a problem: the model loses the ability to distinguish entities. Two different customers in the same prompt both become [REDACTED]. The response is less useful, sometimes useless.
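The loss of entity distinction is easy to see in a minimal Python sketch. The email regex, the `KNOWN_NAMES` list, and the `redact` function are illustrative assumptions standing in for a real detector; they are not part of any product API.

```python
import re

# Hypothetical detector: a simple email regex plus a fixed list of names.
# A real system would use a proper PII detector; this is only for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
KNOWN_NAMES = ["Jane Doe", "John Roe"]


def redact(text: str) -> str:
    """Replace every detected value with the same generic marker."""
    text = EMAIL_RE.sub("[REDACTED]", text)
    for name in KNOWN_NAMES:
        text = text.replace(name, "[REDACTED]")
    return text


prompt = (
    "Jane Doe (jane@example.com) disputes a charge; "
    "John Roe (john@example.com) approved it."
)
print(redact(prompt))
# [REDACTED] ([REDACTED]) disputes a charge; [REDACTED] ([REDACTED]) approved it.
```

After redaction, nothing in the text tells the model who disputed the charge and who approved it.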

Masking

Masking replaces a sensitive value with a typed placeholder that preserves identity:

Original:
  "Contact Jane Doe at jane@example.com about her IBAN."

Masked:
  "Contact PERSON_1 at EMAIL_1 about her IBAN_1."

The model can still reason about PERSON_1's email, and a different customer in the same prompt gets a different placeholder. The mapping is held by the gateway in memory and used to rehydrate the response, so your application gets a usable answer back.
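As a rough sketch of the idea, the Python below masks values with typed, numbered placeholders and then rehydrates a model reply from the mapping. The detector (an email regex and a `KNOWN_NAMES` list) and the function names `mask` and `rehydrate` are assumptions made for this example, not the gateway's actual implementation.

```python
import re

# Hypothetical detector, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
KNOWN_NAMES = ["Jane Doe", "John Roe"]
PLACEHOLDER_RE = re.compile(r"\b[A-Z]+_\d+\b")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Return the masked text and a placeholder -> original mapping."""
    mapping: dict[str, str] = {}   # placeholder -> original value
    seen: dict[str, str] = {}      # original value -> placeholder
    counters: dict[str, int] = {}  # next index per placeholder type

    def placeholder_for(kind: str, value: str) -> str:
        # The same value always gets the same placeholder within this call.
        if value not in seen:
            counters[kind] = counters.get(kind, 0) + 1
            token = f"{kind}_{counters[kind]}"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    text = EMAIL_RE.sub(lambda m: placeholder_for("EMAIL", m.group(0)), text)
    for name in KNOWN_NAMES:
        if name in text:
            text = text.replace(name, placeholder_for("PERSON", name))
    return text, mapping


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Swap known placeholders back to their original values."""
    return PLACEHOLDER_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


prompt = (
    "Jane Doe (jane@example.com) disputes a charge that "
    "John Roe (john@example.com) approved."
)
masked, mapping = mask(prompt)
print(masked)
# PERSON_1 (EMAIL_1) disputes a charge that PERSON_2 (EMAIL_2) approved.

reply = "Tell PERSON_1 that PERSON_2 will follow up at EMAIL_1."
print(rehydrate(reply, mapping))
# Tell Jane Doe that John Roe will follow up at jane@example.com.
```

Unlike the redacted version, the masked prompt still distinguishes the two people and ties each email to a separate placeholder.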

When each one wins

Use redaction when the original value should not be recoverable — e.g. publishing a document, sharing a log with a partner, archiving a transcript.
Use masking when you need the response to be useful — e.g. drafting a reply, summarizing a thread, answering a question about a real customer.

What "reversible" actually means here

Masking is reversible within a single request because the gateway holds the mapping in memory. It is not reversible across requests — the mapping is discarded after each call. This is a deliberate choice. A persistent mapping would mean a persistent index of who was masked to what, which is exactly the kind of data store we want to avoid.
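One way to picture the request-scoped mapping is a handler that keeps the mapping in a local variable. The sketch below is an assumption-laden illustration: `handle_request`, `fake_model`, and the email-only detector are hypothetical names invented for this example.

```python
import re
from typing import Callable

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")  # hypothetical detector


def handle_request(prompt: str, call_model: Callable[[str], str]) -> str:
    """Mask, call the model, rehydrate. The mapping never leaves this call."""
    mapping: dict[str, str] = {}  # placeholder -> original, local to this request

    def to_placeholder(match: re.Match) -> str:
        value = match.group(0)
        for token, original in mapping.items():
            if original == value:
                return token  # reuse the placeholder for a repeated value
        token = f"EMAIL_{len(mapping) + 1}"
        mapping[token] = value
        return token

    masked = EMAIL_RE.sub(to_placeholder, prompt)
    reply = call_model(masked)
    # Rehydrate, then return; the mapping goes out of scope and is not stored.
    return re.sub(
        r"\bEMAIL_\d+\b",
        lambda m: mapping.get(m.group(0), m.group(0)),
        reply,
    )


seen_by_provider: list[str] = []


def fake_model(masked_prompt: str) -> str:
    """Stand-in for the LLM provider; records what it was sent."""
    seen_by_provider.append(masked_prompt)
    return "I will write to EMAIL_1 today."


first = handle_request("Email jane@example.com", fake_model)
second = handle_request("Email john@example.com", fake_model)

print(seen_by_provider)  # ['Email EMAIL_1', 'Email EMAIL_1']
print(first)             # I will write to jane@example.com today.
print(second)            # I will write to john@example.com today.
```

In this sketch `EMAIL_1` refers to a different person in each request, so a placeholder captured from one call cannot be resolved once that call has returned.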

How Privian fits

Privian implements masking with deterministic per-request placeholders. Sensitive values leave your perimeter as placeholders, the provider responds using the same placeholders, and the gateway rehydrates the response before handing it back. See Rehydration for the return-path detail.


Frequently asked questions

Which one should I use for an LLM workload?

Masking with deterministic placeholders, in almost every case. Redaction is appropriate when you need to publish or share data and will never need the original back.

Is masking reversible?

Within a single request, yes: the gateway holds the mapping in memory and uses it to rehydrate the response. Across requests, no: the mapping is discarded after each call.

Does redaction help with compliance?

It helps with disclosure scenarios where you genuinely want the data gone. For active workloads where you need the response to mean something, redaction often makes the output unusable.