Architecture

Privian architecture

How Privian masks sensitive data before prompts reach LLM providers and rehydrates responses before returning them to your application.


Overview

A privacy-first LLM gateway

Privian is a privacy-first LLM gateway that sits between your application and LLM providers. Every prompt passes through one masking hop before egress: detected sensitive values are replaced with deterministic placeholders, the masked prompt is sent to the provider, and the response is rehydrated in memory before it returns to your application. The provider never sees the original values of detected entities.

The provider receives the masked prompt; the request-scoped mapping remains inside the gateway and is discarded after rehydration.

Request flow

End-to-end request flow

Prompt path through a privacy-first gateway. Original values never cross the BYOK boundary.

Framework

Gateway request sequence

01 Validate: Authenticate the request and apply schema, size, rate and quota checks.
02 Detect: Identify supported sensitive entities in prompt content.
03 Mask: Replace detected values with request-scoped placeholders.
04 Route: Call the selected provider with the customer's BYOK credential.
05 Rehydrate: Restore mapped values before returning the response.

The flow is fixed and ordered: validation → detection → masking → routing → provider call → rehydration. Requests that fail validation, authentication, quota or rate-limit checks are rejected before any provider call is made.
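The fixed ordering can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Privian's orchestrator: the handle function, the GatewayError class, the call_provider callback and the single email pattern are all hypothetical stand-ins, and the real validation step also covers auth, size, rate and quota checks that are omitted here.

```python
import re

# Hypothetical email pattern standing in for the real detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
TOKEN = re.compile(r"\bEMAIL_\d+\b")


class GatewayError(Exception):
    """Raised when a request is rejected before any provider call."""


def handle(request, call_provider):
    steps = []

    # 01 Validate: reject bad request shapes before anything else runs.
    steps.append("validate")
    prompt, model = request.get("prompt"), request.get("model")
    if not isinstance(prompt, str) or not prompt.strip():
        raise GatewayError("prompt must be a non-empty string")
    if not isinstance(model, str) or "/" not in model:
        raise GatewayError("model must be a namespaced identifier")

    # 02 Detect: collect distinct sensitive values in order of appearance.
    steps.append("detect")
    values = list(dict.fromkeys(EMAIL.findall(prompt)))

    # 03 Mask: swap each value for its request-scoped placeholder.
    steps.append("mask")
    token_map = {f"EMAIL_{i}": v for i, v in enumerate(values, start=1)}
    by_value = {v: t for t, v in token_map.items()}
    masked = EMAIL.sub(lambda m: by_value[m.group(0)], prompt)

    # 04 Route: only the masked prompt is handed to the provider call.
    steps.append("route")
    provider_text = call_provider(model, masked)

    # 05 Rehydrate: restore originals; unknown tokens are left untouched.
    steps.append("rehydrate")
    restored = TOKEN.sub(
        lambda m: token_map.get(m.group(0), m.group(0)), provider_text
    )
    return restored, steps
```

Because validation raises before the route step is reached, a rejected request never invokes call_provider in this sketch.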

Provider view

What the LLM provider sees

Original prompt (from your app)

Summarize the ticket: Customer Michael Olsen
emailed michael@example.com about ticket #4821
and mentioned card 4111 1111 1111 1111.

Masked prompt (sent to provider)

Summarize the ticket: Customer PERSON_1
emailed EMAIL_1 about ticket #4821
and mentioned card CREDIT_CARD_1.

Placeholders are deterministic within a request — the same value reuses the same token — so the model can still reason about structure and reference, but never sees the underlying identifier.
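A minimal sketch of deterministic, type-aware placeholders within one request follows. The mask function and both regular expressions are assumptions made for illustration; they are deliberately simple and do not reflect Privian's detector. PERSON detection is left out because it needs more than a pattern match.

```python
import re

# Hypothetical, simplified patterns; a production detector would be stricter.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CREDIT_CARD": re.compile(r"\b\d{4}(?: \d{4}){3}\b"),
}


def mask(prompt):
    """Return (masked_prompt, token_map) for a single request."""
    token_map = {}  # placeholder -> original value
    by_value = {}   # (entity type, original value) -> placeholder
    counters = {}   # entity type -> last index issued

    def replacer_for(entity_type):
        def replace(match):
            key = (entity_type, match.group(0))
            if key not in by_value:
                # First sighting of this value: issue the next numbered token.
                counters[entity_type] = counters.get(entity_type, 0) + 1
                token = f"{entity_type}_{counters[entity_type]}"
                by_value[key] = token
                token_map[token] = match.group(0)
            # Repeated values reuse the token issued earlier.
            return by_value[key]
        return replace

    masked = prompt
    for entity_type, pattern in PATTERNS.items():
        masked = pattern.sub(replacer_for(entity_type), masked)
    return masked, token_map
```

In this sketch a second mention of michael@example.com in the same prompt maps to EMAIL_1 again, which is what lets the model keep track of references without seeing the address.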

Rehydration

Restoring values on the response

When the provider returns its response, Privian replaces every placeholder with the corresponding original value using the per-request token map. The map is held in an in-memory EntityStore scoped to that single request and is discarded once rehydration completes. There is no cross-request reuse and nothing is persisted.
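Rehydration with a request-scoped map might look like the sketch below. Only the EntityStore name comes from the text above; its methods (add, rehydrate, discard) and the rehydrate_response helper are assumed for illustration.

```python
import re

# Matches placeholders such as PERSON_1 or CREDIT_CARD_12 as whole tokens.
PLACEHOLDER = re.compile(r"\b[A-Z][A-Z_]*_\d+\b")


class EntityStore:
    """Hypothetical in-memory token map that lives for one request."""

    def __init__(self):
        self._map = {}

    def add(self, token, value):
        self._map[token] = value

    def rehydrate(self, text):
        # Whole-token matching keeps PERSON_1 from corrupting PERSON_10.
        # Tokens that are not in the map are left as they are.
        return PLACEHOLDER.sub(
            lambda m: self._map.get(m.group(0), m.group(0)), text
        )

    def discard(self):
        self._map.clear()

    def __len__(self):
        return len(self._map)


def rehydrate_response(provider_text, store):
    """Restore original values, then drop the map even if an error occurs."""
    try:
        return store.rehydrate(provider_text)
    finally:
        store.discard()
```

Clearing the map in a finally block is one way to make the discard step unconditional, so the mapping does not outlive the request when rehydration fails partway.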

BYOK

BYOK and provider routing

Privian uses a Bring Your Own Key model. You add provider credentials in the dashboard; Privian routes each request to the right provider based on a namespaced model identifier:

openai/gpt-5.5 → routed to OpenAI with your OpenAI key
anthropic/claude-sonnet-4-5 → routed to Anthropic with your Anthropic key
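Resolving a namespaced identifier to a provider and a credential can be sketched as follows. The resolve function, the RoutingError class and the shape of the credentials dictionary are hypothetical, and the google and deepseek namespace strings are assumed from the provider names rather than taken from the source.

```python
# Namespaces assumed from the supported provider list.
SUPPORTED_PROVIDERS = {"openai", "anthropic", "google", "deepseek"}


class RoutingError(Exception):
    """Raised when a model id cannot be routed; no provider is contacted."""


def resolve(model_id, credentials):
    """Split 'provider/model' and look up the customer's key for it."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise RoutingError(f"not a namespaced model id: {model_id!r}")
    if provider not in SUPPORTED_PROVIDERS:
        raise RoutingError(f"unknown provider: {provider!r}")
    if provider not in credentials:
        raise RoutingError(f"no BYOK credential stored for {provider!r}")
    return provider, model, credentials[provider]
```

Every failure path raises instead of falling back to a default provider, so an unknown or malformed identifier is refused rather than forwarded.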

Provider credentials are encrypted with AES-256-GCM at rest. The plaintext key is discarded immediately after encryption. Only safe metadata leaves the server — the last four characters and a non-reversible HMAC fingerprint — never ciphertext, IV, or plaintext.
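The safe metadata described above can be derived with the standard library alone. In this sketch the credential_metadata function, the field names and the server-side fingerprint_secret are assumptions; the choice of SHA-256 as the HMAC digest is also assumed, since the source only says the fingerprint is a non-reversible HMAC. The AES-256-GCM encryption step is not shown.

```python
import hashlib
import hmac


def credential_metadata(plaintext_key: str, fingerprint_secret: bytes) -> dict:
    """Return only the values that are safe to send back to the dashboard."""
    fingerprint = hmac.new(
        fingerprint_secret,
        plaintext_key.encode("utf-8"),
        hashlib.sha256,  # assumed digest; the source does not name one
    ).hexdigest()
    # No ciphertext, IV or plaintext appears in the returned metadata.
    return {"last4": plaintext_key[-4:], "fingerprint": fingerprint}
```

A keyed fingerprint lets the dashboard show whether two stored credentials are the same key without revealing either one, and without letting an outsider test guesses against a plain hash.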

Data handling

What is stored, and what isn't

Data type	Used for	Stored?	Notes
Raw prompts	Detection & masking in-memory	No	Discarded after the request returns.
Masked prompts	Sent to the LLM provider	No	This is the only prompt form the provider sees.
Token map (placeholder → value)	Rehydration	No	In-memory EntityStore, scoped to a single request.
Rehydrated responses	Returned to the client	No	Not persisted by the gateway.
BYOK provider credentials	Outbound provider call	Yes — encrypted (AES-256-GCM) at rest	Plaintext is discarded immediately after encryption. Metadata exposes last4 + fingerprint only.
Gateway API keys	Authenticating your requests	Yes — SHA-256 hash only	Plaintext keys are shown once at creation and never recoverable.
Usage / observability events	Counters, latencies, rollups	Yes — sanitized, no payload content	Entity counts, model, status, timing. No raw text, no entity values.
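As an illustration of the "SHA-256 hash only" row, the sketch below creates a gateway key, keeps only its hash, and verifies a presented key against that hash. The function names and the pk_ prefix are hypothetical; only the use of SHA-256 comes from the table.

```python
import hashlib
import hmac
import secrets


def create_gateway_key():
    """Return (plaintext, stored_hash); the plaintext is shown once, never stored."""
    plaintext = "pk_" + secrets.token_urlsafe(32)  # prefix is illustrative
    stored_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, stored_hash


def verify_gateway_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Since only the hash is kept, a lost key cannot be recovered from storage; it has to be replaced with a new one.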

Detection

Supported masking types

The current detector covers the following entity types out of the box:

PERSON: Personal names
EMAIL: Email addresses
PHONE: Phone numbers
IP_ADDRESS: IPv4 and IPv6 addresses
CREDIT_CARD: Payment card numbers (Luhn-validated)
IBAN: International bank account numbers
SSN_US: US Social Security numbers
SIN_CA: Canadian Social Insurance Numbers
JWT: JSON Web Tokens
OPENAI_API_KEY: OpenAI API keys
GITHUB_TOKEN: GitHub personal/access tokens
AWS_ACCESS_KEY_ID: AWS access key IDs
AWS_SECRET_ACCESS_KEY: AWS secret access keys
GENERIC_API_KEY: Generic API-key patterns
ENV_SECRET: Env-style secret assignments
SECRET_TOKEN: Other secret-like tokens

Norwegian fødselsnummer is not yet supported. Custom user-defined entity types are not yet supported.
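The Luhn validation mentioned for CREDIT_CARD is a standard checksum that helps a detector avoid masking arbitrary digit runs. The sketch below is a generic implementation, not Privian's code; the luhn_valid name and the 12 to 19 digit length bounds are assumptions.

```python
def luhn_valid(number: str) -> bool:
    """Check a card-like number with the Luhn checksum."""
    compact = number.replace(" ", "").replace("-", "")
    if not compact.isdigit() or not 12 <= len(compact) <= 19:
        return False  # assumed length bounds for card-like numbers
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, char in enumerate(reversed(compact)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

The test number from the example prompt, 4111 1111 1111 1111, passes this check, while changing its last digit to 2 makes it fail.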

Transparency

Current beta limitations

No OpenAI messages[] API: Privian accepts a flat prompt string, not OpenAI-style messages arrays.
No OpenAI SDK drop-in: Integrate via a small JSON contract over HTTPS. SDK compatibility is not yet exposed.
No tool / function calling: Tool and function calling are not supported in the current beta.
No native provider streaming: stream: true uses artificial chunking over the already-rehydrated response.
No custom entity types yet: The detector covers a fixed set of built-in entity types.
No Norwegian fødselsnummer: Planned, not yet implemented.
No prompt-injection claim: Privian does not currently claim to block prompt injection or jailbreaks.
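The artificial chunking behind stream: true can be pictured as slicing a response that is already complete. This sketch is an assumption about the general idea only; the chunk_response name and the fixed chunk size are hypothetical, and the actual chunk boundaries and wire format are not described in the source.

```python
def chunk_response(rehydrated_text: str, chunk_size: int = 16):
    """Yield fixed-size slices of a response that was rehydrated in full."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rehydrated_text), chunk_size):
        yield rehydrated_text[start:start + chunk_size]
```

Because the whole response is rehydrated before it is sliced, no placeholder can be split across two chunks, but the client also receives nothing until the provider call has finished.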

Architecture decisions

Why the architecture looks like this

A short, opinionated list of the structural decisions behind Privian, with reasoning grounded in the implementation.

Masking happens before the model call

Once a prompt has left the gateway, the provider has seen it. Any masking applied after egress is theatre. The orchestrator enforces a fixed order — validate, detect, mask, route — so there is no code path where a raw prompt reaches a provider.

Rehydration happens inside the gateway, not the client

Doing rehydration server-side means the client never needs the token map, the map never crosses a network boundary, and the request-scoped EntityStore can be discarded the moment the response is returned.

BYOK was chosen over pooled keys

Pooled keys make Privian a subprocessor of every customer's provider relationship. BYOK keeps the contract, the billing and the retention terms with the customer, which is what enterprise reviews increasingly require.

Provider-agnostic routing, not vendor lock-in

Models are addressed by namespaced identifiers (openai/…, anthropic/…). Switching providers is a model-id change, not a rewrite. Privian is a routing layer, not a re-aggregator.

Zero raw retention by design

The observability path runs every event through a sanitizer that drops payload-shaped fields and clips long strings before any sink writes it. The system cannot accidentally store a prompt because the sinks never receive one.
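A sanitizer of this kind can be sketched as a filter that runs before any sink. The field names in PAYLOAD_FIELDS, the 64-character clip length and the sanitize_event function are all hypothetical; the source states only that payload-shaped fields are dropped and long strings are clipped.

```python
# Hypothetical names for fields that could carry prompt or entity content.
PAYLOAD_FIELDS = {"prompt", "masked_prompt", "response", "token_map", "entities"}
MAX_STRING_LENGTH = 64  # assumed clip length


def sanitize_event(event: dict) -> dict:
    """Return a copy of the event that is safe to hand to a sink."""
    clean = {}
    for key, value in event.items():
        if key in PAYLOAD_FIELDS:
            continue  # payload-shaped fields never reach a sink
        if isinstance(value, str):
            clean[key] = value[:MAX_STRING_LENGTH]
        elif value is None or isinstance(value, (bool, int, float)):
            clean[key] = value
        # Nested containers and other types are dropped rather than inspected.
    return clean
```

Dropping anything that is not a scalar keeps the filter conservative: a new field type has to be allowed explicitly before it can be written.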

Fail-closed on uncertainty

Unsupported request shapes, unknown models, and validation failures return an error before any provider is contacted. The default behavior on ambiguity is to refuse, not to forward.

LiteLLM in an execution-only role

Routing decisions, model resolution and credential lookup happen in Privian. LiteLLM handles the outbound HTTP call. Splitting concerns keeps the security-relevant logic in one place.

Deterministic placeholders over random tokens

Deterministic, type-aware placeholders (PERSON_1, EMAIL_2) preserve structure and reference for the model. Random tokens would lose the relationship between repeated mentions of the same entity in the same prompt.

FAQ

Frequently asked questions

What does the LLM provider actually see?
The provider receives the masked prompt only: sensitive values are replaced with deterministic placeholders such as PERSON_1 and EMAIL_1 before the outbound provider call. Original values are restored in memory inside the gateway before the response is returned to your application.

Does Privian store raw prompts or responses?
No. Raw prompts, raw entity values, the per-request token map, and rehydrated responses are not persisted. Only sanitized observability counters, rollup metrics, hashed gateway API keys and encrypted BYOK credentials are stored.

How is the placeholder mapping handled?
Mappings are held in an in-memory EntityStore scoped to a single request and discarded once the response is rehydrated. There is no cross-request reuse and nothing is persisted.

Which providers and models are supported?
OpenAI, Anthropic, Google and DeepSeek via BYOK. Models are addressed using provider-namespaced identifiers such as openai/gpt-5.5 or anthropic/claude-sonnet-4-5.
