Architecture

The Privian gateway is a single endpoint that runs auth, validation, deterministic masking, provider routing, and rehydration in a fixed order. This page describes that order and what crosses each boundary.

Request flow

client app
   │
   ▼
POST https://api.privian.io/v1/gateway        ← raw prompt enters trust boundary
   │
   ▼
[1] body cap (default 256 KB)
[2] auth (sha256 lookup of gateway API key)
[3] request schema validation (prompt, model, stream)
[4] model allowlist check
[5] rate limit + quota
   │
   ▼  (raw prompt still in memory only)
[6] deterministic PII / secret detector
[7] mask → placeholders (EMAIL_1, PERSON_1, …) + per-request token map
   │
   ▼  (only the masked prompt leaves the trust boundary)
[8] provider router → BYOK credential resolved server-side
[9] provider call (OpenAI / Anthropic / Google / …)
   │
   ▼
[10] rehydrate placeholders in provider response
[11] discard token map
   │
   ▼
client app                                    ← rehydrated response + diagnostic meta
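
Steps [6], [7], [10], and [11] can be illustrated with a minimal sketch. It assumes a single email regex as the detector; the real detector presumably covers more entity types and secrets. The names `mask`, `rehydrate`, `EMAIL_RE`, and `PLACEHOLDER_RE` are hypothetical and not part of the Privian codebase.

```python
import re

# ASSUMPTION: a single email pattern stands in for the full PII / secret detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Placeholders follow the TYPE_N shape shown in the flow (EMAIL_1, PERSON_1, ...).
PLACEHOLDER_RE = re.compile(r"\b[A-Z]+_\d+\b")


def mask(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with EMAIL_n and build the per-request token map."""
    token_map: dict[str, str] = {}  # placeholder -> raw value
    seen: dict[str, str] = {}       # raw value -> placeholder

    def replace(match: re.Match[str]) -> str:
        raw = match.group(0)
        if raw not in seen:
            placeholder = f"EMAIL_{len(seen) + 1}"
            seen[raw] = placeholder
            token_map[placeholder] = raw
        # The same raw value always maps to the same placeholder.
        return seen[raw]

    return EMAIL_RE.sub(replace, prompt), token_map


def rehydrate(text: str, token_map: dict[str, str]) -> str:
    """Swap known placeholders back to raw values; leave unknown ones untouched."""
    return PLACEHOLDER_RE.sub(lambda m: token_map.get(m.group(0), m.group(0)), text)
```

In this sketch the token map is an ordinary local dictionary, so "discard token map" amounts to letting it go out of scope once the response has been rehydrated.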

Trust boundaries

  • Enters Privian: the raw prompt and your gateway API key (over TLS).
  • Leaves Privian toward providers: the masked prompt and the relevant decrypted BYOK provider key. Never the raw prompt.
  • Returns to client: the rehydrated response plus a sanitized meta block.
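
As an illustration of what enters the boundary, the sketch below builds (but does not send) a gateway request. The URL and the `prompt`, `model`, and `stream` fields come from the request flow above. The `Authorization: Bearer` header and the `build_gateway_request` helper are assumptions; check the API reference for the actual auth header.

```python
import json
import urllib.request

GATEWAY_URL = "https://api.privian.io/v1/gateway"  # endpoint from the request flow


def build_gateway_request(
    api_key: str, prompt: str, model: str, stream: bool = False
) -> urllib.request.Request:
    """Build a POST request carrying the raw prompt and the gateway API key."""
    body = json.dumps({"prompt": prompt, "model": model, "stream": stream})
    return urllib.request.Request(
        GATEWAY_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: bearer-style auth header; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it over TLS, since the URL uses `https`.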

Layering

  • _core — orchestrator, masking, rehydration, provider routing abstractions. Portable, no host dependencies.
  • _transport — auth, validation, rate limit, quota, BYOK credential resolution, observability outbox.
  • Composition root — wires concrete sinks and adapters; no business logic.
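
One way to picture this layering is a core that depends only on an abstract port, with the composition root supplying the concrete adapter. This is a minimal sketch under that assumption; `ProviderPort`, `Orchestrator`, `EchoProvider`, and `build_gateway` are hypothetical names, not Privian's actual types.

```python
from dataclasses import dataclass
from typing import Protocol


class ProviderPort(Protocol):
    """Abstraction owned by the core layer; carries no host or network details."""

    def complete(self, masked_prompt: str, model: str) -> str: ...


@dataclass
class Orchestrator:
    """Core: depends only on the port, never on a concrete adapter."""

    provider: ProviderPort

    def handle(self, masked_prompt: str, model: str) -> str:
        return self.provider.complete(masked_prompt, model)


class EchoProvider:
    """Stand-in adapter for illustration; a real one would call a provider API."""

    def complete(self, masked_prompt: str, model: str) -> str:
        return f"[{model}] {masked_prompt}"


def build_gateway() -> Orchestrator:
    """Composition root: wiring only, no business logic."""
    return Orchestrator(provider=EchoProvider())
```

Because the core sees only `ProviderPort`, a test can swap in a stub adapter without touching the orchestrator.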

What is stored

See Zero retention. In summary:

  • Stored: SHA-256 of gateway keys, encrypted BYOK credentials, sanitized observability events, rollup metrics.
  • Not stored: raw prompts, raw entity values, token maps, rehydrated responses, plaintext provider keys.
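
Storing only the SHA-256 of a gateway key means auth can be a lookup by digest, without the plaintext key ever being persisted. A minimal sketch, assuming an in-memory dictionary in place of the real key store; `hash_gateway_key`, `authenticate`, and the tenant identifier are hypothetical.

```python
import hashlib
from typing import Optional


def hash_gateway_key(api_key: str) -> str:
    """Return the hex SHA-256 digest of a gateway API key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def authenticate(presented_key: str, stored_hashes: dict[str, str]) -> Optional[str]:
    """Look up the presented key by digest; return the tenant id or None.

    ASSUMPTION: `stored_hashes` maps digest -> tenant id and stands in for
    whatever table the gateway actually uses.
    """
    return stored_hashes.get(hash_gateway_key(presented_key))
```

For example, a store created with `{hash_gateway_key("gw_live_example"): "tenant-a"}` resolves that key to `"tenant-a"` and any other key to `None`.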

Failure handling

  • Auth, validation, quota, and rate-limit errors fail the request before any provider call is made.
  • Provider errors may trigger a single bounded fallback to an eligible secondary model. Fallback usage is reflected in meta.fallbackUsed.
  • Observability sink failures never affect request outcome.
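
The fallback and sink rules above can be sketched as follows. This assumes provider adapters raise a dedicated error type and that eligibility of the secondary model has already been decided; `ProviderError`, `call_with_fallback`, and `emit_event` are hypothetical names. Only `fallbackUsed` is taken from the meta block described on this page.

```python
from typing import Callable, Optional

# (masked_prompt, model) -> response text
ProviderCall = Callable[[str, str], str]


class ProviderError(Exception):
    """Hypothetical error type raised by a provider adapter."""


def call_with_fallback(
    call: ProviderCall,
    masked_prompt: str,
    primary: str,
    secondary: Optional[str] = None,
) -> tuple[str, dict[str, bool]]:
    """Try the primary model; on ProviderError, try the secondary at most once."""
    try:
        return call(masked_prompt, primary), {"fallbackUsed": False}
    except ProviderError:
        if secondary is None:
            raise
        # Single bounded fallback: an error here propagates, with no further retries.
        return call(masked_prompt, secondary), {"fallbackUsed": True}


def emit_event(sink: Callable[[dict], None], event: dict) -> None:
    """Best-effort observability: a failing sink never affects the request."""
    try:
        sink(event)
    except Exception:
        pass
```

Keeping the sink call in its own guarded function keeps observability failures separate from the request outcome.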

Beta limitations

  • Native provider streaming is not exposed; stream: true splits the already-rehydrated response into artificial chunks, so chunks are produced only after the full provider response has been received and rehydrated.
  • Self-service gateway API key creation is in beta; some operator setup is still manual.
  • No durable outbox relay yet — sanitized events live in the gateway outbox table and rollups.
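
Artificial chunking for stream: true can be pictured as slicing the finished response into fixed-size pieces. A minimal sketch; the `chunk_response` name and the default chunk size are assumptions, and the gateway's actual chunk boundaries may differ.

```python
from typing import Iterator


def chunk_response(rehydrated: str, chunk_size: int = 16) -> Iterator[str]:
    """Yield the already-rehydrated response in fixed-size slices."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rehydrated), chunk_size):
        yield rehydrated[start:start + chunk_size]
```

Joining the chunks in order reproduces the full response, so a client that concatenates them sees the same text as with stream: false.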

Related