API Reference
The Privian gateway exposes one primary endpoint. This page documents every field of the request and response.
Base URL
All requests target a single base URL:
https://api.privian.io/v1/gateway
Authentication
Every request requires a gateway API key. Pass it as a bearer token or in the x-api-key header. See Authentication.
Authorization: Bearer sk-gw_live_<random>
- Keys start with sk-gw_live_ (production) or sk-gw_test_ (non-production).
- Privian stores only sha256(key).
- Missing or revoked keys return 401 unauthorized.
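The two header options above can be sketched in Python as follows. This is a minimal illustration, not official client code: the helper names `auth_headers` and `is_production_key` are hypothetical, and the prefix check is only a client-side convenience based on the key prefixes listed above.

```python
from typing import Dict

# Key prefixes as documented in the Authentication section.
KEY_PREFIXES = ("sk-gw_live_", "sk-gw_test_")


def auth_headers(api_key: str, use_bearer: bool = True) -> Dict[str, str]:
    """Return the authentication header for a gateway request.

    Hypothetical helper: sends the key either as a bearer token or
    in the x-api-key header.
    """
    if not api_key.startswith(KEY_PREFIXES):
        raise ValueError("expected a key starting with sk-gw_live_ or sk-gw_test_")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"x-api-key": api_key}


def is_production_key(api_key: str) -> bool:
    """True for sk-gw_live_ keys, False otherwise."""
    return api_key.startswith("sk-gw_live_")
```

Failing fast on a malformed key avoids a round trip that would only come back as 401 unauthorized.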
Gateway endpoint
POST to the base URL with a JSON body. Content-Type must be application/json.
curl -sS -X POST https://api.privian.io/v1/gateway \
  -H "Authorization: Bearer $PRIVIAN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Email john@acme.com about Friday.","model":"openai/gpt-5.5"}'
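For readers who prefer Python over curl, the same request might be built with the standard library as sketched below. The function names `build_gateway_request` and `send` are hypothetical, and the 30-second timeout is an arbitrary choice, not a documented value.

```python
import json
import urllib.request

GATEWAY_URL = "https://api.privian.io/v1/gateway"


def build_gateway_request(prompt: str, model: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a real network call, so it is left commented out):
#   import os
#   req = build_gateway_request(
#       "Email john@acme.com about Friday.",
#       "openai/gpt-5.5",
#       os.environ["PRIVIAN_API_KEY"],
#   )
#   print(send(req)["response"])
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses, so error bodies have to be read from the exception.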
Request body
| Field | Type | Required | Notes |
|---|---|---|---|
| prompt | string | yes | Max 32 KiB characters by default. |
| model | string | yes | Provider-namespaced ID from the catalog (see Models). |
| stream | boolean | no | If true, returns the rehydrated response as text/event-stream. |
Request body is capped at 256 KB by default; oversized requests return 413 payload_too_large.
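A client can check these limits before sending. The sketch below is illustrative only: `check_request` is a hypothetical helper, "32 KiB characters" is read as 32,768 characters, and "256 KB" is read conservatively as 256,000 bytes. Both readings are assumptions, and the server remains the authority on the actual limits.

```python
import json

MAX_PROMPT_CHARS = 32 * 1024  # assumption: "32 KiB characters" = 32,768 characters
MAX_BODY_BYTES = 256 * 1000   # assumption: "256 KB" read conservatively as 256,000 bytes


def check_request(prompt: str, model: str, stream: bool = False) -> bytes:
    """Validate the fields client-side and return the encoded JSON body."""
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the default character limit")
    if "/" not in model:
        raise ValueError("model must be provider-namespaced, e.g. openai/gpt-5.5")

    payload = {"prompt": prompt, "model": model}
    if stream:
        payload["stream"] = True  # optional field, sent only when requested

    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body would be rejected with 413 payload_too_large")
    return body
```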
Response body
Successful responses return JSON:
{
  "response": "Sent. I'll email John at john@acme.com about Friday.",
  "model": "openai/gpt-5.5",
  "meta": {
    "providerId": "litellm",
    "executionPath": "primary",
    "fallbackUsed": false,
    "timedOut": false,
    "entitiesDetected": 2,
    "latency": {
      "totalMs": 612,
      "maskingMs": 3,
      "providerMs": 601,
      "rehydrationMs": 1
    },
    "fallback": { "used": false, "reason": "not_attempted" }
  }
}
The meta block is diagnostic only. It never contains raw prompt content or raw entity values.
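One way to consume this response is to pull the fields of interest into a small typed record. The sketch below assumes the field names shown in the sample above; the `GatewayResult` class and the derived `overhead_ms` value (total latency minus provider latency) are illustrative additions, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayResult:
    text: str
    model: str
    fallback_used: bool
    entities_detected: int
    total_ms: int
    overhead_ms: int  # derived: totalMs - providerMs


def parse_gateway_response(raw: str) -> GatewayResult:
    """Parse a successful gateway response body into a GatewayResult."""
    doc = json.loads(raw)
    meta = doc["meta"]
    latency = meta["latency"]
    return GatewayResult(
        text=doc["response"],
        model=doc["model"],
        fallback_used=meta["fallbackUsed"],
        entities_detected=meta["entitiesDetected"],
        total_ms=latency["totalMs"],
        overhead_ms=latency["totalMs"] - latency["providerMs"],
    )
```

For the sample response above, `overhead_ms` works out to 612 - 601 = 11.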
Headers
- x-request-id: present on every response (success and error). Quote it in bug reports.
- Retry-After: present on 429 responses, in seconds.
Models
Send a real, provider-namespaced model ID. Examples:
- openai/gpt-5.5
- openai/gpt-4o-mini
- anthropic/claude-sonnet-4.5
- google/gemini-2.5-pro
Unknown or disabled IDs are rejected before any provider call with a 400 validation error whose message is unsupported_model.
Errors
All errors return JSON of this shape:
{ "error": { "code": "validation", "message": "unsupported_model" } }
| Code | HTTP | Reached provider? | Recommended behavior |
|---|---|---|---|
| unauthorized | 401 | No | Check the API key header. Do not retry as-is. |
| forbidden | 403 | No | Key valid but not allowed for this resource. |
| validation | 400 | No | Fix the payload (e.g. missing_prompt, unsupported_model). |
| quota_exceeded | 402 | No | Stop until the window resets. |
| rate_limited | 429 | No | Wait Retry-After seconds, retry with jitter. |
| provider_timeout | 504 | Yes | Safe to retry once. |
| provider_unavailable | 502 | Yes | Treat as transient. |
| provider_rate_limit | 429 | Yes | Backoff, honor Retry-After. |
| gateway_timeout | 504 | Maybe | Retry once with backoff. |
| internal | 500 | Maybe | Report the x-request-id. |
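The recommended behaviors in this table could be turned into a client-side retry policy along these lines. This is a sketch under stated assumptions: the function `retry_delay`, the base delay, the jitter range, and the cap of three retries are all illustrative choices, not values published by Privian. The decision keys on `error.code` rather than the HTTP status, because two codes share 429 and two share 504.

```python
import random
from typing import Callable, Optional

RETRY_ONCE = {"provider_timeout", "gateway_timeout"}
HONOR_RETRY_AFTER = {"rate_limited", "provider_rate_limit"}
RETRYABLE = RETRY_ONCE | HONOR_RETRY_AFTER | {"provider_unavailable"}

BASE_DELAY_S = 0.5  # assumption: starting backoff
MAX_JITTER_S = 0.5  # assumption: upper bound of random jitter
MAX_RETRIES = 3     # assumption: overall retry cap


def retry_delay(
    code: str,
    retries_so_far: int,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> Optional[float]:
    """Return seconds to wait before the next retry, or None to stop.

    `code` is error.code from the response body; `retry_after` is the
    Retry-After header value in seconds, when present.
    """
    if code not in RETRYABLE:
        return None  # unauthorized, forbidden, validation, quota_exceeded, internal
    if code in RETRY_ONCE and retries_so_far >= 1:
        return None  # the table recommends a single retry for timeouts
    if retries_so_far >= MAX_RETRIES:
        return None

    delay = BASE_DELAY_S * (2 ** retries_so_far)  # exponential backoff
    if code in HONOR_RETRY_AFTER and retry_after is not None:
        delay = max(delay, retry_after)  # never retry earlier than Retry-After
    return delay + rng() * MAX_JITTER_S
```

Passing `rng` explicitly keeps the function deterministic in tests while defaulting to real jitter in production use.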
Provider routing
The model namespace (e.g. openai/) determines which provider Privian calls. The relevant BYOK credential is resolved server-side from your organisation's stored credentials. Provider keys are never accepted in the request body.
On a fallback-eligible provider error (provider_timeout, provider_unavailable, provider_rate_limit) the gateway may transparently retry once against a secondary model marked fallbackEligible in the catalog. The outcome is reflected in meta.fallbackUsed.
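On the client side, the routing namespace and the fallback outcome can be inspected as sketched below. Both helpers are hypothetical, and the set of possible `fallback.reason` values is not listed on this page, so the code treats the reason as an opaque string.

```python
def provider_namespace(model_id: str) -> str:
    """Return the provider namespace of a model ID, e.g. 'openai' for 'openai/gpt-5.5'."""
    namespace, sep, name = model_id.partition("/")
    if not sep or not namespace or not name:
        raise ValueError(f"not a provider-namespaced model ID: {model_id!r}")
    return namespace


def describe_fallback(meta: dict) -> str:
    """Summarize whether the gateway served the response from a fallback model."""
    if meta.get("fallbackUsed"):
        # The reason string is passed through as-is; its values are not documented here.
        reason = meta.get("fallback", {}).get("reason", "unknown")
        return f"fallback used (reason: {reason})"
    return "primary path"
```

Logging this summary next to the x-request-id makes it easier to spot when a response came from a secondary model.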
Streaming
Sending "stream": true returns a text/event-stream. Privian runs masking, the provider call, and rehydration to completion, then chunks the already-rehydrated text. Native provider token streaming is not exposed in the beta.
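A streamed response can be read with a small server-sent-events parser. The sketch below follows the standard `text/event-stream` framing (`data:` fields, blank line between events). This page does not specify what each event's payload contains, so the parser returns the raw data strings and leaves their interpretation to the caller; `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event in `lines`."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event not terminated by a blank line is discarded, as in the SSE standard.
```

Because the gateway chunks text that is already fully rehydrated, the chunks can simply be concatenated in arrival order, assuming each event carries a plain text fragment.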