What is an LLM gateway?
A clear definition of an LLM gateway, why teams put one in front of providers, and the responsibilities it should own.
7 min read · Updated May 20, 2026
Definition
An LLM gateway is an HTTP service that sits between your application and one or more model providers. It terminates client requests, applies policy, forwards to the right provider, and returns the response. Think of it as the same chokepoint pattern you already use for databases, payment providers, or any other third-party dependency.
Why teams introduce one
A gateway shows up in a team's stack when one of these problems gets uncomfortable:
- Provider keys are scattered across services and clients
- Every team is re-implementing the same retry, timeout, and rate-limit logic
- Nobody can answer "what data did we send to OpenAI last week?"
- Switching providers requires changing every caller
- PII masking has to be applied somewhere consistent
Each of these is solvable inside individual services. A gateway gives you one place to solve all of them at once.
What an LLM gateway typically owns
- Authentication. Gateway API keys instead of shared provider keys.
- Routing. A model ID like openai/gpt-5.5 resolves to the right provider with the right credentials.
- Policy. Masking, content limits, allowed models, per-key quotas.
- Reliability. Timeouts, retries, circuit breakers around upstream calls.
- Observability. A single place to measure latency, errors, masked-entity counts and cost.
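The routing responsibility in the list above can be sketched in a few lines. This is a minimal illustration, not any gateway's actual implementation: the `PROVIDERS` table, the `Route` type, the `resolve_route` function, and the environment variable names are all hypothetical, chosen only to show how a `provider/id` model ID might map to an upstream and its credentials.

```python
from dataclasses import dataclass

# Hypothetical routing table: provider prefix -> upstream base URL and the
# name of the environment variable that would hold that provider's key.
# Illustrative values only, not a real gateway configuration.
PROVIDERS = {
    "openai": {"base_url": "https://api.openai.com/v1", "key_env": "OPENAI_API_KEY"},
    "anthropic": {"base_url": "https://api.anthropic.com/v1", "key_env": "ANTHROPIC_API_KEY"},
}


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    base_url: str
    key_env: str


def resolve_route(model_id: str) -> Route:
    """Split a 'provider/id' model ID and look up the provider's settings."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/id', got {model_id!r}")
    config = PROVIDERS.get(provider)
    if config is None:
        raise ValueError(f"unknown provider {provider!r}")
    return Route(provider, model, config["base_url"], config["key_env"])


route = resolve_route("openai/gpt-5.5")
print(route.provider, route.model)
```

Keeping the lookup in one table is what lets callers switch providers by changing only the model ID, while provider keys stay on the gateway side.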
How a request flows
Your app
│ POST /v1/gateway { model, prompt, stream }
│ Authorization: Bearer <gateway api key>
▼
Gateway
├── authenticate the API key
├── validate the request (size, schema)
├── mask PII in the prompt
├── resolve provider from the model id
├── forward to the upstream provider
├── rehydrate the response
└── record metadata
│
▼
Your app
(response with original values restored)
Tradeoffs
- Extra hop. The gateway adds a small amount of latency. For most workloads, that overhead is small compared with the time the model itself takes to respond.
- Coupling to the gateway's API. Picking a gateway is picking an interface. Choose one whose contract you are willing to depend on.
- Capability lag. A gateway might trail provider-native features (tools, multimodal, streaming) until it catches up.
How Privian fits
Privian is an LLM gateway focused on prompt privacy. It exposes a single endpoint, accepts a simple JSON body, and applies masking and rehydration on every call. See LLM Gateway for the product overview and Architecture for the component-by-component detail.
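To make masking and rehydration concrete, here is a minimal sketch of the general idea. It is not Privian's implementation: it assumes a single regular expression that recognizes email addresses only, and the `mask` and `rehydrate` functions and the `<EMAIL_n>` placeholder format are hypothetical.

```python
import re

# Assumption: only email addresses are treated as PII in this sketch.
# A real gateway would recognize many more entity types.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a placeholder; return text and mapping."""
    token_for_value: dict[str, str] = {}
    value_for_token: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in token_for_value:
            token = f"<EMAIL_{len(token_for_value) + 1}>"
            token_for_value[value] = token
            value_for_token[token] = value
        return token_for_value[value]

    return EMAIL_RE.sub(substitute, prompt), value_for_token


def rehydrate(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back wherever a placeholder appears."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


masked, mapping = mask("Draft a reply to ana@example.com and cc bob@example.org.")
print(masked)  # the provider would only ever see this masked version
print(rehydrate("Sent to <EMAIL_1>, cc <EMAIL_2>.", mapping))
```

The mapping never leaves the gateway in this design, so the upstream provider sees placeholders while the caller gets the original values back in the response.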
Frequently asked questions
- Is an LLM gateway the same as a proxy?
- A proxy forwards requests. A gateway adds policy — authentication, masking, routing, rate limits, observability. Every gateway is a proxy; not every proxy is a gateway.
- What does Privian's gateway actually do today?
- It exposes POST /v1/gateway, authenticates with a Privian API key, masks recognized PII in the prompt, forwards to the selected provider/model, rehydrates the response, and records request metadata (not bodies).
- Do I lose features by going through a gateway?
- Privian's beta accepts a simple { model, prompt, stream } JSON body. It does not yet implement OpenAI's messages[] schema or provider-native tool/function calling. If you need those today, the gateway is not a drop-in replacement.
- Does the gateway support multiple providers?
- Yes — model IDs use a provider/id format such as openai/gpt-5.5 or anthropic/claude-sonnet-4.5. The gateway resolves the provider from the model and routes accordingly.
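Putting the answers above together, a client call might be assembled as follows. This is a sketch under assumptions: the base URL `https://gateway.example` and the `build_gateway_request` helper are hypothetical, and only the POST /v1/gateway path, the bearer API key, and the { model, prompt, stream } body come from this article. The request is built but not sent.

```python
import json
import urllib.request


def build_gateway_request(
    base_url: str, api_key: str, model: str, prompt: str, stream: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/gateway request."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": stream})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/gateway",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host and key, for illustration only.
request = build_gateway_request(
    "https://gateway.example", "gateway-api-key", "openai/gpt-5.5", "Summarize this ticket."
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`; the response shape is not specified here, so it is left out of the sketch.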