The LLM Gateway & Router Index
A neutral index of the LLM gateways and model routers teams use to sit between their app and the model providers — grouped by what they actually do (aggregate, proxy, route, or govern), how they're hosted, and what each is best at. A gateway is the 'route and govern' layer of the AI stack: one API across many models, plus key management, caching, rate-limits, failover, cost tracking, and — for routers — automatic model selection per request. We describe each tool's type, hosting, and license rather than prices, which change fast. Pair this with the observability index — the gateway controls which model gets called, observability tells you how it did.
Last reviewed June 13, 2026 · 12 tools · neutral & vendor-independent
The matrix
| Tool | Type | Hosting | License | Best for |
|---|---|---|---|---|
| OpenRouter | Aggregator / unified API | Managed | Proprietary | One API and billing across hundreds of models, pay-per-use |
| LiteLLM | Proxy gateway | Both (self-host or cloud) | Open-source | Self-hosted, OpenAI-compatible gateway + spend tracking across 100+ providers |
| Portkey | Proxy gateway + router | Both | Open-source core | Production gateway with routing, caching, guardrails, and observability |
| Cloudflare AI Gateway | Proxy gateway (edge) | Managed | Proprietary | Edge caching, rate-limiting, and analytics across providers with zero infra |
| Helicone | Proxy gateway + observability | Both | Open-source | Drop-in proxy logging, caching, and routing with minimal code |
| Kong AI Gateway | API gateway | Both | Open-source | Teams already on Kong or needing enterprise API governance |
| Vercel AI Gateway | Proxy gateway | Managed | Proprietary | Vercel / Next.js apps — one endpoint with failover and observability |
| Martian | Smart router | Managed | Proprietary | Dynamic routing to the best or cheapest model per request |
| Not Diamond | Smart router | Managed | Proprietary | Routing each prompt to the best-performing model automatically |
| Unify | Router + aggregator | Managed | Proprietary | Benchmarking and routing across providers on quality/cost/latency |
| Eden AI | Aggregator / unified API | Managed | Proprietary | One API across many AI services, not just LLMs |
| Apache APISIX AI Gateway | API gateway | Both | Open-source | Open-source API gateway with AI proxy and governance plugins |
Type, hosting, and license — not version-specific feature claims, which go stale fast. Many gateways blend roles (a proxy that also routes, a router that also caches); we tag each by its centre of gravity. Verify current capabilities on each project's docs.
How to read the types
- Aggregator / unified API
- One API and key to reach many model providers — you still choose the model, but you stop integrating each provider separately.
- Proxy gateway
- Sits in front of your model calls to add caching, logging, key management, rate-limits, and failover with minimal code change.
- Smart router
- Automatically picks the best or cheapest model per request based on the prompt and your quality/cost targets.
- API gateway
- A general-purpose API gateway extended with AI/LLM plugins and governance — the choice when you already run one.
Which one should you pick?
- You want one API and billing across many modelsOpenRouter
- You want a self-hosted, OpenAI-compatible gatewayLiteLLM
- You want routing + caching + guardrails in productionPortkey
- You want edge caching/analytics with zero infraCloudflare AI Gateway
- You want the cheapest/best model picked per requestMartian or Not Diamond
- You already run Kong or need enterprise API governanceKong AI Gateway
- You're building on VercelVercel AI Gateway
- You want drop-in proxy logging that doubles as a gatewayHelicone
FAQs
What is an LLM gateway or model router, and why use one?
It's a layer that sits between your application and the model providers. A gateway gives you one API, key management, caching, rate-limits, failover, and cost tracking; a router additionally picks the best or cheapest model for each request. The payoff is control and decoupling: you can swap providers without rewriting code, cap costs, survive a provider outage, and stop scattering API keys across services.
What's the difference between an aggregator, a proxy gateway, and a router?
An aggregator gives you one API to many providers, but you still choose the model. A proxy gateway sits in front of your calls and adds caching, logging, key management, and failover with little code change. A smart router goes further and automatically selects the model per request based on cost and quality targets. Many tools blend these — for example a proxy gateway that also offers routing rules.
Which LLM gateway is the best?
It depends on your needs. For unified billing across many models, OpenRouter. For a self-hosted, OpenAI-compatible gateway, LiteLLM. For production routing, caching, and guardrails, Portkey. For edge caching with zero infrastructure, Cloudflare AI Gateway. For automatic cheapest/best-model routing, Martian or Not Diamond. For enterprise API governance, Kong. Teams on Vercel often default to the Vercel AI Gateway.
Do I need a gateway, or can I call providers directly?
For a prototype, calling providers directly is fine. A gateway earns its place once you use more than one model, need failover and caching, want central cost control, or expect to swap providers without code changes. The lowest-friction starting points are LiteLLM if you want to self-host, or OpenRouter if you want a managed unified API — both let you add a gateway without committing to a heavy platform.
How does a gateway fit with observability and the rest of the AI stack?
The gateway is the 'route and govern' layer: it decides which model is called and enforces caching, rate-limits, and cost rules. Observability and evaluation tools then trace and score what comes back — see the LLM Observability & Eval Index. Some gateways (Portkey, Helicone) include observability too, so the layers overlap; together with agent frameworks (build) and vector databases (store), they make up the operational AI stack.
