One endpoint
Point compatible SDKs at a single gateway URL and route requests without rewriting every integration.
ngrok-powered model routing
Route AI requests through a single ngrok endpoint to providers such as OpenAI, Anthropic, Google, and self-hosted models. The gateway adds provider failover, SDK compatibility, gateway keys, and visibility into the traffic moving between your app and the models it depends on.
AI apps move faster when provider details, auth, routing, and traffic inspection live at the edge instead of being hard-coded into every client.
Target one gateway base URL from any compatible SDK instead of wiring each provider's endpoint into your clients.
Retry across configured models, providers, or keys when a selected route cannot complete the request.
Inspect, secure, and observe AI traffic before it reaches cloud providers or local inference servers.
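The failover behavior above can be pictured as a simple retry chain: try each configured route in order and fall through to the next when one cannot complete the request. An illustrative Python sketch, not the gateway's actual implementation; the route names and error are hypothetical placeholders:

```python
# Illustrative only: a toy failover chain like the one the gateway
# runs for you. Route names and failure modes are hypothetical.

class RouteError(Exception):
    """Raised when a selected route cannot complete the request."""

def call_route(route: str, prompt: str) -> str:
    # Stand-in for forwarding to a provider; the first route fails
    # here to demonstrate the fallback path.
    if route == "openai/gpt-4o":
        raise RouteError("rate limited")
    return f"{route}: response to {prompt!r}"

def complete_with_failover(routes: list[str], prompt: str) -> str:
    last_error = None
    for route in routes:
        try:
            return call_route(route, prompt)
        except RouteError as exc:
            last_error = exc  # fall through to the next configured route
    raise RuntimeError(f"all routes failed: {last_error}")

result = complete_with_failover(
    ["openai/gpt-4o", "anthropic/claude-sonnet"], "hello"
)
```

Here the first route raises, so the chain falls through and the second route answers; only when every configured route fails does the error surface to the caller.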
The gateway validates the request, chooses a model/provider path, forwards the call, and returns the response to the application.
Your app sends an SDK or HTTP request to the gateway endpoint.
ngrok validates the AI Gateway API key before routing traffic.
The gateway selects a model, provider, or failover chain.
The request is forwarded to a cloud provider or self-hosted model.
The model output returns through the gateway to your app.
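Concretely, the request your app sends in the first step is ordinary HTTP: an Authorization header carrying the gateway key and a JSON chat-completions body the gateway routes on. A minimal sketch of that payload, using the placeholder URL and key from this page:

```python
import json

# Placeholder gateway endpoint and AI Gateway API key from this page.
GATEWAY_URL = "https://your-ai-gateway.ngrok.app/v1/chat/completions"
GATEWAY_KEY = "ng-xxxxx-g1-xxxxx"

# The gateway authenticates on the key, then routes on the model field.
headers = {
    "Authorization": f"Bearer {GATEWAY_KEY}",
    "Content-Type": "application/json",
}
body = json.dumps({
    "model": "ngrok/auto",  # let the gateway choose the model/provider path
    "messages": [{"role": "user", "content": "Hello from the gateway"}],
})
```

Any HTTP client can POST this body to the gateway URL; the SDK example below builds the same request for you.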
Keep your existing SDK shape. Change the base URL, use an AI Gateway API key, and let the gateway handle routing behind the scenes.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-ai-gateway.ngrok.app/v1",
    api_key="ng-xxxxx-g1-xxxxx",
)

response = client.chat.completions.create(
    model="ngrok/auto",
    messages=[
        {"role": "user", "content": "Hello from the gateway"}
    ],
)

print(response.choices[0].message.content)
Use the gateway as a control point for AI traffic across managed providers, bring-your-own keys, and local inference.
Works with popular AI SDK patterns by changing the base URL.
Use routes such as ngrok/auto for automatic model selection and flexible request handling.
Use gateway keys without exposing provider credentials to clients.
Connect additional providers and account-specific access.
Route to local inference runtimes alongside cloud providers.
Restrict which clients can reach specific gateway routes.
Apply request or response modification policies at the gateway.
Inspect AI request behavior, latency, headers, and routing outcomes.
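Two of the ideas above, gateway keys that hide provider credentials and per-client route restrictions, amount to a lookup the gateway performs before forwarding. An illustrative Python sketch with hypothetical keys and route names, not ngrok's actual data model:

```python
# Illustrative only: keys, routes, and table shapes are hypothetical.

PROVIDER_KEYS = {            # lives only at the gateway, never in clients
    "openai": "sk-provider-secret",
}

GATEWAY_KEYS = {             # what clients actually hold
    "ng-xxxxx-g1-xxxxx": {"allowed_routes": {"openai"}},
}

def resolve(gateway_key: str, route: str) -> str:
    """Map a client's gateway key to a provider credential, if allowed."""
    grant = GATEWAY_KEYS.get(gateway_key)
    if grant is None or route not in grant["allowed_routes"]:
        raise PermissionError("client may not reach this route")
    return PROVIDER_KEYS[route]  # provider credential stays server-side
```

Clients only ever see the gateway key; rotating or revoking a provider credential is then a gateway-side change with no client redeploys.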