Connect via zrok
The gateway uses zrok in two independent ways:
- Sharing: Exposes the gateway over a zrok share so clients can reach it without a public IP or open ports.
- Accessing: Connects to backend providers through zrok shares instead of direct HTTP.
Both use zrok's overlay network built on OpenZiti.
Prerequisites
The gateway requires an enabled zrok environment on the host machine. If the zrok enable command
hasn't been run, the gateway fails at startup with:
zrok environment is not enabled; run 'zrok enable' first
This applies to both sharing and accessing.
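A pre-flight check along these lines can catch the problem before the gateway is launched. This is only a minimal sketch, not the gateway's own check: the function name require_zrok_environment is hypothetical, and the idea that an enabled environment shows up as a file on disk (commonly ~/.zrok/environment.json) is an assumption to verify against your zrok version.

```python
from pathlib import Path

# Error text copied from the gateway's startup message quoted above.
NOT_ENABLED = "zrok environment is not enabled; run 'zrok enable' first"


def require_zrok_environment(env_file: Path) -> None:
    """Raise RuntimeError if the zrok environment file is missing.

    env_file is an ASSUMPTION: the path where zrok records an enabled
    environment on this host (often ~/.zrok/environment.json).
    """
    if not env_file.is_file():
        raise RuntimeError(NOT_ENABLED)
```

A wrapper script might call require_zrok_environment(Path.home() / ".zrok" / "environment.json") before starting the gateway, adjusting the path as needed.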
Share the gateway
Instead of listening on a TCP port, the gateway can serve traffic through a zrok share. Clients connect using the share token rather than an IP address and port.
Ephemeral shares
An ephemeral share is created at startup and deleted when the gateway shuts down.
- Add the zrok config to config.yaml:
    zrok:
      share:
        enabled: true
        mode: private # or public
  Alternatively, pass flags at runtime:
    llm-gateway run config.yaml --zrok --zrok-mode private
- Start the gateway. The share token is logged at startup:
    serving via zrok share 'abc123def456'
- Give clients the share token to connect.
Public mode creates a share accessible by anyone with the token. Private mode (the default) requires the client to have a zrok environment enabled and creates an access-controlled connection through the overlay.
Persistent shares
Ephemeral shares get a new token on every restart. For a stable token, create a persistent share with
zrok reserve and pass its token to the gateway:
zrok:
  share:
    enabled: true
    token: "abc123" # existing persistent share token
Persistent shares are always private. The gateway connects to the existing share but doesn't delete it on shutdown, because the share is managed externally.
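The difference between the two share types can be summarized as a small decision function. The sketch below only models the behavior described above; SharePlan and plan_share are hypothetical names, not part of the gateway, and ignoring mode when a token is present is an assumption based on persistent shares always being private.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class SharePlan:
    kind: str                 # "ephemeral" or "persistent"
    mode: str                 # "private" or "public"
    token: Optional[str]      # known up front only for persistent shares
    delete_on_shutdown: bool


def plan_share(share_cfg: Mapping[str, Any]) -> Optional[SharePlan]:
    """Model how the zrok.share settings map to share behavior."""
    if not share_cfg.get("enabled", False):
        return None  # no zrok share; the gateway listens on a TCP port
    token = share_cfg.get("token")
    if token:
        # Persistent: always private, managed externally, never deleted here.
        # ASSUMPTION: any configured mode is ignored in this case.
        return SharePlan("persistent", "private", token, False)
    # Ephemeral: created at startup, deleted at shutdown; private by default.
    mode = share_cfg.get("mode", "private")
    if mode not in ("private", "public"):
        raise ValueError(f"unsupported zrok share mode: {mode!r}")
    return SharePlan("ephemeral", mode, None, True)
```

For example, plan_share({"enabled": True, "token": "abc123"}) describes a persistent private share that is left in place on shutdown, while plan_share({"enabled": True}) describes an ephemeral private share whose token is only known once the gateway has started.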
Access providers via zrok
Any provider can be reached through a zrok share by setting zrok_share_token in its config. This is
useful when a provider runs on a different machine that isn't directly reachable over the network but
is connected to the same zrok environment:
providers:
  local:
    zrok_share_token: "remote-ollama-token"
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    zrok_share_token: "anthropic-proxy-token"
Multi-endpoint
Each endpoint can independently use zrok or direct HTTP:
providers:
  local:
    endpoints:
      - name: local
        base_url: "http://localhost:11434"
      - name: remote-gpu
        zrok_share_token: "gpu-box-token"
Each endpoint with a zrok_share_token gets its own zrok access and HTTP client. The round-robin
load balancer uses whichever transport is configured per endpoint.
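To illustrate how per-endpoint transports and round-robin selection fit together, here is a minimal sketch in Python. It is not the gateway's implementation: transport_for and round_robin are hypothetical helpers, and preferring zrok when an endpoint sets both zrok_share_token and base_url is an assumption made for this example.

```python
from itertools import cycle
from typing import Any, Iterator, Mapping, Sequence, Tuple


def transport_for(endpoint: Mapping[str, Any]) -> Tuple[str, str]:
    """Return (transport, target) for one endpoint entry."""
    token = endpoint.get("zrok_share_token")
    if token:
        # ASSUMPTION: a share token takes precedence over base_url.
        return ("zrok", token)
    return ("http", endpoint["base_url"])


def round_robin(
    endpoints: Sequence[Mapping[str, Any]],
) -> Iterator[Tuple[str, str, str]]:
    """Return an endless iterator of (name, transport, target) tuples."""
    # Resolve each endpoint's transport once, mirroring one client per endpoint.
    resolved = [(ep["name"], *transport_for(ep)) for ep in endpoints]
    return cycle(resolved)


endpoints = [
    {"name": "local", "base_url": "http://localhost:11434"},
    {"name": "remote-gpu", "zrok_share_token": "gpu-box-token"},
]
picker = round_robin(endpoints)
```

Each call to next(picker) returns the next endpoint in rotation along with the transport it was configured for, so requests alternate between direct HTTP to the local endpoint and zrok access to the remote one.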