Configuration reference

NetFoundry LLM Gateway is configured with a YAML file. CLI flags can override individual settings.

Gateway settings

Controls the listen address for the gateway process:

listen: ":8080"   # address to listen on (default: :8080)

To expose the gateway over a zrok overlay instead of a local port, add a top-level zrok: block:

zrok:
  share:
    enabled: false
    mode: private
    token: ""

Providers

Configure which inference providers the gateway can route to:

providers:
  open_ai:
    api_key: ${OPENAI_API_KEY}    # supports environment variable expansion

  anthropic:
    api_key: ${ANTHROPIC_API_KEY}

  local:
    base_url: http://localhost:11434

Virtual API keys

Restrict client access with named keys and per-key model permissions:

api_keys:
  enabled: true
  keys:
    - name: alice
      key: ${ALICE_KEY}
      allowed_models: ["gpt-*", "claude-*"]
    - name: bob
      key: ${BOB_KEY}
      allowed_models: ["llama*"]

See Virtual API keys for a full reference.

Routing

Enable semantic routing and define named routes:

routing:
  default_route: general
  semantic:
    enabled: true
    provider: local
    model: nomic-embed-text
    threshold: 0.75
    ambiguous_threshold: 0.5
  routes:
    - name: coding
      model: claude-haiku-4-5-20251001
      description: "code generation, debugging, and technical tasks"
      examples:
        - "write a python function to sort a list"

See Semantic routing for a full reference.

Metrics

Expose a Prometheus metrics endpoint:

metrics:
  enabled: true

Tracing

Enable request body logging for debugging routing decisions:

tracing:
  enabled: true
  max_content_length: 200   # max characters per message in log output

When enabled, each chat completion request is logged with the model, message count, streaming flag, tool count, and each message's role and truncated content.

Environment variables

String values support ${VAR_NAME} expansion. Variables are expanded at startup:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
llm-gateway run config.yaml

Complete example

A full configuration combining all sections:

listen: "0.0.0.0:8080"

zrok:
  share:
    enabled: true
    token: ${ZROK_SHARE_TOKEN}

api_keys:
  enabled: true
  keys:
    - name: primary
      key: ${PRIMARY_API_KEY}
      allowed_models: ["gpt-*", "claude-*", "llama*"]

providers:
  open_ai:
    api_key: ${OPENAI_API_KEY}

  anthropic:
    api_key: ${ANTHROPIC_API_KEY}

  local:
    base_url: http://localhost:11434

routing:
  default_route: general
  semantic:
    enabled: true
    provider: local
    model: nomic-embed-text
    threshold: 0.75

metrics:
  enabled: true

Run the gateway

Pass the config file path as the first argument:

llm-gateway run config.yaml

CLI flags

llm-gateway run <config-path> [flags]

Flags:
  --address string   Gateway listen address (e.g., 0.0.0.0:8080)
  --zrok             Enable zrok share (boolean)
  --zrok-mode string Zrok share mode (private or public)
  -h, --help         Show help

CLI flags take precedence over the config file.

Startup sequence

When the gateway starts, it:

Loads and parses the YAML config file.
Applies any CLI flag overrides.
Expands environment variables.
Initializes providers (OpenAI, Anthropic, local/self-hosted) in order.
Creates the model-to-provider router.
Initializes OpenTelemetry metrics (if enabled).
Initializes the semantic router (if configured).
Starts the HTTP server (local or via zrok share).

On shutdown (SIGINT/SIGTERM), the gateway closes all providers, deletes ephemeral zrok shares, and releases zrok access objects before exiting.

Gateway settings​

Providers​

Virtual API keys​

Routing​

Metrics​

Tracing​

Environment variables​

Complete example​

Run the gateway​

CLI flags​

Startup sequence​