`datadog` subscriber

Posts events to Datadog as logs (HTTP intake) and/or metrics (v2 series). Logs and metrics are independent — enable either, neither, or both.

Add to the subscribers block in config.yaml:

subscribers:
  datadog:
    enabled: true
    site: "datadoghq.com"      # datadoghq.eu, us3.datadoghq.com, us5.datadoghq.com, ap1.datadoghq.com, ddog-gov.com
    # API key via env var DD_API_KEY (preferred) or set api_key here
    api_key: ""
    compression: "gzip"        # "gzip" (default) or "none"
    skip_verify: false
    buffer_size: 1000

    logs:
      enabled: true
      source: "openziti"            # ddsource attribute
      service: "ziti-controller"    # service attribute; if empty, falls back to event namespace
      hostname: "ziti-prod-01"      # host attribute
      tags:                         # static ddtags appended to every log
        env: prod
        cluster: us-east-1
      namespace_filter: []          # only forward these namespaces; empty = all
      exclude_fields:               # drop fields before send (dotted paths supported)
        - circuitId
        - tags.sourceRouterId
      # Per-event filter — drop events before they're batched/billed.
      # See ../ "Per-Subscriber Filtering" for the full comparator list.
      # Common cases:
      include:
        - { field: event_type, not_equals: created }   # circuits: drop creates, keep failed/pathUpdated
      exclude: []
      batch_size: 100               # capped at 1000 (Datadog hard limit)
      flush_interval: 5s
      workers: 2

    metrics:
      enabled: false
      metric_prefix: "openziti."
      hostname: "ziti-prod-01"
      tags:
        env: prod
      batch_size: 100
      flush_interval: 10s
      workers: 1
      mappings:
        - namespace: metrics
          name_path: metric             # name_path's value supplies the middle of the metric name
          # values: field path → metric type (gauge | count | rate). For
          # nested paths only the leaf is used as the metric-name suffix,
          # so "metrics.mean" yields openziti.<metric>.mean.
          values:
            metrics.count: count
            metrics.mean: gauge
            metrics.m1_rate: rate
            metrics.p99: gauge
          tag_paths:                    # build tags from event fields
            router: source_id
            entity: source_entity_id
        - namespace: fabric.usage
          name_suffix: usage            # static suffix used when name_path is unset
          values:
            usage: count
          tag_paths:
            identity: identity_name
            service: service_name

The logs path posts a JSON array to https://http-intake.logs.<site>/api/v2/logs with the DD-API-KEY header. The enriched event JSON is inlined as nested attributes. Datadog parses nested JSON fields into attributes, so per-event ids, names, and timings stay searchable in Logs Explorer without inflating tag cardinality.
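
As a rough illustration of the payload shape, the sketch below builds one array element from an event and the logs settings. It is not the subscriber's actual code: the `build_log_entry` helper, the `namespace` event field name, and the sample event are hypothetical, and the top-level keys (`ddsource`, `service`, `hostname`, `ddtags`) follow the Datadog v2 logs intake fields rather than anything confirmed about this subscriber's internals.

```python
import json


def build_log_entry(event, source, service, hostname, static_tags):
    """Return one element of the JSON array posted to the logs intake."""
    namespace = event.get("namespace", "")  # assumed event field name
    # ddtags: the namespace tag plus the static tags from logs.tags.
    tags = [f"namespace:{namespace}"]
    tags += [f"{key}:{value}" for key, value in static_tags.items()]
    entry = dict(event)  # event fields stay nested and become @attributes
    entry["ddsource"] = source
    entry["service"] = service or namespace  # empty service falls back to namespace
    entry["hostname"] = hostname
    entry["ddtags"] = ",".join(tags)
    return entry


# Hypothetical event, for illustration only.
event = {
    "namespace": "fabric.circuits",
    "event_type": "failed",
    "path": {"nodes": ["r1", "r2"]},
}
entry = build_log_entry(event, "openziti", "", "ziti-prod-01",
                        {"env": "prod", "cluster": "us-east-1"})
payload = json.dumps([entry])  # the request body is a JSON array of entries
```

Because `service` is empty in this call, the entry's service falls back to the event namespace, matching the comment on `logs.service` in the config.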

The metrics path posts to https://api.<site>/api/v2/series. For each event matching a mapping's namespace, every entry in values produces one series named <metric_prefix><name_part>.<value_key>. The name part is either the value found at name_path (e.g., link.latency) or the static name_suffix, and <value_key> is the leaf of the values path (metrics.mean contributes mean). Non-numeric or missing values are skipped silently.
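
The naming rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the subscriber's implementation: `get_path` and `series_for_event` are hypothetical helpers, the event is assumed to carry its namespace in a `namespace` field, namespaces are compared by exact match, and an event with no resolvable name part is assumed to produce no series.

```python
def get_path(event, dotted):
    """Walk a dotted path through nested dicts; return None if any part is missing."""
    node = event
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def series_for_event(event, mapping, metric_prefix="openziti."):
    """Return (metric_name, metric_type, value) tuples for one event and one mapping."""
    if event.get("namespace") != mapping["namespace"]:  # assumed exact match
        return []
    if "name_path" in mapping:
        name_part = get_path(event, mapping["name_path"])
    else:
        name_part = mapping.get("name_suffix")
    if not name_part:
        return []  # assumption: no name part means no series
    series = []
    for path, metric_type in mapping["values"].items():
        value = get_path(event, path)
        # Missing or non-numeric values are skipped silently.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        leaf = path.rsplit(".", 1)[-1]  # only the leaf becomes the suffix
        series.append((f"{metric_prefix}{name_part}.{leaf}", metric_type, value))
    return series


mapping = {
    "namespace": "metrics",
    "name_path": "metric",
    "values": {"metrics.count": "count", "metrics.mean": "gauge",
               "metrics.m1_rate": "rate", "metrics.p99": "gauge"},
}
# Hypothetical event: m1_rate is missing and p99 is non-numeric.
event = {"namespace": "metrics", "metric": "link.latency",
         "metrics": {"count": 12, "mean": 3.5, "p99": "n/a"}}
result = series_for_event(event, mapping)
```

Here `result` holds two series, `openziti.link.latency.count` and `openziti.link.latency.mean`; the other two entries in values are dropped because one is missing and one is not a number.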

Cost and cardinality controls

These settings limit what gets indexed in Datadog and prevent tag cardinality from inflating your bill:

Use exclude_fields to drop sessionId, circuitId, or any per-flow id Datadog would otherwise index.
Keep dynamic ids in tag_paths only when the cardinality is bounded (router/identity counts in the hundreds — not flow/session ids).
Set namespace_filter to send only the namespaces you actually query; high-volume fabric.usage is the obvious candidate to either filter out or send via metrics instead of logs.
Use include/exclude to filter within a namespace by event field — e.g. send only event_type: failed circuits, or only service_name values matching a regex. Drops happen before batching, so filtered events never count against your ingest bill. See the Per-subscriber filtering section of the configuration reference.
Both channels gzip the body by default; set compression: none only when debugging.
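
To make the dotted-path behavior of exclude_fields concrete, here is a minimal sketch of dropping fields from a nested event before it is sent. The `drop_fields` helper and the sample event are hypothetical; how the subscriber handles paths that do not exist is an assumption (they are ignored here).

```python
import copy


def drop_fields(event, exclude_fields):
    """Return a copy of event with each dotted path in exclude_fields removed."""
    cleaned = copy.deepcopy(event)  # leave the caller's event untouched
    for dotted in exclude_fields:
        *parents, leaf = dotted.split(".")
        node = cleaned
        for part in parents:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break  # assumption: a path that does not exist is ignored
        if isinstance(node, dict):
            node.pop(leaf, None)
    return cleaned


# Hypothetical event, for illustration only.
event = {
    "circuitId": "abc123",
    "event_type": "failed",
    "tags": {"sourceRouterId": "r1", "serviceId": "svc-7"},
}
cleaned = drop_fields(event, ["circuitId", "tags.sourceRouterId"])
```

With the two paths from the example config, `cleaned` keeps `event_type` and `tags.serviceId` and loses both per-flow ids.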

The subscriber retries on 429 and 5xx responses up to 3 times with exponential backoff and respects the Retry-After header. Other 4xx responses are not retried, because they indicate a payload or auth problem that you need to fix.
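
The retry policy can be sketched as follows. This is an illustration only: the `send_with_retry` helper, the 1-second base delay, the doubling factor, and the choice to let a numeric Retry-After value replace the computed backoff are all assumptions, and "up to 3 times" is read here as three retries after the first attempt.

```python
import time


def send_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or retries run out.

    send() is a caller-supplied function returning (status_code, headers).
    Returns the last status code seen.
    """
    attempt = 0
    while True:
        status, headers = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt >= max_retries:
            return status
        delay = base_delay * (2 ** attempt)  # assumed backoff: 1s, 2s, 4s
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            try:
                delay = float(retry_after)  # delay-seconds form
            except ValueError:
                pass  # HTTP-date form is not handled in this sketch
        sleep(delay)
        attempt += 1
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a 400 or 403 returns immediately because it is neither 429 nor 5xx.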

Tags vs. attributes

Datadog treats logs and metrics differently. Knowing the model up front avoids both unusable dashboards and surprise bills.

Logs have two distinct dimension systems:

Tags (ddtags): Low-cardinality dimensions that appear in Live Tail and the index drop-down. The subscriber emits exactly namespace:<event_namespace> plus everything you put in subscribers.datadog.logs.tags. Keep this set small.
Attributes: Every nested JSON field in the event becomes searchable as @path.to.field. Attributes cost nothing extra by default; to use them in the left-side facet panel of Logs Explorer, promote them in Logs → Configuration → Facets.

Metrics have only tags. Each unique combination of tag values is a separate billed time series — every tag you add multiplies cardinality.
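
A quick back-of-the-envelope calculation shows how fast this multiplies. The sketch below computes an upper bound on series count per metric name; the `max_series` helper and the cardinality figures are hypothetical, and real counts are usually lower because not every tag combination occurs.

```python
from math import prod


def max_series(tag_cardinalities):
    """Upper bound on time series for one metric name: product of tag cardinalities."""
    return prod(tag_cardinalities.values())


# Hypothetical network: 1 env, 50 routers.
by_router = max_series({"env": 1, "router": 50})
# Adding an identity tag with 2,000 distinct values multiplies the bound.
by_router_and_identity = max_series({"env": 1, "router": 50, "identity": 2000})
```

In this example the bound grows from 50 to 100,000 series per metric name just by adding one high-cardinality tag.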

Recommended setup

| Where | What to do | Why |
| --- | --- | --- |
| Logs → Configuration → Facets | Promote `@service_name`, `@identity_name`, `@host_name`, `@edge_router_name`, `@event_type`, `@circuit_id`, `@metric` | These are how operators slice circuits/usage in the UI. Free; just makes the left panel usable. |
| Logs → Configuration → Indexes | Add a sampling/exclusion filter for `@namespace:metrics` if you also send metrics via the metrics API | Avoids paying twice for the same data. |
| Logs → Pipelines | Add a remapper so `@event_type` → log status, e.g. `*.failed` → `error` | Lets monitors fire on log severity. |
| Metrics → Tags | Inspect cardinality on `openziti.usage.*` and `openziti.*.count` | Catches a tag explosion before the bill. |

Tag cardinality reference

For the metrics path, cardinality depends on network size — these are rough orders of magnitude:

| Field | Typical cardinality | Safe as a metric tag? |
| --- | --- | --- |
| `env` / `cluster` / `region` | 1–10 | ✅ always |
| `source_id` / `edge_router` (router id) | 10s–100s | ✅ usually |
| `service_name` | 10s–1000s | ⚠️ usually safe; review |
| `host_name` | 10s–1000s | ⚠️ usually safe; review |
| `identity_name` | 100s–10000s+ | ⚠️ depends on network; often the dominant cost driver |
| `circuit_id` / `session_id` / per-flow ids | unbounded | 🚫 never; these belong only on logs as attributes |

The example config above intentionally puts identity_name into the fabric.usage mapping's tag_paths for illustration. In a network with many identities, consider dropping it from metrics tags and relying on the logs attribute (@identity_name) for per-identity drill-down.
