Observability

KubeClaw ships a full observability stack out of the box: logs, metrics, traces, and Kubernetes events unified in ClickHouse, searchable through HyperDX, collected via OpenTelemetry. One backend instead of four.


How it works

The stack deploys as a ClickStack subchart and consists of:

  • ClickHouse for storage (logs, traces, metrics all in one database)
  • HyperDX as the search and visualization UI
  • OpenTelemetry collectors in two roles: a node-level DaemonSet for pod logs and host metrics, and a cluster-level Deployment for Kubernetes events and cluster metrics

The chart automatically injects the OTEL environment variables into the Gateway container, so traces and logs flow without any application-level configuration.
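
As a rough illustration of what that injection might look like on the Gateway container, the fragment below uses standard OpenTelemetry SDK environment variables. The exact variables the chart sets are not listed on this page, so treat the selection as an assumption. The collector endpoint is a hypothetical placeholder. The service name and resource attributes mirror the defaults of observability.gateway.serviceName and observability.gateway.resourceAttributes.

```yaml
# Illustrative sketch only: these are standard OpenTelemetry SDK variables,
# but the exact set this chart injects is an assumption.
env:
  - name: OTEL_SERVICE_NAME
    value: "openclaw-gateway"   # default of observability.gateway.serviceName
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: ""                   # default of observability.gateway.resourceAttributes
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    # Hypothetical placeholder; 4318 is the conventional OTLP/HTTP port.
    value: "http://<collector-service>:4318"
```

To see what is actually set in a running cluster, inspect the env section of the Gateway pod spec.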

Configuration

| Key | Default | Description |
|---|---|---|
| observability.enabled | true | Master toggle for the full observability stack |
| observability.gateway.enabled | true | Inject OTEL env vars into the Gateway container |
| observability.gateway.serviceName | openclaw-gateway | OTEL service name |
| observability.gateway.resourceAttributes | "" | Additional OTEL resource attributes |
| observability.nodeCollector.enabled | true | DaemonSet for pod logs and host metrics |
| observability.clusterCollector.enabled | true | Deployment for K8s events and cluster metrics |
| observability.ingress.enabled | true | Ingress for the HyperDX UI |
| observability.ingress.host | hyperdx.example.com | HyperDX hostname |

The storage class for the ClickStack subchart is set through clickstack.global.storageClassName and defaults to "" (the cluster's default storage class).
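
A values override combining several of these keys might look like the sketch below. The key paths come from the configuration table. The concrete values are illustrative, and the format of resourceAttributes is assumed to follow the comma-separated key=value syntax of OTEL_RESOURCE_ATTRIBUTES.

```yaml
# values.yaml override (sketch); key paths are from the configuration table,
# values are illustrative.
observability:
  enabled: true
  gateway:
    enabled: true
    serviceName: my-gateway   # hypothetical service name
    # Assumed to use OTEL_RESOURCE_ATTRIBUTES syntax (key=value, comma-separated).
    resourceAttributes: "deployment.environment=staging"
  nodeCollector:
    enabled: true
  clusterCollector:
    enabled: false            # skip Kubernetes events and cluster metrics
```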

Disabling observability

To run without the observability stack, set the master toggle to false:

```yaml
observability:
  enabled: false
```

This removes ClickHouse, HyperDX, and all collectors. The Gateway will still function normally, just without centralized telemetry.

Accessing HyperDX

When observability.ingress.enabled is true, HyperDX is exposed at the configured hostname. If you are using Gateway API routing, it is available at the gatewayAPI.routes.o11y path prefix (default /o11y).
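
For illustration, the two exposure options could be expressed in values as follows. The nesting is inferred from the dotted key names, the hostname is hypothetical, and it is an assumption that gatewayAPI.routes.o11y takes a plain string path.

```yaml
# Sketch only: structure inferred from the dotted key names in this page.
observability:
  ingress:
    enabled: true
    host: hyperdx.internal.example.com   # hypothetical hostname
gatewayAPI:
  routes:
    o11y: /o11y   # documented default path prefix; string form is an assumption
```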

For local access without ingress, port-forward to the HyperDX service:

```shell
kubectl port-forward -n kubeclaw svc/kubeclaw-hyperdx 8080:8080
```

What gets collected

With default settings, the following telemetry flows into ClickHouse:

  • Traces from the Gateway process (OTEL auto-instrumentation via env vars)
  • Pod logs from all containers in the namespace (node collector DaemonSet)
  • Host metrics like CPU, memory, disk, and network (node collector)
  • Kubernetes events such as pod scheduling, restarts, and OOM kills (cluster collector)
  • Cluster metrics like node status, pod counts, and resource usage (cluster collector)

All data is queryable through HyperDX's search interface with full-text search, filtering, and correlation across signal types.

Storage

ClickHouse uses a PVC for data persistence. Configure the storage class and size through the ClickStack subchart values:

```yaml
clickstack:
  global:
    storageClassName: "my-storage-class"
  clickhouse:
    persistence:
      size: 50Gi
```