Skip to main content

OpenTelemetry Observability

KubeOpenCode integrates with OpenTelemetry to provide observability for AI agent workloads — LLM call traces, token usage, latency, and application-level spans flow automatically to your existing observability infrastructure.

OpenCode has built-in OTel support that is activated by injecting standard OTLP environment variables into agent Pods. KubeOpenCode handles the injection; OpenCode handles the telemetry production. No code changes or custom instrumentation required.

How It Works

OpenCode's OTel support has three complementary layers:

LayerWhat It DoesActivation
Layer 1: InfrastructureTracerProvider + ContextManager + OTLP ExporterOTEL_EXPORTER_OTLP_ENDPOINT env var
Layer 2: LLM TracesPer-LLM-call spans with model, tokens, latencyexperimental.openTelemetry: true in OpenCode config + Layer 1
Layer 3: App SpansSession/turn/lifecycle spansAutomatically active with Layer 1

Layer 1 is the prerequisite — it creates the TracerProvider and wires exporters. Layer 3 is automatically active with Layer 1. Layer 2 is an additional opt-in that adds fine-grained LLM call detail.

When you enable observability in KubeOpenCodeConfig, the controller:

  1. Injects OTEL_EXPORTER_OTLP_ENDPOINT into agent Pod specs (activates Layers 1 + 3)
  2. Optionally injects experimental.openTelemetry: true into the OpenCode config (activates Layer 2)
  3. Injects Kubernetes resource attributes for trace correlation
  4. Handles header authentication via inline values or Secret references

Configuration

Observability is configured in the KubeOpenCodeConfig cluster-scoped singleton (must be named cluster):

apiVersion: kubeopencode.io/v1alpha1
kind: KubeOpenCodeConfig
metadata:
name: cluster
spec:
observability:
openTelemetry:
enabled: true
# OTLP/HTTP endpoint for your Collector. Required when enabled.
# KubeOpenCode does NOT deploy or manage Collectors.
endpoint: "http://otel-collector.observability:4318"
# Optional headers for collector authentication
headers:
x-honeycomb-team:
valueFrom:
secretKeyRef:
name: honeycomb-credentials
key: api-key
# Enable LLM call traces (model, tokens, latency per call)
enableLLMTraces: true
# Record full prompt/response content on LLM spans
recordContent: false
# Additional resource attributes on all spans
resourceAttributes:
kubeopencode.cluster.name: "production"

When using Helm, configure it in values.yaml:

kubeopencodeConfig:
observability:
openTelemetry:
enabled: true
endpoint: "http://otel-collector.observability:4318"
enableLLMTraces: true

Field Reference

FieldTypeDescription
openTelemetry.enabledboolEnable OTel telemetry injection. Required.
openTelemetry.endpointstringOTLP/HTTP endpoint for your Collector. Required when enabled: true. Must be an HTTP or HTTPS URL.
openTelemetry.headersmapOptional headers for collector authentication. Values can be inline or from Secrets.
openTelemetry.enableLLMTracesboolInject experimental.openTelemetry into OpenCode config to enable per-LLM-call spans (Layer 2). Default: false.
openTelemetry.recordContentboolRecord full prompt/response content on LLM spans. Default: false. Only enable in trusted environments — may expose sensitive data.
openTelemetry.resourceAttributesmapAdditional resource attributes for all spans. The controller also injects standard attributes automatically.

Protocol Note

OpenCode uses OTLP/HTTP (port 4318), not gRPC (port 4317). The configured endpoint must point to an OTLP/HTTP receiver. If your Collector exposes only gRPC, enable the HTTP receiver in its configuration or front it with an HTTP-to-gRPC bridge.

What You Get

Layer 1 + 3 (Basic: enabled: true)

Setting enabled: true with an endpoint activates the OTel infrastructure and application-level spans:

  • Session/turn/lifecycle spansRunInteractive.session, RunInteractive.turn, RunLifecycle.boot/close, etc.
  • Log export — OpenCode logs flow to the same OTLP endpoint
  • Trace context — AsyncLocalStorageContextManager ensures correct parent-child span linkage

This gives you application-level visibility: when sessions start and end, how many turns occur, and runtime lifecycle events.

Layer 2 (LLM Detail: enableLLMTraces: true)

Setting enableLLMTraces: true additionally activates AI SDK telemetry for every LLM call. Observed span attributes:

SpanKey Attributes
ai.streamTextai.model.id, ai.model.provider, ai.usage.inputTokens/outputTokens/totalTokens, ai.usage.cachedInputTokens, ai.usage.reasoningTokens, ai.response.finishReason
ai.streamText.doStreamSub-span for the actual HTTP streaming execution
ai.toolCallai.toolCall.name, ai.toolCall.args, ai.toolCall.result

These spans nest within Layer 3's RunInteractive.turn span, producing a unified trace tree:

Config.get / Agent.state / ... ← Layer 3 (app-level spans)
└─ LLM.run ← Layer 3 (OpenCode LLM orchestration)
└─ ai.streamText ← Layer 2 (AI SDK streaming span)
├─ ai.model.id=big-pickle
├─ ai.usage.inputTokens=8247
├─ ai.usage.outputTokens=117
└─ ai.toolCall ← Layer 2 (AI SDK tool call)
├─ ai.toolCall.name=read
└─ ai.toolCall.args={"filePath":"/workspace/task.md"}

This gives you fine-grained LLM observability: which model was called, how many tokens were consumed, how long each call took, and what tools were invoked.

Note on attribute namespace: The Vercel AI SDK currently emits ai.* attributes rather than the OpenTelemetry gen_ai.* semantic conventions. The data is functionally equivalent for cost attribution, model tracking, and behavior analysis. When the AI SDK adds gen_ai.* support in a future version, those attributes will automatically appear without any KubeOpenCode changes.

Headers & Authentication

For Collectors that require authentication (SaaS backends like Honeycomb, Datadog, New Relic), configure headers with either inline values or Secret references:

Inline Value (Non-Sensitive)

headers:
x-custom-header:
value: "my-static-value"
headers:
x-honeycomb-team:
valueFrom:
secretKeyRef:
name: honeycomb-credentials
key: api-key

Create the Secret in the controller namespace:

kubectl create secret generic honeycomb-credentials \
--from-literal=api-key=your-api-key \
-n kubeopencode-system

When using secretKeyRef, the controller injects a helper environment variable that kubelet resolves from the Secret at container startup. The header value is never stored in the KubeOpenCodeConfig resource.

Resource Attributes

The controller automatically injects these resource attributes on all spans:

AttributeSource
kubeopencode.task.nameTask resource name (Task Pods only)
kubeopencode.task.namespaceTask/Agent namespace
kubeopencode.agent.nameAgent or AgentTemplate name
k8s.namespace.nameNamespace
k8s.pod.namePod name (Downward API for Deployments)

You can add custom attributes via resourceAttributes:

resourceAttributes:
kubeopencode.cluster.name: "production"
environment: "staging"

Record Content

By default, LLM spans do not include full prompt and response content. To enable content recording:

openTelemetry:
recordContent: true

This injects OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true per the OTel GenAI specification. Use with caution — recorded content may contain API keys, PII, proprietary code, or other sensitive data. Only enable in trusted environments with appropriate data handling policies.

Responsibility Boundary

KubeOpenCode produces standardized OTLP data and sends it to the user-configured endpoint. Everything beyond that is the user's responsibility:

What KubeOpenCode DoesWhat the User Does
Produces OTLP-compliant traces and logsDeploys and manages OTel Collectors
Injects env vars and config into agent PodsConfigures Collector processors, exporters, and backends
Sets K8s semantic convention resource attributesChooses backend (Jaeger, Tempo, Grafana, Datadog, etc.)
Provides the endpoint configuration fieldDecides Collector topology (DaemonSet, Gateway, Sidecar)

KubeOpenCode does not deploy Collectors, prescribe topology, or bundle vendor-specific exporters.

The following examples help integrate KubeOpenCode with your existing OTel Collector.

Basic Gateway Collector

receivers:
otlp:
protocols:
http:
endpoint: 0.0.0.0:4318

exporters:
otlphttp:
endpoint: "http://jaeger-collector.observability:4318"

service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlphttp]

k8sattributes Processor

Recommended for auto-enriching spans with Kubernetes metadata:

processors:
k8sattributes:
auth_type: "serviceAccount"
passthrough: false
filter:
node_from_env_var: KUBE_NODE_NAME
extract:
metadata:
- k8s.pod.name
- k8s.namespace.name
- k8s.node.name
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.name

service:
pipelines:
traces:
receivers: [otlp]
processors: [k8sattributes]
exporters: [otlphttp]

spanmetrics Connector (with ai.* to gen_ai.* Mapping)

The OTel Collector's spanmetrics connector derives metrics from spans, but it looks for gen_ai.* attributes. Since the AI SDK emits ai.* attributes, use the transform processor to rename them:

processors:
transform/ai-to-genai:
error_mode: ignore
trace_statements:
- context: span
statements:
- set(attributes["gen_ai.system"], attributes["ai.model.provider"]) where attributes["ai.model.provider"] != nil
- set(attributes["gen_ai.request.model"], attributes["ai.model.id"]) where attributes["ai.model.id"] != nil
- set(attributes["gen_ai.usage.input_tokens"], attributes["ai.usage.inputTokens"]) where attributes["ai.usage.inputTokens"] != nil
- set(attributes["gen_ai.usage.output_tokens"], attributes["ai.usage.outputTokens"]) where attributes["ai.usage.outputTokens"] != nil

connectors:
spanmetrics:
histogram:
explicit:
buckets: [100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s]
dimensions:
- name: gen_ai.request.model
- name: gen_ai.system
- name: kubeopencode.agent.name
- name: kubeopencode.task.namespace
metrics_flush_interval: 15s

service:
pipelines:
traces:
receivers: [otlp]
processors: [transform/ai-to-genai]
exporters: [spanmetrics, otlphttp]
metrics:
receivers: [spanmetrics]
exporters: [prometheus]

Local Development with Jaeger

For local development and testing, deploy an OTel Collector + Jaeger to your Kind cluster:

# Load images into Kind
make load-otel-images

# Deploy OTel Collector + Jaeger
kubectl apply -f deploy/local-dev/otel-observability.yaml

# Enable observability in KubeOpenCodeConfig
kubectl patch kubeopencodeconfig cluster --type=merge -p \
'{"spec":{"observability":{"openTelemetry":{"enabled":true,"endpoint":"http://otel-collector.observability:4318","enableLLMTraces":true}}}}'

# Port-forward Jaeger UI
kubectl port-forward -n observability svc/jaeger-query 16686:16686

Open http://localhost:16686 to query traces. Select the opencode service to see LLM call spans.

Backend Compatibility

KubeOpenCode produces OTLP-compliant data. Any OTLP-capable backend can receive it through your Collector:

BackendNotes
JaegerNative OTLP support in v1.35+
Grafana TempoFirst-class OTLP; recommended for Kubernetes
SigNozNative OTLP support
HoneycombOTLP via x-honeycomb-team header
DatadogOTLP ingest endpoint available
Prometheus / MimirVia Collector's prometheusremotewrite exporter
Elastic, Splunk, New Relic, DynatraceAll support OTLP via vendor or standard exporters