Tracing (Tempo)
Purpose: Shows platform engineers how to configure Tempo storage and sampling strategies, with TraceQL query examples.
Task Summary
Tempo stores distributed traces received from the OpenTelemetry Collector. Grafana queries Tempo to display trace timelines, service dependency maps, and span details. This guide covers trace ingestion, storage, querying, and sampling configuration.
Prerequisites
- Tempo deployed via FluxCD from openCenter-gitops-base
- OpenTelemetry Collector configured to export traces to Tempo
- Grafana configured with Tempo as a data source (default in openCenter)
Trace Ingestion
Tempo receives traces via OTLP from the OpenTelemetry Collector. The Collector is configured to export to Tempo's gRPC endpoint:
# OpenTelemetry Collector exporter configuration
exporters:
  otlp/tempo:
    endpoint: tempo.monitoring.svc.cluster.local:4317
    tls:
      insecure: true  # Within cluster network
Applications send traces to the OpenTelemetry Collector (not directly to Tempo). See OpenTelemetry for Collector configuration.
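For the exporter to take effect, it must also be referenced in the Collector's traces pipeline. A minimal sketch, assuming the standard `otlp` receiver and `batch` processor (match the names to your actual Collector configuration):

```yaml
# Collector service pipeline wiring (sketch; receiver and processor
# names are assumptions -- adjust to your deployed Collector config)
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
```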
Query Traces with TraceQL
Access Grafana's Explore view and select the Tempo data source.
Find traces by service name
{resource.service.name = "my-app"}
Find traces with errors
{status = error}
Find slow spans
{duration > 500ms}
Combine filters
{resource.service.name = "my-app" && span.http.status_code >= 500 && duration > 1s}
Search by trace ID
Paste a trace ID directly into the Grafana Explore search bar to view the full trace timeline.
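The same TraceQL filters also work against Tempo's HTTP search API, which is handy for scripting outside Grafana. A sketch, assuming Tempo is reachable locally (for example via the port-forward shown under Verification):

```shell
# Search Tempo with a TraceQL expression and list matching trace IDs
# (assumes a port-forward to tempo:3200 is active, and jq is installed)
curl -sG http://localhost:3200/api/search \
  --data-urlencode 'q={resource.service.name = "my-app" && status = error}' \
  --data-urlencode 'limit=5' | jq '.traces[].traceID'
```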
Sampling Strategies
Not all traces need to be stored. Sampling reduces storage costs while preserving visibility into errors and slow requests.
Configure sampling in the OpenTelemetry Collector's processor pipeline:
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-requests
        type: latency
        latency:
          threshold_ms: 1000
      - name: probabilistic
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
This configuration keeps all error traces, all traces slower than 1 second, and 10% of remaining traces.
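Because the policies are OR-ed (a trace is kept if any policy matches), the retained volume can be estimated with simple arithmetic. A back-of-envelope sketch, where the error and slow-trace rates are hypothetical placeholders for your own workload:

```python
def retained_fraction(error_rate: float, slow_rate: float, prob_pct: float) -> float:
    """Approximate fraction of traces kept when tail-sampling policies
    are OR-ed: all errors, all slow traces, plus a probabilistic
    percentage of the remainder. Assumes the error and slow sets
    are disjoint (a simplification)."""
    always_kept = error_rate + slow_rate
    remainder = 1.0 - always_kept
    return always_kept + remainder * (prob_pct / 100.0)

# Hypothetical workload: 2% errors, 3% slow, 10% probabilistic sampling
kept = retained_fraction(0.02, 0.03, 10)
print(f"{kept:.3f}")  # 0.145 -- roughly 85% storage reduction
```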
Storage Configuration
Tempo stores traces in a backend configured in the HelmRelease values:
# applications/overlays/&lt;cluster&gt;/services/tempo/override-values.yaml
tempo:
  storage:
    trace:
      backend: local
      local:
        path: /var/tempo/traces
      wal:
        path: /var/tempo/wal
persistence:
  enabled: true
  storageClassName: longhorn
  size: 50Gi
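Whether 50Gi is sufficient depends on trace volume. A rough sizing sketch, with hypothetical ingest figures (real usage varies with span size, compression, and compaction):

```python
def retention_days(volume_gib: float, spans_per_sec: float, bytes_per_span: float) -> float:
    """Estimate how many days of traces fit in a volume of the given
    size, ignoring WAL overhead and compaction savings (a simplification)."""
    daily_bytes = spans_per_sec * bytes_per_span * 86_400  # seconds per day
    return volume_gib * 2**30 / daily_bytes

# Hypothetical: 500 spans/s at ~1 KiB each into a 50 GiB volume
days = retention_days(50, 500, 1024)
print(f"{days:.1f} days")  # 1.2 days -- a busy cluster fills 50Gi quickly
```

If the estimate comes out shorter than your desired retention, either increase the volume size, tighten the sampling policies above, or move to S3-backed storage.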
For larger clusters, use S3-compatible storage:
tempo:
  storage:
    trace:
      backend: s3
      s3:
        bucket: tempo-traces
        endpoint: minio.storage.svc.cluster.local:9000
        insecure: true
Verification
# Check Tempo pods are running
kubectl get pods -n monitoring -l app.kubernetes.io/name=tempo
# Verify Tempo is ready (run the port-forward in a separate terminal)
kubectl port-forward svc/tempo -n monitoring 3200:3200
curl -s http://localhost:3200/ready
# Should return "ready"
# Check trace count via Tempo API
curl -s "http://localhost:3200/api/search?limit=5" | jq .
Troubleshooting
No traces appearing in Grafana: Verify the OpenTelemetry Collector is exporting to Tempo. Check Collector logs:
kubectl logs -n monitoring -l app.kubernetes.io/name=opentelemetry-collector --tail=20
"too many traces" / high storage usage:
Enable tail sampling in the OpenTelemetry Collector (see Sampling Strategies above). Reduce retention in Tempo's compactor config.
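To confirm whether the Collector itself is dropping spans, its internal metrics can be checked. A sketch, assuming the default internal metrics port 8888 and a service named opentelemetry-collector (both are assumptions; adjust to your deployment):

```shell
# Check the Collector's own metrics for failed span exports
# (service name and port 8888 are assumptions for this sketch)
kubectl port-forward -n monitoring svc/opentelemetry-collector 8888:8888 &
curl -s http://localhost:8888/metrics | grep otelcol_exporter_send_failed_spans
```

A steadily increasing counter here points at the Collector-to-Tempo connection rather than at Tempo itself.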
Further Reading
- Stack Overview — how Tempo fits into the observability stack
- OpenTelemetry — Collector pipeline configuration
- Dashboards & Alerts — trace-linked dashboards in Grafana