Stack Overview
Purpose: Explains to platform engineers how the observability components work together, covering the data flow from pods to dashboards.
Concept Summary
openCenter deploys a full observability stack through openCenter-gitops-base via FluxCD. The stack covers three signal types — metrics, logs, and traces — with a unified visualization layer. Each component handles one signal type, and OpenTelemetry acts as the collection and routing layer that ties them together.
| Signal | Collector | Storage | Query |
|---|---|---|---|
| Metrics | Prometheus (scrape) + OpenTelemetry (OTLP) | Prometheus TSDB | PromQL via Grafana |
| Logs | Promtail / OpenTelemetry | Loki | LogQL via Grafana |
| Traces | OpenTelemetry Collector | Tempo | TraceQL via Grafana |
| Visualization | — | — | Grafana (all signals) |
How It Works
Data Flow
Pods and services emit three signal types. Prometheus scrapes metrics endpoints directly; Promtail tails container logs and pushes them to Loki; applications send traces (and optionally metrics and logs) over OTLP to the OpenTelemetry Collector, which routes each signal to its backend (Prometheus, Loki, or Tempo). Grafana sits on top and queries all three stores.
Component Roles
Prometheus scrapes metrics from pods and services using ServiceMonitor and PodMonitor CRDs. It stores time-series data in its local TSDB and evaluates alerting/recording rules. Deployed as part of kube-prometheus-stack.
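As a minimal sketch, a ServiceMonitor that tells Prometheus to scrape a Service might look like the following (the app name, namespace, and port name are hypothetical and must match your own Service):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app            # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app         # must match the target Service's labels
  namespaceSelector:
    matchNames:
      - example-namespace      # where the Service lives
  endpoints:
    - port: metrics            # named port on the Service exposing /metrics
      interval: 30s
```

The Prometheus Operator watches these CRDs and generates the scrape configuration automatically, so no Prometheus restart or manual config edit is needed.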
Alertmanager receives alerts from Prometheus, deduplicates them, groups related alerts, and routes them to notification channels (email, Slack, PagerDuty, webhooks). Also part of kube-prometheus-stack.
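The grouping and routing behavior is driven by Alertmanager's route tree. A sketch of such a config, with hypothetical receiver names and a placeholder integration key:

```yaml
route:
  receiver: default                  # fallback receiver
  group_by: ["alertname", "namespace"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - matchers:
        - severity="critical"        # critical alerts escalate to PagerDuty
      receiver: pagerduty
receivers:
  - name: default
    slack_configs:
      - channel: "#alerts"           # hypothetical channel
        api_url: https://hooks.slack.com/services/...   # your webhook URL
  - name: pagerduty
    pagerduty_configs:
      - routing_key: <integration-key>                  # placeholder
```

Alerts sharing the same `group_by` labels are batched into a single notification, which keeps channels quiet during a cascading failure.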
Grafana provides dashboards for all three signal types. It queries Prometheus (PromQL), Loki (LogQL), and Tempo (TraceQL) as data sources. Pre-configured dashboards are deployed via ConfigMaps. Also part of kube-prometheus-stack.
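The three data sources can be wired up through Grafana's provisioning mechanism. A sketch, assuming in-cluster service names and default ports (both depend on how the charts are deployed):

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus-operated:9090   # service name from prometheus-operator
    isDefault: true
  - name: Loki
    type: loki
    url: http://loki:3100
  - name: Tempo
    type: tempo
    url: http://tempo:3200                 # Tempo's HTTP port varies by version
```

Provisioned data sources are created at startup, so dashboards shipped via ConfigMaps can reference them by name.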
Loki stores and indexes log streams. It receives logs from Promtail (DaemonSet that tails container logs) or from the OpenTelemetry Collector. Loki indexes labels (namespace, pod, container) but stores log lines unindexed, keeping storage costs low.
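The indexed labels come from Promtail's relabeling of Kubernetes pod metadata. A sketch of the relevant scrape config (a simplified subset of a real Promtail setup):

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                # discover pods via the Kubernetes API
    relabel_configs:
      # Promote pod metadata to Loki index labels
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_container_name]
        target_label: container
```

Only these labels are indexed; the log lines themselves are stored as compressed chunks and filtered at query time.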
Tempo stores distributed traces. It receives spans via OTLP from the OpenTelemetry Collector and stores them in an object store or local filesystem. Grafana queries Tempo to visualize trace timelines and service maps.
OpenTelemetry Collector acts as a vendor-neutral telemetry pipeline. It receives traces, metrics, and logs via OTLP, processes them (batching, filtering, enrichment), and exports to the appropriate backend (Prometheus, Loki, Tempo).
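A sketch of a Collector config wiring OTLP input to the three backends described above (endpoints are hypothetical in-cluster addresses; exporter availability varies by Collector distribution, and the `prometheusremotewrite` exporter requires Prometheus to have remote-write receiving enabled):

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
processors:
  batch: {}                      # batch telemetry before export
exporters:
  otlp/tempo:
    endpoint: tempo:4317         # Tempo's OTLP gRPC endpoint
    tls:
      insecure: true             # assumes in-cluster plaintext traffic
  prometheusremotewrite:
    endpoint: http://prometheus-operated:9090/api/v1/write
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [loki]
```

Each `service.pipelines` entry is an independent receiver → processor → exporter chain, which is what makes backend swaps a config-only change.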
Trade-offs and Alternatives
Why kube-prometheus-stack instead of standalone Prometheus?
kube-prometheus-stack bundles Prometheus, Alertmanager, Grafana, and a set of recording rules and dashboards for Kubernetes. Deploying them together ensures consistent configuration and pre-built dashboards for node, pod, and cluster health.
Why Loki instead of Elasticsearch?
Loki's label-based indexing uses significantly less storage and memory than Elasticsearch's full-text indexing. For Kubernetes log aggregation where queries are typically filtered by namespace, pod, or container, Loki's approach is a better fit. The trade-off is that full-text search across log content is slower.
Why a separate OpenTelemetry Collector?
The Collector decouples instrumentation from backends. Applications send OTLP and the Collector routes to whatever backends are configured. If a backend changes (e.g., replacing Tempo with Jaeger), applications do not need to be reconfigured.
Common Misconceptions
"Prometheus collects logs and traces too." Prometheus handles metrics only. Logs go to Loki, traces go to Tempo. Grafana unifies the view across all three.
"OpenTelemetry replaces Prometheus." OpenTelemetry can forward metrics to Prometheus, but Prometheus still handles scraping, storage, alerting rules, and recording rules. They are complementary.
Further Reading
- Metrics (Prometheus) — ServiceMonitor configuration and alerting rules
- Logging (Loki) — LogQL queries and retention
- Tracing (Tempo) — trace storage and TraceQL
- Telemetry (OpenTelemetry) — Collector pipelines and auto-instrumentation
- Dashboards & Alerts — Grafana dashboards and Alertmanager routing