
Stack Overview

Purpose: Explains to platform engineers how the observability components work together, covering the data flow from pods to dashboards.

Concept Summary

openCenter deploys a full observability stack through openCenter-gitops-base via FluxCD. The stack covers three signal types — metrics, logs, and traces — with a unified visualization layer. Each component handles one signal type, and OpenTelemetry acts as the collection and routing layer that ties them together.

| Signal  | Collector                                  | Storage         | Query              |
| ------- | ------------------------------------------ | --------------- | ------------------ |
| Metrics | Prometheus (scrape) + OpenTelemetry (OTLP) | Prometheus TSDB | PromQL via Grafana |
| Logs    | Promtail / OpenTelemetry                   | Loki            | LogQL via Grafana  |
| Traces  | OpenTelemetry Collector                    | Tempo           | TraceQL via Grafana |

Visualization: Grafana (all signals)

How It Works

Data Flow
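At a high level, telemetry flows from workloads through per-signal collectors into storage, and Grafana queries all three backends. A simplified sketch (the Collector can also forward metrics and logs, omitted here for clarity):

```
pods / services ──(scrape)──────────────▶ Prometheus ─┐
container logs  ──(Promtail tail)───────▶ Loki ───────┼──▶ Grafana
apps (OTLP)     ──▶ OTel Collector ─────▶ Tempo ──────┘
```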

Component Roles

Prometheus scrapes metrics from pods and services using ServiceMonitor and PodMonitor CRDs. It stores time-series data in its local TSDB and evaluates alerting/recording rules. Deployed as part of kube-prometheus-stack.
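A minimal ServiceMonitor might look like the following sketch; the application name, namespace, and port name are placeholders, not values from this stack:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app          # hypothetical application name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app       # must match the target Service's labels
  endpoints:
    - port: metrics          # named port on the Service
      interval: 30s
      path: /metrics
```

Prometheus discovers Services matching the selector and scrapes each backing pod on the named port.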

Alertmanager receives alerts from Prometheus, deduplicates them, groups related alerts, and routes them to notification channels (email, Slack, PagerDuty, webhooks). Also part of kube-prometheus-stack.
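A sketch of an Alertmanager routing config illustrating grouping and a notification receiver; the Slack webhook URL and channel are placeholders:

```yaml
route:
  group_by: ["alertname", "namespace"]   # collapse related alerts into one notification
  group_wait: 30s                        # wait briefly to batch alerts in a new group
  repeat_interval: 4h                    # re-notify for still-firing alerts
  receiver: slack-default
receivers:
  - name: slack-default
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE_ME
        channel: "#alerts"
```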

Grafana provides dashboards for all three signal types. It queries Prometheus (PromQL), Loki (LogQL), and Tempo (TraceQL) as data sources. Pre-configured dashboards are deployed via ConfigMaps. Also part of kube-prometheus-stack.
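A sketch of a Grafana datasource provisioning file wiring up all three backends; the in-cluster service names and ports are assumptions based on common defaults, not this stack's actual values:

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus-operated:9090   # placeholder service address
  - name: Loki
    type: loki
    url: http://loki:3100                  # placeholder service address
  - name: Tempo
    type: tempo
    url: http://tempo:3200                 # placeholder service address
```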

Loki stores and indexes log streams. It receives logs from Promtail (DaemonSet that tails container logs) or from the OpenTelemetry Collector. Loki indexes labels (namespace, pod, container) but stores log lines unindexed, keeping storage costs low.
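This split shows up directly in LogQL. In the hypothetical query below, the label matchers use Loki's index, while the `|=` line filter scans the (unindexed) log content:

```logql
{namespace="payments", container="api"} |= "error"
```

Queries that narrow by labels first stay fast; broad content-only searches scan more data.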

Tempo stores distributed traces. It receives spans via OTLP from the OpenTelemetry Collector and stores them in an object store or local filesystem. Grafana queries Tempo to visualize trace timelines and service maps.
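In Grafana, traces can be searched with TraceQL; a sketch with a hypothetical service name:

```traceql
{ resource.service.name = "checkout" && duration > 500ms }
```

This selects spans from the `checkout` service that took longer than 500 ms.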

OpenTelemetry Collector acts as a vendor-neutral telemetry pipeline. It receives traces, metrics, and logs via OTLP, processes them (batching, filtering, enrichment), and exports to the appropriate backend (Prometheus, Loki, Tempo).
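A sketch of a Collector config with one pipeline per signal type; the exporter endpoints are placeholder in-cluster addresses, and the exact exporter names depend on the Collector distribution in use:

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}
processors:
  batch: {}          # batch telemetry before export
exporters:
  prometheusremotewrite:
    endpoint: http://prometheus-operated:9090/api/v1/write
  otlphttp/loki:
    endpoint: http://loki:3100/otlp       # Loki's native OTLP ingest path
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true                      # in-cluster traffic, no TLS assumed
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
```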

Trade-offs and Alternatives

Why kube-prometheus-stack instead of standalone Prometheus?

kube-prometheus-stack bundles Prometheus, Alertmanager, Grafana, and a set of recording rules and dashboards for Kubernetes. Deploying them together ensures consistent configuration and pre-built dashboards for node, pod, and cluster health.

Why Loki instead of Elasticsearch?

Loki's label-based indexing uses significantly less storage and memory than Elasticsearch's full-text indexing. For Kubernetes log aggregation where queries are typically filtered by namespace, pod, or container, Loki's approach is a better fit. The trade-off is that full-text search across log content is slower.

Why a separate OpenTelemetry Collector?

The Collector decouples instrumentation from backends. Applications emit OTLP, and the Collector routes the data to whatever backends are configured. If a backend changes (e.g., replacing Tempo with Jaeger), applications do not need to be reconfigured.
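Concretely, only the Collector's exporter wiring changes in such a swap; a sketch, with a placeholder Jaeger endpoint:

```yaml
exporters:
  otlp/jaeger:
    endpoint: jaeger-collector:4317   # Jaeger accepts OTLP natively
service:
  pipelines:
    traces:
      exporters: [otlp/jaeger]        # was the Tempo exporter before
```

Applications keep sending OTLP to the same Collector endpoint throughout.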

Common Misconceptions

"Prometheus collects logs and traces too." Prometheus handles metrics only. Logs go to Loki, traces go to Tempo. Grafana unifies the view across all three.

"OpenTelemetry replaces Prometheus." OpenTelemetry can forward metrics to Prometheus, but Prometheus still handles scraping, storage, alerting rules, and recording rules. They are complementary.

Further Reading