Managed RabbitMQ (Deferred)
Managed RabbitMQ has been assessed against the portfolio selection criteria but is not on the current committed roadmap. This document captures the design direction for when the service is prioritized. See Portfolio Strategy for sequencing criteria.
Purpose: Documents, for platform engineers, the design direction for a managed RabbitMQ messaging service: the capabilities under consideration, the deployment model, and what would trigger roadmap inclusion.
Rationale for Deferral
Per the portfolio strategy, a data service enters the roadmap only when it meets all selection criteria. RabbitMQ currently meets two of the five:
- ✅ Operator maturity — RabbitMQ Cluster Operator (VMware/Broadcom maintained)
- ⚠️ Customer demand — acknowledged but Kafka covers many messaging use cases
- ⚠️ Operational clarity — quorum queue operations, shovel/federation complexity
- ✅ Air-gap viability — standard container images
- ⚠️ Commercial clarity — revenue model not yet defined; overlap with Kafka offering
Design Direction (When Prioritized)
Capabilities Under Consideration
| Capability | Description |
|---|---|
| Quorum queues | Default queue type for durability and consistency |
| Classic queues | Available for compatibility with existing applications |
| Federation | Cross-cluster message forwarding |
| Shovel | Point-to-point message transfer between clusters |
| TLS | cert-manager issued certificates |
| Authentication | Internal user DB + LDAP/OAuth (via Keycloak) |
| Monitoring | Prometheus plugin + Grafana dashboards |
| GitOps lifecycle | RabbitmqCluster CRD in Git, FluxCD reconciliation |
| Management UI | RabbitMQ Management plugin (authenticated) |
Service Tiers (Draft)
| Tier | Topology | Queues | Use Case |
|---|---|---|---|
| Development | Single node | Classic | Local development, testing |
| Standard | 3-node cluster | Quorum (default) | Production messaging |
| Premium | 3+ nodes + federation | Quorum + federation | Multi-site messaging |
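As a rough illustration of how the draft tiers could map onto the Cluster Operator's RabbitmqCluster resource, the sketch below builds a manifest per tier. It assumes the operator's `rabbitmq.com/v1beta1` API with `spec.replicas` and `spec.rabbitmq.additionalPlugins`; the tier keys, the `platform.example/tier` label, and the resource names are hypothetical and not part of any agreed design. Queue type is not encoded here because clients choose it per queue (for example through the `x-queue-type` argument).

```python
import json

# Draft tier parameters taken from the table above. The plugin lists are an
# illustrative assumption, not a decided design.
TIERS = {
    "development": {"replicas": 1, "plugins": []},
    "standard": {"replicas": 3, "plugins": []},
    "premium": {"replicas": 3, "plugins": ["rabbitmq_federation"]},
}


def rabbitmq_cluster_manifest(name: str, namespace: str, tier: str) -> dict:
    """Build a RabbitmqCluster resource for one of the draft tiers."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    params = TIERS[tier]

    spec = {"replicas": params["replicas"]}
    if params["plugins"]:
        # Extra plugins are enabled on top of the operator's defaults.
        spec["rabbitmq"] = {"additionalPlugins": list(params["plugins"])}

    return {
        "apiVersion": "rabbitmq.com/v1beta1",
        "kind": "RabbitmqCluster",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Hypothetical label key, used only for this example.
            "labels": {"platform.example/tier": tier},
        },
        "spec": spec,
    }


# Example: render a Premium-tier cluster as JSON (kubectl accepts JSON as well as YAML).
manifest = rabbitmq_cluster_manifest("orders", "messaging", "premium")
print(json.dumps(manifest, indent=2))
```

In a GitOps flow, the rendered manifest would be committed to Git and reconciled by FluxCD rather than applied by hand.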
Operator
The RabbitMQ Cluster Operator (maintained by VMware/Broadcom) is the primary candidate:
- RabbitmqCluster CRD for cluster lifecycle
- Built-in Prometheus metrics
- TLS and user management
- Active maintenance and regular releases
No formal evaluation has been completed. Assessment will start when the service enters the roadmap.
Integration Points
| Integration | Mechanism |
|---|---|
| Platform observability | Prometheus plugin → kube-prometheus-stack |
| Security | NetworkPolicies + TLS + authentication |
| Backup | Velero PVC snapshots + definition export |
| GitOps | FluxCD reconciliation of RabbitmqCluster CRDs |
Kafka vs RabbitMQ
| Consideration | Kafka | RabbitMQ |
|---|---|---|
| Pattern | Event streaming, log-based | Message queuing, task distribution |
| Retention | Configurable, replay possible | Removed once acknowledged (default) |
| Ordering | Per-partition guaranteed | Per-queue FIFO |
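| Throughput | Very high; millions of messages/sec at cluster scale (append-only log) | Lower; typically thousands to tens of thousands of messages/sec per queue, workload-dependent (broker routing) |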
| Consumer model | Pull-based, consumer groups | Push-based, competing consumers |
If Kafka's streaming model fits, use Managed Kafka. RabbitMQ targets traditional request/reply and task-queue patterns where message acknowledgment and routing flexibility matter more than replay.
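The retention difference in the table can be made concrete with a small in-memory model. This is a toy sketch, not a client for either system: `LogTopic` and `AckQueue` are hypothetical names, and requeue placement is simplified to the head of the queue.

```python
from collections import deque


class LogTopic:
    """Kafka-style append-only log: reading does not remove records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def read_from(self, offset):
        # Consumers track their own offset, so any offset can be re-read.
        return self._records[offset:]


class AckQueue:
    """RabbitMQ-style queue: a message is removed once it is acknowledged."""

    def __init__(self):
        self._ready = deque()
        self._unacked = {}
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def deliver(self):
        """Hand the next message to a consumer as (delivery_tag, message), or None."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        message = self._ready.popleft()
        self._unacked[tag] = message
        return tag, message

    def ack(self, tag):
        # An acknowledged message is gone; it cannot be replayed.
        del self._unacked[tag]

    def nack_requeue(self, tag):
        # A rejected message becomes deliverable again (placed at the head here).
        self._ready.appendleft(self._unacked.pop(tag))

    def depth(self):
        return len(self._ready) + len(self._unacked)


log = LogTopic()
log.append("order-1")
log.append("order-2")
first_read = log.read_from(0)
replayed = log.read_from(0)  # same records again: replay is possible

queue = AckQueue()
queue.publish("task-1")
tag, task = queue.deliver()
queue.ack(tag)  # after the ack the queue is empty
```

The queue model is what makes per-message acknowledgment and redelivery natural for task distribution, while the log model is what makes replay possible.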
What Would Trigger Prioritization
- Customer workloads requiring traditional message queuing (not streaming)
- Applications with complex routing requirements (topic exchanges, headers routing)
- Migration requests from existing RabbitMQ deployments
- Clear commercial differentiation from Kafka offering
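To make the routing trigger more tangible, the following sketch models how an AMQP 0-9-1 topic exchange matches a routing key against binding keys: words are separated by dots, `*` matches exactly one word, and `#` matches zero or more words. It is a simplified model for discussion, not broker code; the function names, binding keys, and queue names are made up for the example.

```python
def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if routing_key matches binding_key under topic-exchange rules."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p: int, w: int) -> bool:
        if p == len(pattern):
            # Pattern exhausted: match only if every word was consumed too.
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


def route(bindings, routing_key):
    """Return the sorted queue names whose binding key matches the routing key."""
    return sorted({queue for key, queue in bindings if topic_matches(key, routing_key)})


# Hypothetical bindings on a single topic exchange.
bindings = [
    ("orders.*.created", "billing"),
    ("orders.#", "audit"),
    ("#.cancelled", "notifications"),
]
targets = route(bindings, "orders.eu.created")
```

Workloads that depend on this kind of broker-side fan-out by pattern are harder to map onto Kafka topics and partitions, which is why they count as a signal for prioritization.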
Further Reading
- Data Services Overview — family overview and principles
- Streaming Blueprint — Kafka alternative for streaming use cases
- Portfolio Strategy — selection criteria and roadmap