How to monitor RabbitMQ metrics using OpenTelemetry
Why monitor RabbitMQ at all
RabbitMQ is rarely the thing you set out to monitor — but it’s almost always in the path. A queue sits between the service that publishes work and the service that consumes it. When the queue backs up, when consumers drop to zero, or when the node hits a memory or disk alarm, every integration downstream stalls quietly. The publishing service still returns 200s. The consuming service looks idle. And messages pile up in between.
That’s exactly the kind of failure that hides from service-by-service monitoring, which is why the broker deserves first-class telemetry.
Why OpenTelemetry — not a proprietary agent
The traditional way to monitor RabbitMQ is to install a vendor’s agent that knows how to talk to RabbitMQ and ships those metrics into that vendor’s cloud, in that vendor’s format. It works. But it comes with strings: a proprietary agent to install and maintain, metrics locked into one backend, and — because every other system needs its own agent — a patchwork of tools that don’t share a data model.
OpenTelemetry takes the opposite approach. The OpenTelemetry Collector has a RabbitMQ receiver built in. You configure one standard pipeline, and because OTLP is an open standard implemented across many tools and backends, the same configuration and the same data work with whatever you point it at. Switch backends without re-instrumenting. Self-host the Collector so your telemetry never leaves your environment. Use the same Collector for RabbitMQ, your database, your services, and everything else — one data model, not ten.
In short: the proprietary agent monitors RabbitMQ for one tool. The OpenTelemetry receiver monitors it for any tool that speaks OTLP — which is most of them now.
Prerequisites
The RabbitMQ receiver reads from the RabbitMQ Management Plugin, so enable it and create a monitoring user:
    rabbitmq-plugins enable rabbitmq_management

    # create a least-privilege monitoring user (monitoring tag)
    rabbitmqctl add_user otel_monitor 'a-strong-password'
    rabbitmqctl set_user_tags otel_monitor monitoring
You’ll also need an OpenTelemetry Collector build that includes contrib receivers (e.g. otelcol-contrib).
The Collector configuration
    receivers:
      rabbitmq:
        endpoint: http://localhost:15672   # Management Plugin endpoint
        username: otel_monitor
        password: ${env:RABBITMQ_MONITORING_PASSWORD}
        collection_interval: 60s

    exporters:
      otlphttp/sluicio:
        # otlphttp expects a full URL including the scheme
        endpoint: https://your-tenant-ingest.sluicio.com:4318   # your Sluicio ingest endpoint
        headers:
          authorization: "Bearer ${env:SLUICIO_INGEST_TOKEN}"

    service:
      pipelines:
        metrics:
          receivers: [rabbitmq]
          exporters: [otlphttp/sluicio]
That’s the whole pipeline: scrape the Management API every 60 seconds (the collection_interval set above) and export over OTLP. To use a different OTLP-compatible backend, change the exporter’s endpoint and headers.
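As a rough illustration of that portability, the fragment below sends the same RabbitMQ metrics to a second backend alongside the first. It is only a sketch: the exporter name otlphttp/other and the URL otlp.example.com are placeholders, not real endpoints, and the fragment assumes it is merged into the configuration above.

```yaml
exporters:
  # Hypothetical second OTLP backend; name and URL are placeholders.
  otlphttp/other:
    endpoint: https://otlp.example.com:4318

service:
  pipelines:
    metrics:
      receivers: [rabbitmq]
      # The receiver is untouched; only the exporter list changes.
      exporters: [otlphttp/sluicio, otlphttp/other]
```

Dropping otlphttp/sluicio from the list would switch backends outright, with no change on the RabbitMQ side.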
The RabbitMQ receiver is currently a beta component, so the exact metric set and field names may still change. Check the receiver’s documentation.md and metadata.yaml in collector-contrib for the current list before you build dashboards on specific names.
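One low-effort way to see what your Collector build actually emits is to add the debug exporter next to the real one for a while. This is a sketch, assuming a recent Collector release that includes the debug exporter and that the fragment is merged into the configuration above:

```yaml
exporters:
  # Prints every metric, attribute, and data point to the Collector's own log.
  debug:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [rabbitmq]
      exporters: [debug, otlphttp/sluicio]
```

The Collector log then shows the metric names and attributes as this version of the receiver produces them. Remove the debug exporter once the names are confirmed, since detailed output is noisy.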
What to actually watch — and the exact metric
Every queue-level metric is tagged with the resource attributes rabbitmq.queue.name, rabbitmq.node.name, and rabbitmq.vhost.name, so you can break any of them down per queue. These queue-level metrics are enabled by default:
| What you’re watching | Metric | How to read it |
|---|---|---|
| Backlog / queue depth | rabbitmq.message.current where state=ready | Messages waiting for a consumer. A steady climb means consumers can’t keep up. |
| In-flight / stuck work | rabbitmq.message.current where state=unacknowledged | Delivered but not yet acked. A rising unacked count usually means a consumer is stuck mid-processing. |
| Consumer presence | rabbitmq.consumer.count | Consumers currently reading the queue. Zero on a queue that should have consumers is a silent outage: messages arrive and nobody drains them. |
| Throughput in | rabbitmq.message.published | Monotonic counter, so take the rate: how fast messages are arriving. |
| Throughput out | rabbitmq.message.delivered and rabbitmq.message.acknowledged | Rates of delivery and acknowledgement. If the published rate outpaces these for any sustained period, you’re falling behind. |
| Unroutable messages | rabbitmq.message.dropped | Messages dropped as unroutable, almost always a binding / routing-key problem. |
A key detail: rabbitmq.message.current carries a state attribute with values ready and unacknowledged. So “queue depth” and “unacked work” aren’t two separate metrics — they’re the same metric filtered by state. Don’t go hunting for a second metric name.
Node health is disabled by default — opt in
The node-level metrics ship disabled, and that includes the alarm signals that matter most. You have to turn them on explicitly:
    receivers:
      rabbitmq:
        endpoint: http://localhost:15672
        username: otel_monitor
        password: ${env:RABBITMQ_MONITORING_PASSWORD}
        collection_interval: 60s
        metrics:
          rabbitmq.node.mem_alarm:
            enabled: true
          rabbitmq.node.disk_free_alarm:
            enabled: true
          rabbitmq.node.fd_used:
            enabled: true
          rabbitmq.node.fd_total:        # needed to compare used vs total
            enabled: true
          rabbitmq.node.sockets_used:
            enabled: true
          rabbitmq.node.sockets_total:   # needed to compare used vs total
            enabled: true
| What you’re watching | Metric | How to read it |
|---|---|---|
| Memory alarm | rabbitmq.node.mem_alarm | Trips when RabbitMQ blocks publishers due to memory pressure: a direct “broker in trouble” signal. |
| Disk alarm | rabbitmq.node.disk_free_alarm | Trips when free disk drops below the limit and publishers are blocked. |
| File descriptors | rabbitmq.node.fd_used vs rabbitmq.node.fd_total | Exhausting file descriptors blocks new connections. |
| Sockets | rabbitmq.node.sockets_used vs rabbitmq.node.sockets_total | Same story for socket exhaustion. |
The single most useful alert for most estates: rabbitmq.consumer.count == 0 on a queue that should always have a consumer. It catches the silent outage that service-level monitoring never sees.
For higher-resolution message-rate metrics, you can additionally enable the rabbitmq_prometheus plugin and scrape it with the Collector’s Prometheus receiver. It is still fully OpenTelemetry-native: just a second receiver in the same pipeline.
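A minimal sketch of that second receiver follows. It assumes the plugin has been enabled with rabbitmq-plugins enable rabbitmq_prometheus and is listening on its default port 15692. The receiver name prometheus/rabbitmq, the job name, and the 15s interval are illustrative choices, not requirements.

```yaml
receivers:
  prometheus/rabbitmq:
    config:
      scrape_configs:
        - job_name: rabbitmq
          scrape_interval: 15s               # illustrative; tune to your needs
          static_configs:
            - targets: ["localhost:15692"]   # default rabbitmq_prometheus port

service:
  pipelines:
    metrics:
      # Both receivers feed the same pipeline and the same exporter.
      receivers: [rabbitmq, prometheus/rabbitmq]
      exporters: [otlphttp/sluicio]
```

Metrics scraped this way keep the plugin’s own Prometheus-style names rather than the rabbitmq.* names in the tables above, so dashboards built on one set will not automatically pick up the other.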
The point
Monitoring RabbitMQ is table stakes. Doing it through OpenTelemetry means you do it once, in a portable, vendor-neutral way, and the broker’s health lands in the same place as the rest of your integration estate — so you can finally see the queue as part of the flow, not as an isolated box.