RabbitMQ
The message broker you actually understand on day one
How It Works Internally
RabbitMQ is one of those tools that clicks fast. The mental model is simple: publishers send messages to exchanges, exchanges route them to queues based on rules, and consumers pull from queues. Messages never go directly to a queue. That indirection through exchanges is what makes the routing so flexible.
The protocol underneath is AMQP 0-9-1, and the routing works through three primitives: exchanges, bindings, and queues. A direct exchange routes by exact routing key match. A fanout exchange broadcasts to every bound queue, ignoring routing keys entirely. A topic exchange matches routing keys against wildcard patterns (* for one word, # for zero or more). A headers exchange matches on message header attributes instead of routing keys.
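The topic wildcard semantics are easy to get wrong, so here is a small illustrative matcher (not RabbitMQ's actual implementation) showing how `*` and `#` behave on dot-separated routing keys:

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Illustrative topic-exchange matcher: '*' matches exactly one word,
    '#' matches zero or more words (words are dot-separated)."""
    def match(p, k):
        if not p:
            return not k                       # pattern exhausted: key must be too
        if p[0] == "#":
            # '#' can absorb zero or more words
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

print(topic_matches("logs.*.error", "logs.payment.error"))  # True
print(topic_matches("logs.#", "logs"))                      # True: '#' matches zero words
print(topic_matches("logs.*", "logs.payment.error"))        # False: '*' is exactly one word
```

Note the asymmetry: `logs.#` matches `logs` itself, while `logs.*` requires exactly one extra word.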
Internally, each classic queue is a single Erlang process backed by a message store on disk. Messages arrive, optionally get persisted (if delivery_mode=2), and sit in memory until consumed or paged to disk under memory pressure. The Erlang VM's lightweight process model lets RabbitMQ run thousands of queues concurrently. But each individual queue is single-threaded. One slow consumer on one queue becomes a bottleneck for that queue. Keep that in mind when designing the topology.
Quorum queues (introduced in 3.8) use the Raft consensus protocol. They replicate messages across a configurable number of nodes (typically 3 or 5), and a write only commits when a majority acknowledges. This is a big upgrade over classic mirrored queues, which used async replication and had known data loss windows during network partitions. For anything that matters, use quorum queues. Classic mirrored queues are deprecated since 3.13 for good reason.
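The majority rule is worth spelling out, since it explains why 3 and 5 are the usual replica counts:

```python
def quorum(n: int) -> int:
    """Raft majority: smallest number of replica acks that commits a write."""
    return n // 2 + 1

for replicas in (3, 5):
    tolerated = replicas - quorum(replicas)
    print(f"{replicas} replicas: commit needs {quorum(replicas)} acks, "
          f"survives {tolerated} node failure(s)")
# 3 replicas: commit needs 2 acks, survives 1 node failure(s)
# 5 replicas: commit needs 3 acks, survives 2 node failure(s)
```

Even replica counts buy nothing: 4 replicas need 3 acks and still tolerate only 1 failure, which is why odd sizes are standard. With the Python pika client, declaring a quorum queue is a matter of passing `arguments={"x-queue-type": "quorum"}` to `queue_declare`.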
Streams (introduced in 3.9) bring Kafka-like append-only log semantics into RabbitMQ. Messages get stored in segment files on disk with offset-based consumption, so they can be replayed and have multiple consumers reading at different positions. Streams hit 1M+ messages/sec throughput and work well for fan-out patterns where several consumers need the same data. That said, for serious stream processing with consumer groups and exactly-once semantics, Kafka is still the better tool. Streams are good enough for many use cases, not all of them.
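A toy model of offset-based consumption clarifies the contrast with queue semantics, where a consumed message is gone:

```python
class Log:
    """Toy append-only log: entries keep their offsets, so independent
    consumers can read from different positions and replay at will."""
    def __init__(self):
        self.entries = []

    def append(self, msg) -> int:
        self.entries.append(msg)
        return len(self.entries) - 1           # offset of the new entry

    def read(self, offset: int, max_count: int = 10):
        return self.entries[offset:offset + max_count]

log = Log()
for m in ("a", "b", "c"):
    log.append(m)

print(log.read(0))   # ['a', 'b', 'c']  -- a consumer replaying from the start
print(log.read(2))   # ['c']            -- a second consumer further ahead
```

Nothing is deleted on read, which is exactly what enables replay and multiple independent consumers.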
Production Architecture
Run a minimum of 3 nodes in a cluster across availability zones. Use quorum queues for all critical workloads with a replication factor that matches the cluster size (3 or 5). Set vm_memory_high_watermark to 0.4 (40% of RAM) and disk_free_limit to at least 2x the RAM size, so that paging under memory pressure cannot also trip the disk alarm and block publishers. If any classic mirrored queues remain, set ha-promote-on-shutdown: always so graceful rolling upgrades promote a mirror instead of losing the queue.
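The watermark settings above map to rabbitmq.conf keys like this (an illustrative fragment; tune the values against your own hardware):

```ini
# rabbitmq.conf -- values match the guidance above, not universal defaults
vm_memory_high_watermark.relative = 0.4
disk_free_limit.relative = 2.0
```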
On the publishing side, turn on publisher confirms (RabbitMQ's version of producer acks) for guaranteed delivery. On the consumer side, set prefetch_count between 10 and 50 per consumer. Going too high starves other consumers. Going too low wastes network round trips. In practice, prefetch=25 tends to be the sweet spot for most workloads, but benchmark the specific workload. Use lazy queues or quorum queues for bursty workloads where messages might pile up.
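To see why both extremes hurt, here is a toy simulation (not the broker's actual scheduler) of round-robin dispatch to one slow and one fast consumer under different prefetch limits:

```python
def run(n_messages, ack_rates, prefetch):
    """Toy broker dispatch loop. Each tick: round-robin messages to any
    consumer with spare credit (unacked < prefetch), then each consumer
    acks up to its per-tick rate. prefetch=None models unlimited (0)."""
    unacked = [0] * len(ack_rates)
    done = [0] * len(ack_rates)
    remaining = n_messages
    ticks = 0
    while remaining or any(unacked):
        ticks += 1
        while remaining:                       # distribute while anyone has credit
            gave = False
            for c in range(len(ack_rates)):
                if remaining and (prefetch is None or unacked[c] < prefetch):
                    unacked[c] += 1
                    remaining -= 1
                    gave = True
            if not gave:
                break
        for c, rate in enumerate(ack_rates):   # consumers ack at their own pace
            acked = min(unacked[c], rate)
            unacked[c] -= acked
            done[c] += acked
    return done, ticks

# One slow consumer (1 ack/tick) and one fast one (10 acks/tick):
print(run(100, (1, 10), None))  # ([50, 50], 50): slow consumer hoards half the work
print(run(100, (1, 10), 10))    # ([18, 82], 18): fast consumer does most of it
print(run(100, (1, 10), 2))     # ([34, 66], 34): fast consumer is throttled
```

Unlimited prefetch lets the slow consumer buffer messages the fast one could have processed; a tiny prefetch caps the fast consumer instead. A moderate value finishes the batch far sooner, which is the sweet-spot argument in miniature.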
For monitoring, expose metrics via the management plugin or Prometheus exporter and track these: queue_messages_ready (the backlog), queue_messages_unacknowledged (in-flight work), channel_consumers (consumer count), memory usage relative to the watermark, and file descriptor usage. Set alerts when any queue's ready message count exceeds the SLA threshold. Without monitoring mem_alarm, expect a 3 AM surprise eventually.
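With the Prometheus exporter, a backlog alert might look like the following sketch (the metric name comes from the rabbitmq_prometheus plugin; the threshold, alert name, and duration are placeholders to adapt to your SLA):

```yaml
groups:
  - name: rabbitmq
    rules:
      - alert: RabbitMQQueueBacklog
        expr: rabbitmq_queue_messages_ready > 10000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Queue backlog above threshold for 5 minutes"
```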
Capacity Planning
A single RabbitMQ node handles 20,000 to 50,000 messages/sec for persistent messages with publisher confirms, or 100,000+ messages/sec for transient messages. Quorum queues add roughly 30% overhead compared to classic queues because of Raft replication. Streams blow past both at 1M+ messages/sec by bypassing the per-queue Erlang process bottleneck.
Memory math: each queued message eats about 750 bytes of overhead on top of the payload. A queue holding 1 million 1KB messages consumes roughly 1.7GB of memory. When memory exceeds the high watermark, the broker pages messages to disk and blocks all publishers. Every single one, across every vhost. This is a cliff, not a gradual degradation. It causes cascading failures. Set x-max-length on all queues and dead-letter the overflow. Do not skip this.
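The memory math above, spelled out using the 750-byte figure:

```python
OVERHEAD = 750            # approximate per-message broker overhead, bytes
payload = 1024            # 1 KB payload
count = 1_000_000

total_gb = count * (payload + OVERHEAD) / 1e9
print(f"{total_gb:.2f} GB")   # 1.77 GB for 1M queued 1KB messages
```

Note the overhead alone is 0.75 GB here; with small payloads, per-message overhead dominates the budget.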
Erlang processes: each connection uses 2 processes, each channel uses 1, and each queue uses 1. The default process limit is 1,048,576. A deployment with 10,000 connections, 20,000 channels, and 5,000 queues consumes 45,000 processes. Well within limits, but worth monitoring in a multi-tenant setup.
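The same arithmetic as a quick check:

```python
connections, channels, queues = 10_000, 20_000, 5_000
limit = 1_048_576                                  # default Erlang process limit

processes = connections * 2 + channels + queues    # 2 per conn, 1 per channel/queue
print(processes, f"= {processes / limit:.1%} of the default limit")
# 45000 = 4.3% of the default limit
```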
Failure Scenarios
Scenario 1: Network partition splits a 3-node cluster into [2, 1]. With the default pause_minority partition handling, the minority node pauses and stops serving traffic. Quorum queues on the majority side keep working. With autoheal mode, the minority node auto-restarts after the partition heals, but any non-replicated messages on the minority are lost. Use pause_minority. It's the safer default. Detection: monitor the partitions metric and Erlang distribution port connectivity. Recovery: with pause_minority, the paused node resumes on its own once the partition heals. Prevention: run quorum queues for anything critical, so the majority side never loses data.
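In rabbitmq.conf the partition-handling mode is a single key:

```ini
# rabbitmq.conf: pause the minority side rather than auto-healing
cluster_partition_handling = pause_minority
```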
Scenario 2: Memory watermark triggered, all publishers blocked. A consumer dies, messages pile up, and memory hits 40% of RAM. The broker fires the memory alarm and blocks every publisher across every vhost, including completely unrelated ones. This is the single most common RabbitMQ outage pattern I've seen. Detection: monitor mem_alarm and queue_messages_ready per queue. Recovery: restart the failed consumer, add more consumers, or purge the queue if the messages are expendable. Prevention: set per-queue x-max-length limits with dead-letter exchanges, and use separate vhosts with individual memory limits to isolate workloads. Do not run a shared RabbitMQ cluster without queue-level length and TTL policies. Regret is guaranteed.
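A policy along these lines applies the cap cluster-wide via rabbitmqctl (the policy name, the 100000 limit, the one-day TTL, and the "dlx" exchange name are all placeholders to size from your own memory budget):

```shell
# Cap every queue, expire stale messages, and dead-letter the overflow
rabbitmqctl set_policy limit-all "^" \
  '{"max-length":100000,"message-ttl":86400000,"dead-letter-exchange":"dlx"}' \
  --apply-to queues
```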
Pros
- • Rich routing with exchanges (direct, topic, fanout, headers)
- • Supports multiple protocols (AMQP, MQTT, STOMP)
- • Message acknowledgment and dead-letter queues
- • Priority queues and TTL support
- • Easy to set up and operate for small-medium scale
Cons
- • Lower throughput than Kafka for streaming workloads
- • Messages are deleted after consumption (not replayable)
- • Clustering can be complex at large scale
- • Memory pressure under high message backlog
- • Not designed for event sourcing or log-based systems
When to use
- • Need flexible message routing patterns
- • Task queues with acknowledgment and retries
- • Request-reply or RPC patterns over messaging
- • Small-to-medium scale with simpler operations
When NOT to use
- • Need event replay or long-term message storage
- • Ultra-high throughput streaming (millions/sec)
- • Event sourcing or CQRS architectures
- • Need strong message ordering across partitions
Key Points
- • AMQP protocol provides flexible routing through exchanges (direct, fanout, topic, headers) bound to queues via routing keys
- • Message acknowledgment provides at-least-once delivery. Consumers must explicitly ack or nack, and nacking triggers redelivery or dead-lettering
- • Quorum queues (Raft-based) replace classic mirrored queues for HA. They provide real data safety guarantees and automatic leader election
- • Dead letter exchanges catch rejected, expired, or queue-length-exceeded messages. These are essential for reliable retry and poison-message handling
- • Prefetch count (QoS) controls flow by limiting unacknowledged messages per consumer, which stops fast producers from burying slow consumers
- • Streams (RabbitMQ 3.9+) provide Kafka-like append-only log semantics within RabbitMQ for replay and time-based offset consumption
Common Mistakes
- ✗ Not using quorum queues for HA. Classic mirrored queues have known data loss scenarios during network partitions and are deprecated since 3.13
- ✗ Setting prefetch too high or unlimited (0). One consumer buffers thousands of messages while others sit idle, destroying throughput fairness
- ✗ Not configuring dead letter exchanges. Failed messages get redelivered forever, creating a poison-message loop that blocks the queue
- ✗ Ignoring memory and disk watermarks. When the broker hits the memory high watermark (default 40% of RAM), it blocks all publishers globally
- ✗ Treating RabbitMQ as a database by storing millions of unprocessed messages. This causes severe memory pressure, GC storms, and eventual node failure