RabbitMQ vs Kafka vs ActiveMQ: A Complete Guide to Choosing the Right Message Broker

On This Page
What a message broker does (and why you need one)
The three messaging paradigms
RabbitMQ
Kafka
A critical distinction: Kafka vs Kafka Streams
ActiveMQ
Side-by-side comparison
Decision framework: which broker when?
How this comes up in system design interviews
Common interview follow-ups
Putting it all together
Keep learning
Ask five engineers to pick between Kafka, RabbitMQ, and ActiveMQ and you'll get seven opinions. Ask most of them why, and the answers get vague fast.
The reason this is hard isn't that the tools are complicated — it's that they solve subtly different problems, and most explanations treat them as interchangeable. They're not. Kafka is brilliant for one class of problem, RabbitMQ is brilliant for a different one, and ActiveMQ lives in its own corner of the enterprise world. Choosing wrong means you fight your infrastructure for years.
This guide walks through all three the way I'd explain them to a colleague over coffee. By the end, you'll know:
- What a message broker actually does — and why you need one as soon as you have more than one service
- The three messaging paradigms (point-to-point, pub/sub, log-based) and which one each broker implements
- What Kafka, RabbitMQ, and ActiveMQ each get right, what they get wrong, and which problems each one is built for
- A decision framework you can use in interviews or when you're actually picking a broker at work
- How to answer the canonical system design interview question: "which message broker would you use here?"
Let's dig in.
What a message broker does (and why you need one)
Here's the problem. You have Service A and Service B. A wants to tell B something — "user signed up," "payment completed," "order shipped." The naive way is for A to call B directly over HTTP and wait for a response.
This works until it doesn't. What happens when:
- B is down? A's request fails, and now A has to handle the retry logic.
- B is slow? A is blocked waiting, tying up its own resources.
- There are suddenly ten services that care about the "user signed up" event? A has to know about all ten and call each one itself.
- A is producing events faster than B can process them? Requests pile up, things time out, cascading failures follow.
A message broker sits in the middle and absorbs all of that. A writes an event to the broker and moves on. The broker holds onto it, queues it, delivers it to whoever cares — one consumer, ten consumers, doesn't matter. If B is down, the broker waits. If B is slow, the broker buffers. If A is fast, the broker queues up the backlog.
This pattern is called asynchronous messaging, and it's the foundation of virtually every high-scale distributed system. If you want the deep dive on the async vs sync choice, see synchronous vs asynchronous communication. For a broader tour of messaging patterns, messaging patterns in system design is the pillar post.
The three brokers in this guide all solve that core problem. They differ in how they do it, which determines what each one is actually good at.
The three messaging paradigms
Before naming the brokers, it helps to name the models. There are three conceptual approaches to moving messages between services, and understanding them makes the broker comparison make sense.
1. Point-to-point queues. One producer writes to a queue. One consumer reads from it. Each message is consumed exactly once. Think of it as a work queue — jobs go in, a worker picks them up and does them. RabbitMQ's core model, ActiveMQ's JMS queues, AWS SQS — all point-to-point.
2. Publish-subscribe (pub/sub). One producer publishes to a topic. Many subscribers read from it. Each subscriber gets their own copy of every message. Think of it as broadcast — one event, many listeners, each doing their own thing. RabbitMQ's exchanges, ActiveMQ's topics, and the pattern Kafka implements at scale. If you want the concept explained in isolation, Observer vs Pub-Sub pattern is the right next read.
3. Log-based / streaming. One producer appends to an ordered, durable log. Many consumers read from it, each tracking their own position. Messages aren't "consumed and deleted" — they sit in the log for a configured retention period (days, weeks, forever). Consumers can replay from any point. This is Kafka's native model, and it's a fundamentally different way of thinking about messaging.
The first two — queue and pub/sub — are the classic models from the 1990s and 2000s. The third — log-based — is what Kafka popularized in the 2010s, and it changed how the industry thinks about event-driven architecture.
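The three paradigms can be pictured in a few lines of code. This is a toy, from-scratch sketch — the class names (`ToyQueue`, `ToyPubSub`, `ToyLog`) are invented for illustration, not real broker APIs — but it captures the key semantic differences: a queue deletes on consume, pub/sub hands each subscriber its own copy, and a log retains everything while each consumer tracks its own offset.

```python
from collections import deque, defaultdict

class ToyQueue:
    """Point-to-point: each message is delivered to exactly one consumer."""
    def __init__(self):
        self._messages = deque()

    def send(self, msg):
        self._messages.append(msg)

    def receive(self):
        # Consuming removes the message -- it is gone afterwards.
        return self._messages.popleft() if self._messages else None


class ToyPubSub:
    """Pub/sub: every subscriber gets its own copy of every message."""
    def __init__(self):
        self._subscribers = defaultdict(deque)

    def subscribe(self, name):
        self._subscribers[name]  # create an inbox for this subscriber

    def publish(self, msg):
        for inbox in self._subscribers.values():
            inbox.append(msg)  # one copy per subscriber

    def receive(self, name):
        inbox = self._subscribers[name]
        return inbox.popleft() if inbox else None


class ToyLog:
    """Log-based: messages are retained; each consumer tracks an offset."""
    def __init__(self):
        self._log = []
        self._offsets = defaultdict(int)

    def append(self, msg):
        self._log.append(msg)

    def poll(self, consumer):
        off = self._offsets[consumer]
        if off >= len(self._log):
            return None
        self._offsets[consumer] = off + 1
        return self._log[off]  # nothing is deleted

    def seek(self, consumer, offset):
        self._offsets[consumer] = offset  # rewind: replay is just an offset reset
```

Notice that only `ToyLog` can replay: calling `seek(consumer, 0)` re-delivers history, which is impossible in the other two models once a message has been consumed.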
With that framing, the three brokers become easier to place.
RabbitMQ
RabbitMQ is an open-source message broker that implements the AMQP protocol (Advanced Message Queuing Protocol). It's written in Erlang — which sounds weird until you learn Erlang was designed for telecom switches that can't ever go down, which is exactly the reliability profile RabbitMQ inherits.
Core model: Producer → Exchange → Queue → Consumer. The exchange is RabbitMQ's secret weapon — it's a routing layer that decides which queue a message goes into based on flexible rules (direct routing, topic patterns, fanout, headers). This makes RabbitMQ incredibly versatile for complex routing scenarios.
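To make the routing layer concrete, here is a from-scratch sketch of AMQP-style topic matching — the rule set a RabbitMQ topic exchange applies when deciding which bound queues receive a message. In AMQP topic patterns, `*` matches exactly one dot-separated word and `#` matches zero or more. The matcher and the example bindings are written for illustration; real RabbitMQ does this inside the broker.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP topic rules: '*' = exactly one word, '#' = zero or more words."""
    def match(pat, key):
        if not pat:
            return not key
        head, rest = pat[0], pat[1:]
        if head == "#":
            # '#' can absorb any number of words, including none
            return any(match(rest, key[i:]) for i in range(len(key) + 1))
        if not key:
            return False
        if head == "*" or head == key[0]:
            return match(rest, key[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))


# Hypothetical bindings: queue name -> binding pattern
bindings = {
    "fulfillment": "order.created.*",
    "analytics": "order.#",
    "notifications": "order.created.premium",
}

def route(routing_key):
    """Return the queues a message with this routing key would land in."""
    return [q for q, pat in bindings.items() if topic_matches(pat, routing_key)]
```

With these bindings, `route("order.created.premium")` hits all three queues, while `route("order.shipped.basic")` reaches only `analytics` — which is exactly the "notify only for premium customers" scenario from the bullet below.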
Strengths:
- Flexible routing. Need to send "order created" events to the fulfillment queue, the analytics queue, and the customer-notification queue — but only send the notification if it's a premium customer? RabbitMQ's exchange types handle that cleanly.
- Mature, battle-tested. Almost 20 years old. Stable. Well-documented. Big community.
- Multi-protocol. Native AMQP, plus plugins for MQTT, STOMP, HTTP.
- Low latency for moderate volume. Single-digit milliseconds per message if you're not pushing absurd throughput.
- Per-message acknowledgment and priority. You can flag individual messages as high priority and they'll jump the queue.
Weaknesses:
- Throughput ceiling. RabbitMQ comfortably handles tens of thousands of messages per second per node. Kafka handles millions. When you need to move massive volumes of data, RabbitMQ hits its limit.
- Message retention is short-term. Queues are designed for messages to be consumed and removed, not retained for days. You can persist messages, but RabbitMQ is not meant to be your event log.
- Replay is awkward. Once a message is consumed, it's gone. If a downstream system missed an event, you can't replay it from RabbitMQ the way you can with Kafka.
- Clustering is operationally involved. Running RabbitMQ in high-availability mode requires understanding quorum queues, how Erlang clusters behave during network partitions, and split-brain scenarios.
Best use cases:
- Task queues for background jobs (image processing, email sending, report generation)
- RPC-over-messaging (request/reply patterns with correlation IDs)
- Complex routing scenarios where messages need to be filtered, transformed, or fanned out based on content
- Microservices communication where latency matters more than throughput
- Priority-based work distribution
In an interview: "For the background job system in our design, I'd use RabbitMQ. The jobs are a task queue pattern, we need per-message priority (paid-tier users jump the queue), and the throughput — tens of thousands per second — is well within RabbitMQ's sweet spot."
Kafka
Apache Kafka is a distributed streaming platform. LinkedIn built it and open-sourced it in 2011 because their existing messaging infrastructure couldn't handle the volume of activity data they were producing. Now it's the default choice for high-volume event streaming at companies like Netflix, Uber, and virtually every FAANG.
Core model: Producers append messages to topics, which are partitioned for parallelism. Messages in a partition are ordered and persistent — they're kept for a configured retention period (7 days by default, but often weeks or months). Consumers form consumer groups and each group independently tracks its read position (offset) in each partition. Messages are never deleted by being consumed — they're deleted when they age out.
This is a radically different mental model from RabbitMQ. Kafka is not a message queue. It's a distributed, replicated, append-only log that happens to be useful for messaging.
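The mental model is easier to internalize with a miniature version of it. The sketch below — invented names, not the real client API (libraries like kafka-python or confluent-kafka look different) — shows the three moving parts: key-based partition assignment, per-group offsets, and replay by rewinding offsets rather than re-sending data.

```python
import hashlib

class MiniTopic:
    """A toy partitioned log with per-consumer-group offsets."""
    def __init__(self, num_partitions=3):
        self.partitions = [[] for _ in range(num_partitions)]
        self.offsets = {}  # group -> [next index to read, per partition]

    def produce(self, key: str, value: str) -> int:
        # Same key -> same partition -> ordering preserved for that key.
        p = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(self.partitions)
        self.partitions[p].append(value)
        return p

    def poll(self, group: str, partition: int):
        offs = self.offsets.setdefault(group, [0] * len(self.partitions))
        if offs[partition] >= len(self.partitions[partition]):
            return None  # caught up
        msg = self.partitions[partition][offs[partition]]
        offs[partition] += 1  # advance the offset; the message itself stays
        return msg

    def seek_to_beginning(self, group: str):
        # Replay: rewind this group's offsets. No data was ever deleted.
        self.offsets[group] = [0] * len(self.partitions)
```

Two consumer groups polling the same topic each see every message independently, and `seek_to_beginning` replays history for one group without touching the other — the two properties that make Kafka a log rather than a queue.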
Strengths:
- Throughput. Kafka handles millions of messages per second per cluster. Nothing else in this comparison comes close.
- Durability and replay. Every message is written to disk and replicated across brokers. Consumers can rewind to any point in time within the retention window. This is huge for debugging, replaying bug fixes over old data, or bootstrapping new services from historical events.
- Horizontal scalability. Add brokers to handle more throughput. Add partitions to parallelize consumption. The architecture scales linearly to truly enormous sizes.
- Ecosystem. Kafka Connect (for database integrations), Kafka Streams (for stream processing), Schema Registry, KSQL. An entire data-engineering ecosystem grew up around Kafka.
- Event sourcing friendly. Because events are retained and replayable, Kafka is the natural fit for event-sourced architectures.
Weaknesses:
- Operational complexity. Running Kafka in production requires understanding ZooKeeper (or KRaft, the newer replacement), partition assignment, consumer group rebalancing, replication factors, and failure modes. Many teams outsource this to managed services (Confluent Cloud, AWS MSK, Azure Event Hubs) specifically because self-hosting Kafka is not a weekend project.
- Overkill for small problems. If you need to send a few thousand messages per hour between two services, Kafka is architectural bloat.
- No per-message priority. Kafka delivers messages in partition order. You can't flag one message as more urgent than another.
- Latency is higher than RabbitMQ for low-volume workloads. Kafka is optimized for throughput, not for minimum per-message latency.
- Message routing is primitive. All consumers subscribe to a topic; there's no sophisticated filtering or content-based routing inside the broker. Filtering happens in consumers.
A critical distinction: Kafka vs Kafka Streams
This confuses a lot of candidates in interviews. Kafka is the broker — it stores and delivers messages. Kafka Streams is a client library for transforming, aggregating, and joining streams of Kafka messages — a stream processing engine, not a broker. They're different things that happen to share a name.
If the interview question is "how do I move events between services at scale?" — you want Kafka (the broker). If the question is "how do I compute a rolling 5-minute average of events flowing through my system?" — you want Kafka Streams (or alternatives like Apache Flink or Apache Storm). For the deep dive on that comparison, see Kafka Streams vs Apache Flink vs Apache Storm.
This post is about Kafka as a broker. Anything stream-processing-specific belongs in that other post.
Best use cases:
- High-volume event streaming (user activity, clickstreams, sensor data, application logs)
- Event sourcing architectures where the log of events is the source of truth
- Change data capture pipelines — a database's changes stream into Kafka for downstream consumers
- Analytics pipelines and real-time dashboards
- Asynchronous communication between many services where each service needs the full event history
- Situations where you need to replay events — for debugging, for bootstrapping new consumers, for rebuilding state
In an interview: "For the activity feed service in this design, I'd use Kafka. We have dozens of downstream consumers that all care about the same events (search indexing, recommendation engine, notifications, analytics), the volume is millions of events per hour, and we need replayability — if the recommendation model changes, we want to rebuild its training data from historical events."
ActiveMQ
Apache ActiveMQ is the oldest of the three. It's a JMS-compliant message broker that was the de facto standard for Java enterprise messaging in the 2000s. It still has a significant footprint in Java-heavy enterprise environments, banks, telcos, and government systems.
Core model: Queues and topics, following the JMS specification. JMS (Java Message Service) is a standardized API for messaging, and ActiveMQ's entire design is oriented around implementing it correctly. The result feels more like an enterprise middleware product and less like a high-throughput modern broker.
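One of the per-message controls JMS brokers expose is priority: higher-priority messages drain first, with FIFO order preserved within a priority level. The sketch below models that behavior from scratch in Python (the real JMS API is Java-based, and the class name here is invented). JMS priorities run 0–9, with 4 as the default.

```python
import heapq
import itertools

class PriorityQueueSketch:
    """Drains highest priority first; FIFO within a priority level."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO per priority

    def send(self, body, priority=4):
        # Negate priority: heapq is a min-heap, we want highest-first.
        heapq.heappush(self._heap, (-priority, next(self._seq), body))

    def receive(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A `fraud-alert` sent at priority 9 jumps ahead of earlier default-priority messages, while two equal-priority messages come out in the order they went in.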
Strengths:
- JMS compliance. If you're in a Java enterprise shop and your apps are already coded against JMS, ActiveMQ is a drop-in choice.
- Multi-protocol support. AMQP, MQTT, STOMP, OpenWire — useful for heterogeneous environments.
- Advanced per-message features. Message prioritization, scheduled delivery, redelivery policies, message TTL. More per-message control than Kafka.
- Configurable persistence. Choose file-based storage (KahaDB, the default) or database-backed (JDBC).
- Mature. Like RabbitMQ, it's been around long enough that the rough edges are well-documented.
Weaknesses:
- Performance ceiling below Kafka and modern RabbitMQ. Classic ActiveMQ handles thousands of messages per second per broker, not tens of thousands or millions.
- Operationally dated. The default master-slave replication is simpler than Kafka's but also less resilient. Clustering at scale is painful.
- Smaller ecosystem than Kafka or RabbitMQ. Fewer client libraries, smaller community, fewer recent books and tutorials.
- Less cloud-native. Kafka has managed offerings everywhere (Confluent, AWS MSK, Azure Event Hubs). RabbitMQ has CloudAMQP and AWS MQ. ActiveMQ has AWS MQ but is less commonly offered as managed.
- ActiveMQ Artemis vs Classic confusion. There are two actively-maintained ActiveMQ products: Classic (the original) and Artemis (the next-generation rewrite). They share branding and some protocol support but have different internals. Pick one and commit; don't confuse the two.
Best use cases:
- Java enterprise environments where JMS is already the standard
- Legacy system integration — ActiveMQ is often already installed, and replacing it isn't worth the migration cost
- Small-to-medium throughput requirements (thousands of messages per second) where you need rich per-message features
- Hybrid protocol environments where you want one broker speaking AMQP, MQTT, and STOMP to different clients
In an interview: Honestly, you probably won't pick ActiveMQ in a greenfield system design interview unless you have a specific reason (JMS compliance, existing enterprise environment, mixed-protocol requirement). The correct framing is: "ActiveMQ is the right choice if we're in a Java enterprise environment with existing JMS integrations. For a greenfield design at this scale, I'd default to Kafka or RabbitMQ depending on whether we need high-throughput streaming or low-latency task queues." That answer shows you understand the positioning.
Side-by-side comparison
Here's the reference table. Print this out, laminate it, keep it by your desk.
| Feature | RabbitMQ | Kafka | ActiveMQ |
|---|---|---|---|
| Core model | Queue + exchange routing | Distributed log / partitioned topics | JMS queue/topic |
| Typical throughput | 10K–100K msgs/sec per node | 1M+ msgs/sec per cluster | 1K–10K msgs/sec per node |
| Typical latency | <5ms | 10–50ms | <10ms |
| Message retention | Until consumed (short-term) | Days to months (configurable) | Until consumed (short-term) |
| Replay support | No (once consumed, gone) | Yes (rewind to any offset) | Limited (durable subscribers; no general replay) |
| Ordering guarantee | Per-queue | Per-partition | Per-queue/topic |
| Delivery guarantee | At-most-once, at-least-once | At-most-once, at-least-once, exactly-once | At-most-once, at-least-once, once-and-only-once |
| Message priority | Yes | No | Yes |
| Routing flexibility | High (exchanges, bindings) | Low (topic + partition) | Medium (selectors) |
| Protocol | AMQP (+ MQTT, STOMP) | Custom binary (Kafka protocol) | JMS, AMQP, MQTT, STOMP, OpenWire |
| Replication | Mirrored / quorum queues | Built-in partition replication | Master-slave |
| Managed cloud offerings | CloudAMQP, AWS MQ | Confluent Cloud, AWS MSK, Azure Event Hubs | AWS MQ |
| Language/runtime | Erlang | Java (JVM) | Java (JVM) |
| Best for | Task queues, complex routing | Event streaming, event sourcing, analytics | Java enterprise, JMS legacy |
Notice the pattern: each broker wins on the dimensions its design prioritized. None of them is "the best" — they're different tools for different jobs.
Decision framework: which broker when?
In interviews and in real architecture decisions, the way to pick cleanly is to ask a short sequence of questions. Here's the one I use:
1. What's your throughput, really?
Be honest. Not "what do we aspire to" — what are you actually designing for?
- Under 10,000 messages per second: Any of the three works. Pick on other criteria.
- 10K–100K: RabbitMQ or Kafka. ActiveMQ is possible but will need careful tuning.
- 100K+: Kafka, strongly. RabbitMQ can be pushed to these numbers but it fights you.
- 1M+: Kafka or a managed alternative (like Google Pub/Sub, AWS Kinesis).
2. Do you need to replay messages?
If your answer is yes — because you're doing event sourcing, because new services need historical events, because you want to rebuild state after a bug — Kafka. This is the single biggest differentiator. RabbitMQ and ActiveMQ are message passers; Kafka is a message log.
3. Do you need complex routing?
"Route this message to queue X if the payload matches condition Y, and also to queue Z if the producer tag is P" — that's RabbitMQ's specialty. Kafka's routing is dead simple (topic + partition), which is great for throughput but bad for sophisticated filtering. If your routing rules are non-trivial, RabbitMQ wins.
4. What does your ops team actually know?
This one is underrated. Kafka in production is a commitment. If your team doesn't have Kafka experience and you're not using a managed service, you will spend months on operational issues. RabbitMQ is operationally simpler. ActiveMQ is simpler still. Pick the broker your team can run, not the broker that looks best on a feature comparison.
5. What's your latency requirement?
For per-message latency on a moderate workload, RabbitMQ wins by a clean margin. Kafka is throughput-optimized, not latency-optimized — individual messages go through a batching layer that adds milliseconds. If you're building a real-time trading system or a multiplayer game, RabbitMQ (or something like Redis Streams) is closer to what you want.
6. Is this an event-sourcing-or-streaming shape of problem, or a task-queue shape of problem?
This framing helps a lot:
- Streaming shape: many producers, many consumers, each consumer cares about all events or a filtered view, events matter beyond their immediate delivery. → Kafka.
- Task queue shape: producers create work items, consumers do the work and move on, the items don't need to be retained. → RabbitMQ.
- Enterprise integration shape: JMS-standard Java apps, existing infrastructure, compliance requirements. → ActiveMQ.
How this comes up in system design interviews
A few canonical interview questions and the broker reasoning I'd use for each:
Q: "Design a real-time notification service."
"I'd use Kafka for the core event stream — notifications need to fan out to many consumers (push, email, SMS, in-app), and if the push service goes down, I want the events to still be in Kafka when it comes back up so we can replay them. I'd put RabbitMQ in front of the actual delivery workers as a task queue, because delivery is a job-processing pattern where each notification is handled once by one worker."
That answer uses both brokers for the parts they're each best at — which is often the right real-world answer.
Q: "Design a ride-sharing service's event pipeline."
"Kafka for the driver location updates — high throughput, many downstream consumers (matching service, analytics, fraud detection, ETAs), and we want replay for debugging. Separately, I'd use RabbitMQ for the payment processing task queue — payments are a classic task queue, we need per-message priority (urgent payment failures jump the queue), and throughput is manageable."
Q: "Design a chat application."
"For the message delivery fan-out, Kafka — every message in a group chat needs to be delivered to many subscribers and we want replayability for offline users who come back online. For presence updates, which are ephemeral and low-latency, I'd use Redis pub/sub instead of Kafka — the events don't need to be durable."
You can dig deeper into the chat app design at how to design a chat application.
Q: "Would you ever pick ActiveMQ?"
"In a greenfield design, probably not. I'd pick it if we were in a Java enterprise environment with existing JMS integrations, where migrating to Kafka or RabbitMQ would cost more than the performance gain. It's a legacy-compatibility choice, not a new-architecture choice."
That answer shows you understand positioning, not just features.
Common interview follow-ups
Once you've picked a broker, the follow-ups tend to land on:
- "How does Kafka guarantee ordering?" (Per-partition, using offsets. Messages in the same partition are strictly ordered; across partitions, there's no global ordering.)
- "What delivery guarantees does your broker provide?" (At-most-once, at-least-once, exactly-once. Exactly-once is expensive — Kafka supports it via transactions; RabbitMQ needs idempotent consumers + ack protocols. See idempotency in system design for the consumer side of this.)
- "How do you handle a consumer that falls behind?" (Kafka: scale consumer group, increase partitions. RabbitMQ: prefetch tuning, more workers, per-consumer concurrency.)
- "What happens when the broker cluster has a network partition?" (This is a CAP/consistency question. Kafka favors availability in many configurations; RabbitMQ's quorum queues favor consistency. See CAP vs PACELC for the framework.)
- "How do you handle a poison message that keeps failing?" (Dead letter queues — supported by all three, configured differently.)
- "How do you scale broker throughput?" (Kafka: add partitions and brokers. RabbitMQ: shard queues, add nodes with federation. ActiveMQ: network-of-brokers topology.)
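Two of these follow-ups — idempotent consumers under at-least-once delivery, and dead-lettering a poison message — can be sketched together in a single consumer wrapper. This is an illustrative simulation, not a broker client: the message shape, the handler, and the in-memory dedup set are all invented for the example (in production the dedup store would be persistent, e.g. Redis or a database, and the broker itself would handle redelivery and the DLQ).

```python
def consume(messages, handler, max_attempts=3):
    """Process messages idempotently; dead-letter after max_attempts failures.

    messages: iterable of dicts like {"id": ..., "body": ...}, where the
    same id may appear more than once (at-least-once redelivery).
    """
    seen = set()        # processed ids -- makes redelivery a no-op
    attempts = {}       # id -> failure count
    dead_letters = []   # parked poison messages

    for msg in messages:
        if msg["id"] in seen:
            continue  # duplicate redelivery: already handled, skip
        try:
            handler(msg["body"])
            seen.add(msg["id"])  # "ack" only after successful processing
        except Exception:
            attempts[msg["id"]] = attempts.get(msg["id"], 0) + 1
            if attempts[msg["id"]] >= max_attempts:
                dead_letters.append(msg)  # park for manual inspection
            # otherwise a real broker would nack and redeliver the message
    return dead_letters
```

A duplicate delivery of an already-processed message is silently skipped, and a message that fails three times lands in the dead-letter list instead of blocking the queue forever.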
Seniors know these answers cold. If you can handle all six without stumbling, you're solid on the broker portion of the interview.
Putting it all together
Here's the one-sentence version: Kafka is a durable event log. RabbitMQ is a task queue with smart routing. ActiveMQ is the JMS broker you keep because you already have it.
Most real systems end up using more than one — Kafka for the event streaming backbone, RabbitMQ for the task queues, and occasionally ActiveMQ for a specific JMS-integrated legacy system. That's not indecision; that's matching the tool to the job.
In an interview, don't just name a broker. Name the shape of the problem first, then name the broker that fits that shape. If you can do that for each component of a distributed system you're designing, you've graduated from "knows the names of messaging systems" to "knows how to actually design with them."
Good luck with your next interview.
Keep learning
Topics this post touched, with the right next reads:
- Messaging Patterns: Queues, Pub/Sub, and Event Streams — the pillar post for messaging patterns, broader than any single broker.
- Kafka Streams vs Apache Flink vs Apache Storm — the differentiation post for stream processing engines (not brokers). Read this if you need to transform streams, not just move them.
- Observer vs Pub-Sub Pattern — the conceptual pattern underlying most broker designs.
- Synchronous vs Asynchronous Communication — the foundational trade-off that motivates using a broker in the first place.
- Grokking Webhooks — the other common async pattern, useful to contrast with broker-based messaging.
- High Availability in System Design — relevant when you're configuring broker clustering and replication.
- CAP Theorem vs PACELC — the framework for understanding broker behavior during network partitions.
- Idempotency in System Design — critical for writing correct message consumers.
For the full system design interview roadmap, start with my complete system design interview guide.