There are numerous messaging systems out there, covering use cases such as message queuing, distributed messaging, and high-performance event streaming. Here we’ll do a deep side-by-side comparison of Apache Kafka®, Apache Pulsar®, and RabbitMQ®, covering performance, architecture, features, and other differences, to help you choose the best open source messaging system.
Apache Kafka is an open-source distributed event streaming platform. Based on the abstraction of a distributed commit log, Kafka is capable of handling trillions of events a day, with functionality comprising pub/sub, permanent storage, and the processing of event streams. The de facto transport for event streaming use cases, Kafka is used by thousands of organizations, from internet giants to car manufacturers to stock exchanges, and has more than 5 million lifetime downloads. Kafka is also available as a managed service on all major cloud platforms via Confluent Cloud and others.
Apache Pulsar is an open-source distributed messaging system. Originally developed as a queuing system, it has been broadened in recent releases to add event streaming features. Pulsar uses Apache BookKeeper™ for its storage layer, a project created at Yahoo as a high-availability solution for Hadoop’s HDFS NameNode (although ultimately not used for that purpose). Pulsar shares properties with both Kafka and RabbitMQ. It is a largely community-led project with no enterprise-grade commercial backing today.
RabbitMQ (AMQP) is an open-source traditional message-oriented middleware that implements the AMQP messaging standard. Its capabilities include queuing, exchanges, routing, and low-latency messaging. Written in Erlang, RabbitMQ is developed and commercially supported by Pivotal Software, part of VMware.
Kafka + ZooKeeper (ZooKeeper is being removed)
Pulsar + ZooKeeper + BookKeeper + RocksDB
Comparison criteria: message consumption model; ease of use; documentation and learning; open source ecosystem; size of user community; managed cloud offerings; built-in management tooling; integrations (databases, REST, COTS, etc.); client library diversity; performance and availability; global data replication; built-in stream processing; message replay and time travel; topic (log) compaction.
In reality, Kafka, RabbitMQ, and Pulsar are three very different systems. Kafka is a pure distributed log designed for efficient event streaming at high scale. RabbitMQ is a traditional messaging system, designed to publish messages quickly and delete them once they are consumed. Pulsar sits somewhere in between. It’s not a distributed log in the true sense, but it synthesizes some similar properties.
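To make the log-versus-queue distinction concrete, here is a minimal in-memory sketch in Python. It is a toy model only and does not use the API of Kafka, RabbitMQ, or Pulsar. The class names `ToyLog` and `ToyQueue` and their methods are hypothetical, invented for illustration. It also ignores partitions, acknowledgements, retention limits, and networking.

```python
from collections import deque


class ToyLog:
    """Hypothetical append-only log: records are retained after being read."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records=10):
        # Reading never removes anything, so any offset can be read again.
        return self._records[offset:offset + max_records]


class ToyQueue:
    """Hypothetical work queue: a message is removed once it is consumed."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        # Returns None when the queue is empty; a consumed message is gone.
        return self._messages.popleft() if self._messages else None


log = ToyLog()
queue = ToyQueue()
for event in ["a", "b", "c"]:
    log.append(event)
    queue.publish(event)

first_pass = log.read(0)
replay = log.read(0)            # the same records again, read from offset 0
drained = [queue.consume() for _ in range(3)]
after_drain = queue.consume()   # the queue has nothing left to deliver
```

In this sketch, the log can serve the same records to any number of readers at any time because each reader only supplies an offset, while the queue hands each message out once and then forgets it. That is the property behind features such as message replay in a log-based system.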
Which to choose should be a fairly straightforward decision. For lightweight messaging that requires request-response, queuing, and pub-sub, RabbitMQ is well suited. Pulsar is really only for the brave at heart, but it may have a place in the future for those who require both queuing and event streaming in the same system. For event streaming use cases that require high throughput, scalability, and permanent message storage, Kafka is the clear winner.