Apache Kafka is the standard for data streaming and has one of the largest open source communities in the world. With its popularity, there is a growing number of offerings compatible with Kafka’s API. This page compares Apache Kafka with Redpanda, as well as two cloud Kafka services: Confluent Cloud and Redpanda Cloud.
Apache Kafka is an open source data streaming technology capable of handling trillions of events per day. It’s based on the abstraction of a distributed commit log, with functionality comprising pub/sub, permanent storage, and the processing of event streams. Created at LinkedIn and open sourced in 2011, Kafka has since been adopted by over 100k organizations worldwide and has a vast developer community and ecosystem.
Redpanda is a C++ clone of Apache Kafka. It is not open source; a community edition is source-available under the Business Source License (BSL), and enterprise features require a commercial subscription. Redpanda provides a Kafka-compatible API on top of its own implementation of the distributed commit log. Founded in 2019, Redpanda is privately owned by a company of the same name.
Apache Kafka and Redpanda both implement their own versions of a highly-available distributed commit log. While there are commonalities between the two, there remain several key design differences that affect usage, performance, and adoption. What are the key differences between Kafka and Redpanda?
Under the Apache License governed by the Apache Software Foundation.
Under the Business Source License (BSL) with proprietary paid features available under an enterprise license.
|Contribution model and commercial backing||
Actively managed and maintained by 1,000+ full-time contributors at over a dozen companies and commercially backed by a broad coalition of vendors.
Developed and maintained solely by Redpanda; the BSL restricts other vendors from offering commercial support.
KRaft replaces ZooKeeper and has been production-ready since version 3.3.
ZooKeeper-free and uses the Raft consensus algorithm.
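Both systems now rely on Raft-style consensus for cluster metadata. As a rough illustration of what that implies for sizing, the sketch below assumes the standard Raft majority-quorum rule; the function names are hypothetical and are not taken from either code base.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority of a Raft voter set."""
    if voters < 1:
        raise ValueError("a Raft group needs at least one voter")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """Voters that can be lost while a majority is still reachable."""
    return voters - quorum_size(voters)


for n in (1, 3, 5):
    print(f"{n} voters: quorum={quorum_size(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this rule, an even voter count adds no extra fault tolerance over the odd count below it, which is why controller quorums are usually sized at 3 or 5.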
|Storage Pattern and Performance Impact||
Consistent performance across most real-world workloads
Kafka has a purpose-built log and replication layer optimized for sequential IO, which allows it to deliver high throughput and low latency across a broad set of hardware and workloads.
Performance optimized for selective workloads
Redpanda can demonstrate low latency and high throughput on simple workloads. However, because it’s optimized for random IO, its performance can significantly degrade over time.
Several common production configurations, such as a high producer count, over 30% disk utilization, message keys, TLS, or run times of more than 24 hours, can cause severe reductions in performance.
Purpose-built immutable log
Uses its own purpose-built framework. Data is written in large blocks as high throughput sequential IO, allowing for high performance on drives with even very low IOPS.
Based on Seastar
Uses the Seastar framework, popularized by ScyllaDB, to implement its immutable log. Writes data in small 16 kB chunks by default, requiring very high IOPS SSDs.
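A back-of-the-envelope calculation shows why write size matters for drive requirements. The sketch below assumes a sustained write rate of 1 GiB/s and, for contrast, a hypothetical 1 MiB large-block write. Only the 16 kB chunk size comes from the comparison above (treated here as 16 KiB), and real IO patterns will differ once batching and OS-level merging are involved.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def required_write_iops(throughput_bytes_per_s: int, write_size_bytes: int) -> float:
    """Write operations per second needed to sustain a throughput at a fixed write size."""
    return throughput_bytes_per_s / write_size_bytes


throughput = 1 * GIB  # assumed sustained write rate, not a measured figure
small_chunk_iops = required_write_iops(throughput, 16 * KIB)
large_block_iops = required_write_iops(throughput, 1 * MIB)  # hypothetical block size

print(f"16 KiB writes: {small_chunk_iops:,.0f} IOPS")
print(f"1 MiB writes:  {large_block_iops:,.0f} IOPS")
```

Under these assumptions the small-chunk pattern needs 64 times as many write operations per second for the same throughput.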
In Progress with KIP-405
Slated for early access in Kafka release 3.6.
Requires Enterprise License
Redpanda’s tiered storage requires the purchase of an enterprise license.
Kafka replication (ISR)
Replication is synchronous but data is written to disk asynchronously by design. Brokers don’t need to fsync for correctness and have in-built data recovery and repair.
Both replication and writing to disk are synchronous.
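On the Kafka side, durability comes from replication rather than from fsync on each write, so producers that need strong guarantees typically wait for all in-sync replicas to acknowledge. The following is a minimal sketch of such a producer configuration using librdkafka-style property names; the broker address is a placeholder, and the helper name is hypothetical.

```python
def durable_producer_config(bootstrap_servers: str) -> dict:
    """Producer settings that wait for every in-sync replica before acknowledging."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "acks": "all",               # leader waits for the full in-sync replica set
        "enable.idempotence": True,  # avoids duplicates when sends are retried
    }


config = durable_producer_config("broker-1.example.internal:9092")  # placeholder address
print(config)
```

A dictionary like this could be passed to a client such as `confluent_kafka.Producer`. How many replicas must be in sync for a write to succeed is governed separately by the topic or broker setting `min.insync.replicas`.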
|Cloud Network Optimized||
Optimized for cloud
Follower Fetching (GA since version 2.4.0) enables clients to read data from follower replicas in the current AZ, avoiding cross-AZ network costs.
Cloud optimization in beta
Follower fetching recently released in version 23.2 Beta.
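In Apache Kafka, follower fetching (KIP-392) is enabled by telling each broker which rack or zone it runs in, selecting a rack-aware replica selector, and setting a matching `client.rack` on the consumer. The sketch below assembles those settings as plain dictionaries; the zone name, group ID, and helper functions are illustrative assumptions.

```python
def broker_follower_fetch_settings(zone: str) -> dict:
    """Broker properties that let consumers read from a replica in their own zone."""
    return {
        "broker.rack": zone,
        "replica.selector.class": "org.apache.kafka.common.replica.RackAwareReplicaSelector",
    }


def consumer_follower_fetch_settings(bootstrap_servers: str, group_id: str, zone: str) -> dict:
    """Consumer properties; client.rack should match the broker.rack of nearby brokers."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "client.rack": zone,
    }


zone = "use1-az1"  # hypothetical availability-zone identifier
broker_props = broker_follower_fetch_settings(zone)
consumer_props = consumer_follower_fetch_settings("broker-1.example.internal:9092", "orders-reader", zone)
print(broker_props)
print(consumer_props)
```

When the consumer's `client.rack` matches a follower's `broker.rack`, the broker can direct fetches to that follower instead of a leader in another zone.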
|Connectors and Stream Processing||
Kafka Connect and Kafka Streams are packaged as part of the core open source Kafka offering. These two components allow you to connect applications and databases together and process data streams at scale.
Not included with Redpanda. While Kafka Connect and Kafka Streams are compatible, they require you to configure, manage, and scale your own JVM-based applications and jobs.
|Breadth of adoption||
Vast developer ecosystem and community
Apache Kafka is used by 100,000+ organizations, including 80% of Fortune 100 companies and users such as Goldman Sachs, Netflix, and Uber.
Community stats: 200+ global meetups with 41,000+ attendees, 32,000+ Stack Overflow questions, and 41,000 members in the Confluent Community Slack.
Limited adoption and community
Redpanda is used by thousands of organizations (undisclosed)
Community stats: Fewer than 100 Stack Overflow questions and 3,300 Slack community members.
Built by the original co-creators of Kafka in 2018, Confluent Cloud is a cloud-native data streaming platform. Confluent has re-architected Kafka to create a fully-managed service with 10x elasticity, storage, and resiliency. Confluent Cloud offers a complete set of enterprise features to relieve operational burdens and boost developer productivity.
Redpanda launched their fully-managed Kafka service in late 2022. Redpanda Cloud is based on a C++ clone of Kafka and includes all the features from Redpanda’s Enterprise license. The service is offered through two deployment models—single-tenant Dedicated clusters and Bring-Your-Own-Cloud (BYOC) clusters.
Automated and fully-managed Kafka clusters with zero provisioning, scaling, or operational burdens.
Shared infrastructure responsibilities
Infrastructure operations and maintenance considerations shared between customer, Redpanda, and cloud provider.
|Trust and security||
Enterprise grade security and governance
Enterprise grade security (RBAC, authentication support, encryption, etc.) and data governance features; Built-in compliance with all major industry standards (SOC, ISO, PCI, etc.).
Shared security model
Security shared between customer, Redpanda and cloud provider; RBAC and authentication support, compliance with SOC 2 Type 1.
Comprehensive 99.99% SLA
99.99% SLA across 30K+ clusters and 4500+ customers globally, covering infrastructure, bug fixes, upgrades, patching, and more.
SLA not published; limited number of publicly referenced customers; fully managed service not available for the BYOC offering.
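To put an availability percentage in concrete terms, it can be converted into a downtime budget. The sketch below assumes a 30-day month and a 365-day year; how any particular SLA actually measures and excludes downtime is defined by its contract terms, which this calculation does not model.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int) -> float:
    """Maximum downtime in minutes that still meets the availability target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100.0 - availability_percent) / 100.0


monthly_budget = allowed_downtime_minutes(99.99, 30)   # assumed 30-day month
yearly_budget = allowed_downtime_minutes(99.99, 365)   # assumed 365-day year

print(f"99.99% over 30 days:  {monthly_budget:.2f} minutes")
print(f"99.99% over 365 days: {yearly_budget:.2f} minutes")
```

Under these assumptions a 99.99% target leaves a little over four minutes of downtime per month, or just under an hour per year.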
Complete set of tools
70+ fully-managed Kafka connectors, SQL-based stream processing, and low-code visual pipeline builder.
Fewer than 10 fully-managed connectors, no stream processing, and no pipeline building solution.
Transparent pricing, volume-based discounts
Publicly available pricing; negotiated volume-based discounts for Kafka workloads with cloud providers.
Pricing not published
Pricing not publicly available; Pay your own negotiated cloud provider rates for compute, storage, and networking.
Developers, architects and operators need a complete, cloud-native data streaming platform that prioritizes ease of use and alleviates operational burdens. That means delivering a Kafka service with fully-managed infrastructure, built-in enterprise-grade tooling, and flexible deployment options.
|Automatic Cluster Rebalancing|
|Infinite Data Retention with Tiered Storage|
|Upgrades and Patches|
|Productivity||SQL-Based Stream Processing|
|Productivity||Low-Code Data Pipelines|
|Monitoring and Alerting||Monitoring|
|Monitoring and Alerting||Notifications|
|Core Cloud Service Providers|
Supports OAuth 2.0, OIDC, API Keys, and plain username/password.
Integrates with existing SAML-based single sign-on (SSO) and identity providers (IdP) such as Google, GitHub, Okta, OneLogin, and Azure Active Directory (AD).
Supports OAuth 2.0, OIDC, and plain username/password.
Direct integrations are limited to Google, GitHub, and Okta.
Encrypt data at rest using your own Kafka cluster management tools. No BYOK option.
Start testing and deploying fully-managed Kafka clusters in Confluent Cloud within minutes—no credit card required to start your 30-day trial.