White Paper

The Cloud-Native Chasm: Lessons Learned from Reinventing Apache Kafka as a Cloud-Native, Online Service

Today we face an inflection point where infrastructure software (databases, messaging, orchestration, and so on) is more likely to be consumed as a managed service than as a binary software artifact that users run and support themselves. However, the majority of infrastructure software that we consume was designed and written before this shift to software services became prominent. Take Apache Kafka® as an example. Apache Kafka is a distributed message broker used by hundreds of thousands of organizations to connect applications using streams of events. Kafka was initially designed and built for LinkedIn’s self-managed infrastructure, deployed on physical servers across many LinkedIn-owned datacenters.

Over the last five years, Confluent has turned Apache Kafka into a successful online service. This has been a significant technical undertaking, one we estimate to have taken more investment than creating the Apache Kafka software it is based on. Our experience has unearthed what we see as a chasm between successful infrastructure software projects designed for on-premises workloads and successful online services created for the same software.

Adding further confusion to this space, not all cloud services are created equal. We therefore use the term cloud native to differentiate systems like Confluent Cloud, which provides a fully managed, elastic, pay-as-you-go experience, from other hosted offerings that simply run open source software on the user’s behalf.

In this paper, we explore the chasm between successful software distributions and cloud-native services that match the above definition. We do this by sharing our experiences building a cloud-native system that spans many thousands of individual clusters. These experiences should be helpful to anyone building their own cloud-native system and to those looking to gain an in-depth understanding of what goes on behind the scenes.

This paper was authored by Gwen Shapira, Prachetaa Raghavan, Alok Nikhil, Adithya Chandra, Anna Povzner, Anastasia Vela, Rohit Shekhar, Alok Thatikunta, and Marc Selwan.