Reflections on Event Streaming as Confluent Turns Five – Part 1

Tim Berglund
For me, and I think for you, technology is cool by itself. When you first learn how consistent hashing works, it’s fun. When you finally understand log-structured merge trees, it’s a rewarding feeling. When Apache Kafka® consumer group rebalancing clicks, you feel good.

But the thing that really gets me going isn’t mastering a specific new technical chop, but seeing how new technologies have an impact on the way we think. And I don’t mean in the Nick Carr, internet-is-making-us-stupid sense—that’s a different blog post entirely—but I mean how a new piece of infrastructure changes the kinds of software architectures we’re willing and able to consider, well beyond the details of how any part of that infrastructure actually works. This is one of those intellectual influences that sometimes passes beyond notice, but it’s something that’s definitely happening with event streaming.

Now that there’s a ubiquitous open source Apache Kafka, an enterprise-ready Confluent Platform, and a robust and increasingly featureful Confluent Cloud, the way we build systems is changing. I love being able to witness this kind of transition.

Why think about this kind of thing now? Well, it’s Confluent’s fifth birthday, and birthdays are always a good time for looking back and looking forward. As we get ready to turn five, and on the cusp of what should be a very exciting Kafka Summit in San Francisco, I wanted to reflect a little bit on the things that get me most excited about being in the Kafka community.

It was my understanding we would be able to scale

When Amazon EC2 launched 13 years ago, we told ourselves a story. Suddenly, deployments could be totally elastic. We could now just spin up instances in the cloud at Christmastime when the load on our ecommerce system was peaking, and we’d have all the extra compute and storage we’d ever need; then in January when traffic tapered off, we’d scale down and magically reduce our costs. It was a powerful story, and it was not entirely untrue—there really was an API you could use to create and destroy cloud instances—but apart from that new infrastructure capability, who was building systems that could scale like that? Nobody I knew. Certainly not me.

But we as an industry turn out to be decent debtors. Not-entirely-related things like deployment automation, containers, and microservices seem to have snuck around the back and put us in the position to pay the promissory note of scalability that we’ve been holding since before Katy Perry was all that much on the radio.

And those things are indispensable, but what’s really putting us over the top is the ability we have now to build our systems on top of an event streaming platform like Kafka. Once we take our perfectly useful monoliths and break them into little programs that run on separate computers, those little programs still need to communicate. And a strongly emerging consensus is that the best way to get that done is to connect them through messaging, and to store the messages they exchange in an immutable commit log that can serve as a replayable history of those communications going forward.
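To make the "immutable commit log" idea a little more concrete, here is a minimal sketch in plain Python. It is not Kafka's API and not how any particular company does it: `CommitLog` and `InventoryService` are hypothetical names invented for illustration, and the log lives in memory as a stand-in for a single topic partition. The point is only that a service which tracks its own offset can catch up incrementally, and a brand-new service can rebuild the same state by replaying from offset zero.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Tuple


@dataclass
class CommitLog:
    """Toy in-memory stand-in for one topic partition (hypothetical, not Kafka's API)."""
    _records: List[Tuple[str, Any]] = field(default_factory=list)

    def append(self, key: str, value: Any) -> int:
        """Append a record and return its offset; earlier records are never modified."""
        self._records.append((key, value))
        return len(self._records) - 1

    def read_from(self, offset: int = 0) -> Iterator[Tuple[int, str, Any]]:
        """Yield (offset, key, value) for every record at or after `offset`."""
        for i in range(offset, len(self._records)):
            key, value = self._records[i]
            yield i, key, value


class InventoryService:
    """Hypothetical consumer that derives its own state from the log."""

    def __init__(self) -> None:
        self.stock: Dict[str, int] = {}
        self.next_offset = 0  # position of the next record this service has not yet seen

    def catch_up(self, log: CommitLog) -> None:
        """Apply every unseen record, then remember where we stopped."""
        for offset, sku, delta in log.read_from(self.next_offset):
            self.stock[sku] = self.stock.get(sku, 0) + delta
            self.next_offset = offset + 1


log = CommitLog()
log.append("widget", 10)
log.append("widget", -3)
log.append("gadget", 5)

live = InventoryService()
live.catch_up(log)        # processes offsets 0..2

log.append("gadget", -2)
live.catch_up(log)        # processes only offset 3

replayed = InventoryService()
replayed.catch_up(log)    # a late joiner replays the full history from offset 0
```

In this sketch the service that joined late ends up with the same stock counts as the one that had been following along, which is the property that makes a replayable log attractive as the connective tissue between services.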

I like it when I can see a mature company that’s been riding technology wave after wave for decades transition to Kafka and start making good on the scalability stories we’ve been telling ourselves for a long time. And it’s just as cool when I hear the director of engineering for a new startup in the retail space drop a quotable quote like, “Confluent Cloud is the central nervous system that runs our business.” People with real money on the table are trusting Kafka and its ability to help them grow complex software stacks that power their businesses.

No longer missing out on microservices

I have long noticed that many enterprise developers share a certain sheepish feeling, like there’s some trend that they know they are supposed to be following but they’re not; or maybe they are following it a little, but they feel like they are doing it poorly. It almost doesn’t matter what the trend is—it changes over time, but what remains constant is the feeling of behindness. Right now one of those trends is microservices, which developers at big companies have been talking about for years. And for many of those years, all of the various ways they’ve tried to jump on the microservices train have not led to satisfactory outcomes, to put it mildly. So they know that microservices are the path forward, but they are stuck asking “how?” I love to see lightbulbs go off when people realize that Kafka is the answer to that question.

For example: Ticketmaster. After 40 years of innovating in the ticket sales and distribution business (they were certainly how one bought concert tickets by phone in my own pre-internet youth), Ticketmaster recognized that they had to deal with the all-too-familiar problem of untamed complexity in the system they were building. They had developed hundreds of independent services that all interacted with one another in different ways, and the complexity of these interactions made it hard for a developer of one service to reason about (or for that matter, modify) a service with a different set of interfaces. This had potentially set up the stack to be as resistant to change as a monolith without actually being one, which would have put the team on the wrong side of all of the monolith/microservices tradeoffs. Nobody wants to live like that—least of all Ticketmaster, which is why they refactored to an event streaming architecture, now with hundreds of microservices exchanging inputs and outputs in real time through Kafka topics. Just as one should.

If you’ve ever bought tickets online, you have probably seen that event ticketing is fundamentally a real-time problem. Buyers need literally up-to-the-second information on seat availability and pricing, and things can change fast for highly contended events. Ticketmaster relies heavily on KSQL and Kafka Streams to build the systems that get this done. I’ll spare you the details of how they do it, since their VP of Engineering, Chris Smith, talked it over with Confluent’s own Dani Traphagen in this recent online talk. And if you want to know where things are headed from here, Ticketmaster’s Derek Cline will be speaking at the upcoming Kafka Summit I mentioned about his team’s success in reusing business logic from existing services that use the Kafka Streams API.

In a use case like online ticketing, it may seem obvious that the transactional side of the system is well suited to an event processing architecture, but certain of the analytical requirements demand the same architecture. A previous-generation ETL system would have a hard time running the machine learning models that help shut down scalpers, fraudsters, and other scammers. Fraud detection applications like this are properly a part of the analytics pipeline, yet operate under real-time requirements—something Kafka makes it possible to do.

This wasn’t an architecture that was practical to build just 10 years ago. We were clued into the idea that building systems by composing services was a good idea, but very few people were pulling it off. And Ticketmaster is still a thought leader, but they are showing that this architecture can now become commonplace.

As it does, Kafka not only helps us make good on our long-term promise to build scalable, composable systems, but it slowly transforms the way we think about those systems. A business is no longer operated by a large program that babysits its state in a single do-or-die database, but instead by any number of evolvable services that maintain their own state and communicate through scalable logs. This is a significant change in an architect’s frame of mind, and it’s an example of the kind of change I love seeing. On the occasion of Confluent’s fifth birthday, it’s a thing I can sit back and enjoy.

Tim Berglund is a teacher, author, and technology leader with Confluent, where he serves as the senior director of developer experience. He can frequently be found speaking at conferences in the U.S. and all over the world. He is the co-presenter of various O’Reilly training videos on topics ranging from Git to distributed systems, and is the author of Gradle Beyond the Basics. He tweets as @tlberglund and lives in Littleton, CO, U.S., with the wife of his youth and their youngest child, the other two having mostly grown up.
