Register for Demo | Confluent Terraform Provider, Independent Network Lifecycle Management and more within our Q3’22 launch!

Improving fault tolerance and scaling out in Kafka Streams

Kafka Streams is the popular stream processing component of Apache Kafka®. One of its best features is stateful operations. Kafka Streams works hard to ensure stateful operations can scale horizontally and survive failures, but doing so takes time. Kafka Streams offers the concept of ""standby-tasks,"" allowing for near-zero downtime failover, but surprisingly this feature still isn't widely used. The could be for various reasons, from lack of awareness to needing additional resources.

This presentation will cover how standby tasks work and how they're enabled. Additionally, I'll cover the work done in KIP-441 that enables faster scaling out for stateful tasks and provides more balanced stateful assignments. I'll also dive into the consumer rebalance protocol improvements that enable KIP-441 to be effective.

Attendees of this presentation will walk away understanding how and when to use standby tasks, leverage the improvements from KIP-441, and have a deeper understanding of how Kafka Streams works with state.


Bill Bejeck

Bill Bejeck is working at Confluent as an integration architect on the Developer Relations team. He was a software engineer for over 15 years and has regularly contributed to Kafka Streams. Before Confluent, he worked on various ingest applications as a U.S. Government contractor using distributed software such as Apache Kafka, Spark, and Hadoop. He has also written a book about Kafka Streams titled "Kafka Streams in Action."