
Presentation

Apache Kafka's Transactions in the Wild! Developing an exactly-once KafkaSink in Apache Flink

Kafka Summit London 2022

Apache Kafka is one of the most commonly used connectors with Apache Flink for exactly-once streaming use cases. The combination of both systems lets you build mission-critical applications that require low end-to-end latency and exactly-once processing, e.g. banks processing payment transactions. In Apache Flink 1.14, we released a new KafkaSink based on Apache Flink's unified Sink interface that natively supports both streaming and batch execution.
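As a rough illustration of what the abstract describes, the following sketch wires the Flink 1.14+ KafkaSink into a job with the exactly-once delivery guarantee; the topic name, bootstrap servers, and transactional id prefix are placeholders, and the checkpoint interval is arbitrary.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceKafkaSinkExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Exactly-once sinks require checkpointing: Kafka transactions are only
        // committed once the enclosing checkpoint completes.
        env.enableCheckpointing(10_000);

        DataStream<String> events = env.fromElements("payment-1", "payment-2", "payment-3");

        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")           // placeholder broker address
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("transactions")                 // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                // Kafka transactional ids are derived from this prefix per subtask/checkpoint.
                .setTransactionalIdPrefix("flink-kafka-sink")
                .build();

        events.sinkTo(sink);
        env.execute("Exactly-once KafkaSink example");
    }
}
```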

However, we needed to stretch Kafka's transactions API to fully support exactly-once processing in Flink. In this talk, we start with a quick recap of Apache Kafka's transactions and Flink's checkpointing mechanism. Then we describe in depth the two-phase commit protocol implemented in KafkaSink and highlight the difficulties we overcame when applying Kafka's transaction API to longer-lasting transactions. We also explain how we ensure performant writes to Apache Kafka and how KafkaSink recovery works.
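For context on the transactions API the talk recaps, here is a minimal sketch of the plain Kafka producer transaction lifecycle (init, begin, send, commit/abort). This is the vanilla API, not the KafkaSink's internal protocol, which spreads these steps across Flink checkpoints; the broker address, transactional id, and topic are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PlainKafkaTransactionExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // A stable transactional.id lets the broker fence "zombie" producers after a restart.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-producer-0");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();          // register with the transaction coordinator
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("transactions", "key", "value"));
                producer.commitTransaction();     // records become visible to read_committed consumers
            } catch (Exception e) {
                producer.abortTransaction();      // read_committed consumers never see these records
                throw e;
            }
        }
    }
}
```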

In summary, this talk should give users a deep dive into how Apache Flink leverages Apache Kafka's transactions, and give developers an overview of what they have to consider when using Apache Kafka's transactions.
