
Building Retry Architectures in Kafka with Compacted Topics

Kafka Summit Americas 2021

In this talk, we'll discuss how VillageMD uses Kafka topic compaction to rapidly scale our reprocessing pipelines to encompass hundreds of feeds. Within healthcare data ecosystems, privacy and data minimization are key design priorities. Being able to handle data deletion reliably and promptly within event-driven architectures is increasingly necessary under regulations such as the GDPR and HIPAA.

We'll give an overview of building and governing dead-letter queues for streaming data processing.
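As a rough sketch of the dead-letter pattern (the function names and metadata fields here are illustrative, not VillageMD's actual implementation), per-record processing can be wrapped so that failures are captured with enough context to reprocess them later. In a real pipeline the sink would be a Kafka producer writing to a dedicated dead-letter topic, keyed by a stable record ID:

```python
import time
import traceback

def process_with_dlq(records, process_record, dead_letter_sink):
    """Process each (key, value) record; route failures to a dead-letter sink.

    `process_record` and the metadata fields below are hypothetical --
    in production the sink would publish to a dead-letter Kafka topic
    keyed by record ID, so compaction can later deduplicate retries.
    """
    for key, value in records:
        try:
            process_record(key, value)
        except Exception as exc:
            # Enrich the failed record with context needed for reprocessing.
            dead_letter_sink.append({
                "key": key,
                "value": value,
                "error": repr(exc),
                "stacktrace": traceback.format_exc(),
                "failed_at": time.time(),
            })

# Example: one record fails validation and lands in the DLQ.
dlq = []
def validate(key, value):
    if value is None:
        raise ValueError(f"missing payload for {key}")

process_with_dlq([("a", {"x": 1}), ("b", None)], validate, dlq)
print([d["key"] for d in dlq])  # only "b" failed
```

Keying dead-letter records by a stable ID is what makes the compaction-based cleanup described below possible.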

We'll discuss:

  1. How to architect a data sink for failed records.
  2. How topic compaction can reduce duplicate data and enable idempotency.
  3. Building a tombstoning system for removing successfully reprocessed records from the queues.
  4. Considerations for monitoring a reprocessing system in production: what metrics, DataOps practices, and SLAs are useful?
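The compaction and tombstoning behavior in points 2 and 3 can be illustrated with a small simulation (pure Python standing in for the broker's log cleaner, not the talk's actual code): compaction retains only the latest value per key, and producing a null-valued record (a tombstone) for a key removes it once compaction runs.

```python
def compact(log):
    """Simulate Kafka log compaction: keep only the latest record per key,
    then drop keys whose latest value is None (a tombstone)."""
    latest = {}
    for key, value in log:  # records in offset order
        latest[key] = value
    return {k: v for k, v in latest.items() if v is not None}

# A retry topic keyed by record ID: record "42" failed twice, then was
# successfully reprocessed, so a tombstone retires it from the queue.
retry_log = [
    ("42", {"attempt": 1, "error": "timeout"}),
    ("42", {"attempt": 2, "error": "timeout"}),
    ("7",  {"attempt": 1, "error": "bad schema"}),
    ("42", None),  # tombstone: record 42 was reprocessed successfully
]
print(compact(retry_log))  # {'7': {'attempt': 1, 'error': 'bad schema'}}
```

On a real cluster this corresponds to creating the retry topic with `cleanup.policy=compact`, so the broker deduplicates repeated failures per key and eventually removes tombstoned records entirely.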
