In the world of disaster recovery, many things can ruin your day. Software bugs, human error, sharks…if you can think of it, you should prepare for it. When it comes to the cloud, we have many factors outside our control: region outages, latency between regions, and less control over the full stack. Understanding how to deploy Apache Kafka® in the cloud in a way that is resilient to these challenges is important, particularly because Kafka often sits at the heart of a business’s data infrastructure.
Yet, designing a resilient, multi-region cloud architecture for Kafka can be daunting. What starts as “How do I fail over Kafka?” quickly turns into “How do I fail my entire system over?” as the knock-on effects of losing each component are discovered. From my experience designing disaster-resilient architectures, I’ve grown familiar with patterns and antipatterns that can be applied universally to Kafka multi-region cloud architectures, including:
In this talk, we’ll discuss these in depth, along with questions you should ask yourself to guide you to the architecture that solves your business needs.