
Presentation

How to Design a Kafka Architecture Resilient to Cloud Outages

« Current 2022

In the world of disaster recovery, many things can ruin your day. Software bugs, human error, sharks… if you can think of it, you should prepare for it. When it comes to the cloud, we have many factors outside our control: region outages, latency between regions, and less control over the full stack. Understanding how to deploy Apache Kafka® in the cloud in a way that is resilient to these challenges is important, particularly for Kafka, which often sits at the heart of a business’ data infrastructure.

Yet, designing a resilient, multi-region cloud architecture for Kafka can be daunting. What starts as “How do I fail over Kafka?” quickly turns into “How do I fail my entire system over?” as the knock-on effects of losing each component are discovered. From my experience designing disaster-resilient architectures, I’ve grown familiar with patterns and antipatterns that can be applied universally to Kafka multi-region cloud architectures, including:

  • When to choose Active-Active or Active-Passive given RPO/RTO requirements
  • When to fail forward vs fail back
  • Topic best practices for easy client failover

In this talk, we’ll discuss these in depth, along with questions you should ask yourself to guide you to the architecture that solves your business needs.
