Intelligent, Automatic Restarts for Unhealthy Kafka Consumers on Kubernetes

Kafka Summit London 2023

At Cloudflare we are big Kafka adopters and run Kafka at massive scale. We deploy our Kafka-based microservices on Kubernetes, and we have gained some interesting experience in keeping these services operational and avoiding downtime. To do so, we implemented our own intelligent health checks for microservices that use Kafka. This has made our services much more self-healing, so far less manual intervention is required. Previously, we were paged whenever an application got stuck, and this led to a number of customer-impacting incidents. We implemented this in Go using the Shopify/sarama package, but the same concepts can be applied in other programming languages.
