Presentation

Troubleshooting Long JVM Pauses in Kafka

« Kafka Summit London 2024

The JVM is nearly three decades old and lauded for its reliability and performance, yet under certain (but reasonable!) conditions it can completely hang and freeze your application. Long pauses in data infrastructure services or distributed applications can trigger a cascade of failures, instability, and degraded service availability.

When running Kafka at very large scale on bare-metal hosts, it was discovered that brokers in some clusters paused frequently for several seconds and, in rarer, more extreme cases, for over three minutes!

I learnt this costly lesson the hard way, but you don't have to.

In this session you'll learn:

  • How the JVM can completely pause your cluster or application for critical amounts of time
  • How to monitor for long pauses in your Kafka clusters or JVM-based Kafka applications
  • Why Kafka is particularly vulnerable to this issue
  • How you can mitigate this issue
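
As a rough illustration of the monitoring point above, here is a minimal sketch of one generic way to notice long pauses from inside a JVM process. It is not taken from the talk. A thread sleeps for a short fixed interval and measures how late it wakes up; a large overshoot suggests the whole JVM (or the host) stalled. The class name `PauseDetector`, the 100 ms interval, and the 1 s threshold are all hypothetical choices for this example.

```java
import java.util.concurrent.TimeUnit;

public class PauseDetector {
    // Hypothetical values, chosen for illustration only.
    static final long INTERVAL_MS = 100;
    static final long THRESHOLD_MS = 1_000;

    /** Returns how much longer than intervalMs the sleep took, in ms (never negative). */
    public static long overshootMillis(long startNanos, long endNanos, long intervalMs) {
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
        return Math.max(0L, elapsedMs - intervalMs);
    }

    /** Sleeps `iterations` times and returns the largest overshoot observed, in ms. */
    public static long sample(int iterations, long intervalMs, long thresholdMs)
            throws InterruptedException {
        long worst = 0L;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMs);
            long overshoot = overshootMillis(start, System.nanoTime(), intervalMs);
            if (overshoot >= thresholdMs) {
                // A real deployment would emit a metric or log entry here instead.
                System.err.printf("Possible JVM pause: woke up %d ms late%n", overshoot);
            }
            worst = Math.max(worst, overshoot);
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        // Roughly two seconds of sampling.
        long worst = sample(20, INTERVAL_MS, THRESHOLD_MS);
        System.out.println("Worst overshoot (ms): " + worst);
    }
}
```

This approach cannot tell why the thread woke up late: garbage collection, other stop-the-world JVM activity, and operating-system scheduling delays all look the same. It is best treated as a coarse signal to correlate with JVM and host logs.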
