[Webinar] Build Your GenAI Stack with Confluent and AWS | Register Now
Even though Kafka is scalable by design, proper handling of over one petabyte of data a day requires much more than Kafka’s scalability. Several challenges present themselves in a data centric business at this scale. These challenges include capacity planning, provisioning, message auditing, monitoring and alerting, rebalancing workloads with changes in traffic patterns, data lineage, handling service degradation and system outages, optimizing cost, upgrades, etc. In this talk we describe how at Pinterest we tackle some of these challenges and share some of the key lessons that we learned in the process. Specifically we will share how we:
• Automate Kafka cluster maintenance
• Manage over 150K partitions
• Manage upgrade lifecycle
• Track / troubleshoot thousands of data pipelines