Organic Growth and A Good Night Sleep: Effective Kafka Operations at Pinterest

Even though Kafka is scalable by design, properly handling more than one petabyte of data a day requires much more than Kafka’s scalability alone. A data-centric business at this scale faces several challenges. These include capacity planning, provisioning, message auditing, monitoring and alerting, rebalancing workloads as traffic patterns change, data lineage, handling service degradation and system outages, cost optimization, and upgrades. In this talk, we describe how we tackle some of these challenges at Pinterest and share some of the key lessons we learned in the process. Specifically, we will share how we:
• Automate Kafka cluster maintenance
• Manage over 150K partitions
• Manage the upgrade lifecycle
• Track and troubleshoot thousands of data pipelines


Vahid Hashemian
Ambud Sharma