Taming a Massive Fleet of Python-based Kafka Apps at Robinhood

Robinhood uses Kafka in every line of its business, from stock and crypto trading to clearing and data analytics. One interesting aspect of our architecture is that many of our microservices leveraging Kafka are written in Python. When you combine Python's relatively slow performance, its reliance on process-based parallelism, and Robinhood’s scale, the result is a massive fleet of application processes producing to and consuming from our Kafka clusters. This fleet generates an atypical workload on Kafka that warrants a deeper investment in scalability and reliability.

This talk discusses our investments in Kafka infrastructure for a large-scale Python-based environment:

kafkahood: our librdkafka-based client library wrapper that codifies best practices, sane defaults and deep client-side observability.

kafkaproxy: a Rust-based sidecar proxy that reduces connection fan-in from Python gunicorn worker pools to our Kafka clusters.
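
To make the connection fan-in problem concrete, here is a back-of-the-envelope sketch in Python. It is not taken from the talk: the function name, the host, worker and broker counts, and the assumption that each client may open one connection to every broker are all hypothetical, chosen only to illustrate why routing a host's workers through one sidecar shrinks the connection count.

```python
def broker_connections(hosts, workers_per_host, brokers, via_sidecar):
    """Rough upper bound on TCP connections arriving at a Kafka cluster.

    Assumption (hypothetical): every client, whether a worker process or
    a sidecar proxy, may hold one connection to each broker.
    """
    # With a sidecar, all workers on a host share a single Kafka client.
    clients_per_host = 1 if via_sidecar else workers_per_host
    return hosts * clients_per_host * brokers


# Illustrative numbers only, not Robinhood's real fleet size.
direct = broker_connections(hosts=500, workers_per_host=16, brokers=30, via_sidecar=False)
proxied = broker_connections(hosts=500, workers_per_host=16, brokers=30, via_sidecar=True)

print(direct, proxied, direct // proxied)
```

Under these assumptions the reduction factor equals the number of worker processes per host, which is why process-based parallelism makes a per-host proxy attractive.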

We'll also present challenges we encountered along the way and share our learnings with the audience.

Presenters

Chandra Kuchi

I am Chandra, an Engineering Manager on the streaming platform team at Robinhood. We work on enabling real-time streaming systems at Robinhood. Before Robinhood, I worked as a backend engineer and helped scale billing systems and the wireless product at Twilio.

Nick Dellamaggiore

Nick is a software engineer who has been working to scale Kafka and related infrastructure at Robinhood. Prior to Robinhood, Nick built and scaled infrastructure and backend systems at Coursera and LinkedIn.