Robinhood uses Kafka in every line of its business, from stock and crypto trading to clearing and data analytics. One interesting aspect of our architecture is that many of our microservices leveraging Kafka are written in Python. When you combine Python's relatively slow performance, its reliance on process-based parallelism, and Robinhood's scale, the result is a massive fleet of application processes producing to and consuming from our Kafka clusters. This fleet generates an atypical workload on Kafka that warrants a deeper investment in scalability and reliability.
This talk discusses our investments in Kafka infrastructure for a large-scale Python-based environment:
kafkahood: our librdkafka-based client library wrapper that codifies best practices, sane defaults, and deep client-side observability.
kafkaproxy: a Rust-based sidecar proxy that reduces connection fan-in from Python gunicorn worker pools to our Kafka clusters.
We'll also present challenges we encountered along the way and share our learnings with the audience.
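To give a feel for the connection fan-in problem mentioned above, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names, the fleet numbers, and the simplified model in which every worker process opens one connection to every broker, while a sidecar holds one connection per broker on behalf of all workers on its host. It does not describe how kafkaproxy is actually implemented.

```python
# Illustrative upper-bound estimate of broker connections.
# All names and numbers are hypothetical, not Robinhood's real figures.


def direct_connections(hosts: int, workers_per_host: int, brokers: int) -> int:
    """Every worker process runs its own client and may connect to every broker."""
    return hosts * workers_per_host * brokers


def proxied_connections(hosts: int, brokers: int) -> int:
    """Assumed model: one sidecar per host holds one connection per broker."""
    return hosts * brokers


if __name__ == "__main__":
    hosts, workers_per_host, brokers = 500, 16, 30  # hypothetical fleet
    direct = direct_connections(hosts, workers_per_host, brokers)
    proxied = proxied_connections(hosts, brokers)
    print(f"direct:  {direct} connections")
    print(f"proxied: {proxied} connections")
    print(f"reduction factor: {direct // proxied}x")
```

Under this simplified model the reduction factor equals the number of worker processes per host, which is why process-based parallelism makes the fan-in so pronounced.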