
Should You Read Kafka as a Stream or in Batch? Should You Even Care?

Should you consume Kafka as a stream or in batch? When should you choose each one? Which is more efficient and cost-effective?

In this talk we'll give you the tools and metrics to decide which solution to apply when, and show you a real-life example with cost and time comparisons.

To highlight the differences, we'll dive into a project we've done, transitioning from reading Kafka in a stream to reading it in batch.

By turning conventional thinking on its head and reading our multi-petabyte Kafka stream in batch using Spark and Airflow, we've achieved a 65% cost reduction while also getting a more scalable and resilient solution.

We'll explore the tradeoffs and give you the metrics and intuition you'll need to make such decisions yourself.

We'll cover:

  • Costs of processing in stream compared to batch
  • Scaling up for bursts and reprocessing
  • Making the tradeoff between wait times and costs
  • Recovering from outages
  • And much more...


Speakers

Ido Nadler
Opher Dubrovsky