Presentation

When Streaming Needs Batch

« Current 2022

A streaming application is started once and then continuously ingests endless, fairly steady streams of events. That's as far as the theory goes.

Unfortunately, reality is more complicated. Over time, your application's ability to process large historical data sets robustly, efficiently, and correctly will be critical:

  • for exploratory data analysis during development
  • for bootstrapping the initial state of an application
  • for back-filling following an outage or bugfix
  • for keeping up with bursty input streams

These scenarios call for batch processing techniques. Apache Flink is as streaming-first as it gets. Yet over the last few releases, the community has invested significant resources into unifying stream and batch processing across all layers of the stack, from the scheduler to the APIs.

In this talk, I'll introduce Apache Flink's approach to unified stream and batch processing and discuss, by example, how these scenarios can already be addressed today and what might be possible in the future.
