Advanced Streaming Analytics with Apache Flink and Apache Kafka

Kafka Summit 2016 | Systems Track


Stephan Ewen, CTO, Data Artisans


Flink and Kafka are popular building blocks for an open source stream processing infrastructure. We present how Flink integrates with Kafka to provide a platform with a unique feature set that matches the challenging requirements of advanced stream processing applications. In particular, we will dive into the following points:

  • Flink’s support for event-time processing, how it handles out-of-order streams, and how it can perform analytics on historical and real-time streams served from Kafka’s persistent log using the same code. We present Flink’s windowing mechanism, which supports time-, count-, and session-based windows and allows mixing event-time and processing-time semantics in one program (a minimal sketch follows this list).
  • How Flink’s checkpointing mechanism integrates with Kafka for fault tolerance, enabling consistent stateful applications with exactly-once semantics.
  • We will discuss “Savepoints”, which allow users to save the state of the streaming program at any point in time. Together with a durable event log like Kafka, savepoints allow users to pause/resume streaming programs, go back to prior states, or switch to different versions of the program, while preserving exactly-once semantics.
  • We explain the techniques behind combining low-latency and high-throughput streaming, and how the latency/throughput trade-off can be configured (see the second sketch after this list).
  • We will give an outlook on current developments for streaming analytics, such as streaming SQL and complex event processing.
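
As a concrete illustration of the first two points, here is a minimal sketch against the Flink 1.1+ DataStream API. It reads a Kafka topic with the Flink Kafka consumer, assigns event-time timestamps and watermarks that tolerate out-of-order records, counts events per key in event-time session windows, and enables checkpointing so that operator state and Kafka offsets are restored consistently on failure. The topic name, group id, and parsing helpers are placeholders, not part of the talk material.

```java
import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaEventTimeSessions {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Time is taken from the events themselves, not from the processing machines' clocks.
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Periodic checkpoints: operator state and Kafka offsets are snapshotted together,
        // which is what gives exactly-once semantics for application state on recovery.
        env.enableCheckpointing(5000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "flink-demo");                      // placeholder group id

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer09<>("events", new SimpleStringSchema(), props));

        events
            // Extract per-record event timestamps and emit watermarks that
            // tolerate up to 10 seconds of out-of-order arrival.
            .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(10)) {
                    @Override
                    public long extractTimestamp(String event) {
                        return timestampOf(event);                        // placeholder parser
                    }
                })
            .map(new MapFunction<String, Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> map(String event) {
                    return Tuple2.of(keyOf(event), 1L);                   // placeholder key extraction
                }
            })
            .keyBy(0)
            // Event-time session windows: a session closes after 30 minutes of inactivity.
            .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
            .sum(1)
            .print();

        env.execute("Event-time sessions over a Kafka topic");
    }

    private static long timestampOf(String event) { return 0L; }          // placeholder
    private static String keyOf(String event) { return event; }           // placeholder
}
```

Because time is driven by the records rather than the wall clock, the same program can reprocess historical data from Kafka’s persistent log or run on live data without any code changes.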
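For the latency/throughput point, Flink groups records into network buffers before shipping them downstream; the buffer timeout bounds how long a partially filled buffer may wait. A lower timeout favours latency, a higher one favours throughput. A minimal sketch of the configuration knob (the 5 ms value is only illustrative):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A network buffer is shipped when it is full or when this timeout expires,
        // whichever comes first. Lower values reduce latency at the cost of per-record
        // overhead; higher values favour throughput.
        env.setBufferTimeout(5);   // milliseconds; illustrative value

        // Trivial pipeline so the sketch runs end to end.
        env.fromElements(1, 2, 3).print();

        env.execute("Latency/throughput trade-off via buffer timeout");
    }
}
```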