Advanced Streaming Analytics with Apache Flink and Apache Kafka

Kafka Summit 2016 | Systems Track

Speaker

Stephan Ewen, CTO, Data Artisans

Description: 

Flink and Kafka are popular components for building open source stream processing infrastructure. We present how Flink integrates with Kafka to provide a platform whose feature set matches the challenging requirements of advanced stream processing applications. In particular, we will dive into the following points:

  • Flink’s support for event-time processing, how it handles out-of-order streams, and how it can perform analytics on historical and real-time streams served from Kafka’s persistent log using the same code. We present Flink’s windowing mechanism, which supports time-, count-, and session-based windows, and allows mixing event-time and processing-time semantics in one program.
  • How Flink’s checkpointing mechanism integrates with Kafka to provide fault tolerance and consistent stateful applications with exactly-once semantics.
  • We will discuss “Savepoints”, which allow users to save the state of a streaming program at any point in time. Together with a durable event log like Kafka, savepoints allow users to pause and resume streaming programs, go back to prior states, or switch to different versions of the program, while preserving exactly-once semantics.
  • We explain the techniques behind the combination of low latency and high throughput in streaming, and how the latency/throughput trade-off can be configured.
  • We will give an outlook on current developments for streaming analytics, such as streaming SQL and complex event processing.
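
To make the event-time point above concrete, here is a minimal, library-free Python sketch of tumbling event-time windows over an out-of-order stream. It is not Flink's API or implementation. The function name, the tuple layout of the events, and the "maximum timestamp seen minus a fixed bound" watermark rule are all assumptions chosen for illustration, and the firing condition is simplified.

```python
from collections import defaultdict


def tumbling_event_time_counts(events, window_size, max_out_of_orderness):
    """Count events per key in tumbling event-time windows.

    `events` is an iterable of (event_timestamp, key) pairs in arrival
    order, which may differ from timestamp order. Returns a list of
    (window_start, key, count) tuples in the order the windows fire.

    Hypothetical helper for illustration only; not part of any library.
    """
    open_windows = defaultdict(lambda: defaultdict(int))  # start -> key -> count
    watermark = float("-inf")
    fired = []

    for ts, key in events:
        window_start = ts - (ts % window_size)

        # An event is late if its window has already fired; drop it.
        if window_start + window_size <= watermark:
            continue
        open_windows[window_start][key] += 1

        # Assumed watermark rule: no event older than
        # (largest timestamp seen - max_out_of_orderness) is still expected.
        watermark = max(watermark, ts - max_out_of_orderness)

        # Fire every window whose end the watermark has passed.
        ready = sorted(w for w in open_windows if w + window_size <= watermark)
        for start in ready:
            for k, count in sorted(open_windows.pop(start).items()):
                fired.append((start, k, count))

    # End of a bounded (historical) input: flush the remaining windows.
    for start in sorted(open_windows):
        for k, count in sorted(open_windows[start].items()):
            fired.append((start, k, count))
    return fired
```

Because windows are assigned by the timestamp carried in each event rather than by arrival time, the same function gives the same counts whether the events are replayed from a log or arrive live, as long as the disorder stays within the assumed bound.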