Temporal-Joins in Kafka Streams and ksqlDB

Joins in Kafka Streams and ksqlDB are a killer feature for data processing, and basic join semantics are well understood. However, in a streaming world, records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of temporal join semantics. For joins, timestamps are as important as the actual data, and it is important to understand how they impact the join result.

In this talk, we dive deep into the different types of joins, with a focus on their temporal aspects. Furthermore, we relate the individual join operators to the overall "time engine" of the Kafka Streams query runtime and explain its relationship to operator semantics. To allow developers to apply their knowledge of temporal join semantics, we provide best practices, tips and tricks to "bend" time, and configuration advice to get the desired join results. Finally, we give an overview of recent developments and an outlook on future ones that improve joins even further.

Presenter

Matthias J. Sax

Matthias is an Apache Kafka committer and PMC member, and works as a software engineer at Confluent. His focus is data stream processing in general, and thus he contributes to ksqlDB and Kafka Streams. Before joining Confluent, Matthias conducted research on distributed data stream processing systems at Humboldt-University of Berlin, where he received his Ph.D. Matthias is also a committer at Apache Flink and Apache Storm.