
Temporal-Joins in Kafka Streams and ksqlDB

Joins in Kafka Streams and ksqlDB are a killer feature for data processing, and basic join semantics are well understood. However, in a streaming world, records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of temporal join semantics. For joins, timestamps are as important as the actual data, and it is important to understand how they impact the join result.

In this talk, we take a deep dive into the different types of joins, with a focus on their temporal aspects. Furthermore, we relate the individual join operators to the overall "time engine" of the Kafka Streams query runtime and explain its relationship to operator semantics. To allow developers to apply their knowledge of temporal join semantics, we provide best practices, tips and tricks to "bend" time, and configuration advice to get the desired join results. Last, we give an overview of recent developments, and an outlook on future ones, that improve joins even further.


Matthias J. Sax

Matthias is a Kafka PMC member and software engineer at Confluent, working mainly on Kafka’s Streams API. Prior to Confluent, he was a PhD student at Humboldt-University of Berlin, conducting research on data stream processing systems. Matthias is also a committer at Apache Flink and Apache Storm.