Hands-on Workshop: ZooKeeper to KRaft Without the Hassle | Secure Your Spot

Presentation

Implementing Real-Time Analytics with Kafka Streams

« Kafka Summit London 2023

Real-time analytics requires manifold query types, which, for example, perform custom aggregations over groups of events, such as ordered sequences of selected events that meet certain criteria on keys or respective values. In practice, there are several query types which are not yet supported out-of-the-box. Kafka Streams provides state stores, which enable maintaining arbitrary state. These state stores can be used to store data ready for real-time analytics. Interactive Queries facilitate to leverage application state from outside the streams application. With this, you can retrieve individual events or ranges of events as input for aggregation operations in real-time analytics.

As an example, we focus on order-preserving range queries. As one can read in the documentation, Kafka Streams’ Interactive Queries on ranges do not provide ordering guarantees. This may result in insufficient analytics and sorting large sets of events in main memory is not always feasible.

In our talk, we discuss different approaches and highlight an indexing strategy for guaranteeing the order of a range query. We will discuss the pros and cons and finally demonstrate a real-world example of our solution. Furthermore, we showcase how our approach can also be applied to other implementations of custom analytics queries.

Related Links

How Confluent Completes Apache Kafka eBook

Leverage a cloud-native service 10x better than Apache Kafka

Confluent Developer Center

Spend less on Kafka with Confluent, come see how