Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset

To unlock the most value from event streams, organizations often need to build applications that analyze Kafka data. How can they build these real-time analytics applications? In this talk, we examine an indexing approach that enables fast SQL analytics on data from Kafka without flattening or denormalizing it. Rockset is a real-time indexing database that builds an inverted index, a columnar index and a row index on all fields of your Kafka messages, including nested fields and arrays. This Converged Index accelerates several types of analytic queries (search, aggregations and joins) without the need to denormalize or transform data for performance reasons. Because indexing delivers significant gains in query performance, new data must also be indexed in a timely manner. We discuss several strategies for efficient ingestion and indexing from Kafka, including rollups, write optimizations on the underlying RocksDB storage engine, and the disaggregation of ingest and query compute.
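To make the idea of indexing every field three ways more concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Rockset's implementation: the class `ToyConvergedIndex`, the helper `flatten`, the `items[]` path notation and the sample events are all hypothetical names chosen for this example. Each message is stored whole (row index), each field path maps to its values per document (columnar index), and each (path, value) pair maps to the documents containing it (inverted index), with nested fields and arrays indexed directly rather than flattened in advance.

```python
from collections import defaultdict


def flatten(value, prefix=""):
    """Yield (path, scalar) pairs for every leaf of a nested dict or list."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # Array elements share one path, marked with "[]" (a made-up notation).
        for child in value:
            yield from flatten(child, f"{prefix}[]")
    else:
        yield prefix, value


class ToyConvergedIndex:
    """Hypothetical toy: three indexes maintained on every field of a message."""

    def __init__(self):
        self.row = {}                     # doc_id -> original message
        self.column = defaultdict(dict)   # path -> {doc_id: [values]}
        self.inverted = defaultdict(set)  # (path, value) -> {doc_id, ...}

    def add(self, doc_id, doc):
        self.row[doc_id] = doc
        for path, value in flatten(doc):
            self.column[path].setdefault(doc_id, []).append(value)
            self.inverted[(path, value)].add(doc_id)

    def search(self, path, value):
        """Selective lookup served by the inverted index."""
        return sorted(self.inverted.get((path, value), set()))

    def total(self, path):
        """Aggregation served by scanning a single column."""
        per_doc = self.column.get(path, {})
        return sum(v for values in per_doc.values() for v in values)


# Sample events standing in for Kafka messages (invented for illustration).
events = [
    {"user": {"id": "u1", "country": "US"},
     "items": [{"sku": "a", "price": 5}, {"sku": "b", "price": 7}]},
    {"user": {"id": "u2", "country": "DE"},
     "items": [{"sku": "a", "price": 5}]},
]

index = ToyConvergedIndex()
for doc_id, event in enumerate(events):
    index.add(doc_id, event)

matches = index.search("items[].sku", "a")   # search on a nested array field
revenue = index.total("items[].price")       # aggregation over one column
first_user = index.row[matches[0]]["user"]   # fetch the full message by id
```

In this sketch a selective predicate touches only the inverted index, an aggregation touches only one column, and the row index returns the complete message once its id is known. A real system would also persist these structures and choose among them per query, which the toy does not attempt.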


Dhruba Borthakur