Announcing ksqlDB 0.8.0

The latest ksqlDB release introduces long-awaited features such as tunable retention and grace period for windowed aggregates, new built-in functions including LATEST_BY_OFFSET, a peek at the new server API under development, and more.

Tunable retention and grace period for windowed aggregates

ksqlDB supports window-based aggregation of events. You can use the following query, for instance, to count the number of pageviews per region in each minute-long interval:

CREATE TABLE pageviews_per_region AS
    SELECT regionid, COUNT(*) FROM pageviews
    WINDOW TUMBLING (SIZE 1 MINUTE)
    GROUP BY regionid
    EMIT CHANGES;
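
To make the windowing concrete, here is a minimal Python sketch of how a tumbling window places each event into exactly one fixed-size, non-overlapping interval before counting. It is a simplified model for illustration only, not ksqlDB's implementation; the function names, region names, and millisecond timestamps are hypothetical.

```python
from collections import Counter

WINDOW_SIZE_MS = 60_000  # corresponds to SIZE 1 MINUTE


def window_start(timestamp_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % size_ms)


def count_per_region(events, size_ms=WINDOW_SIZE_MS):
    """Count events per (regionid, window start) pair."""
    counts = Counter()
    for regionid, timestamp_ms in events:
        counts[(regionid, window_start(timestamp_ms, size_ms))] += 1
    return dict(counts)


# Hypothetical pageview events: (regionid, event timestamp in milliseconds).
events = [
    ("us-east", 5_000),
    ("us-east", 59_999),
    ("us-east", 60_000),  # first millisecond of the second window
    ("eu-west", 61_000),
]
result = count_per_region(events)
```

Each event lands in exactly one window, so the event at 60,000 ms counts toward the second minute rather than the first.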

However, the aggregation results for each (minute-long) window can’t be stored forever, because storage space is finite. The length of time that ksqlDB keeps windowed results is called the retention time.

With ksqlDB 0.8.0, this retention time is configurable in the aggregation statement itself by specifying a retention clause, such as RETENTION 7 DAYS, in the window expression:

CREATE TABLE pageviews_per_region AS
    SELECT regionid, COUNT(*) FROM pageviews
    WINDOW TUMBLING (SIZE 1 MINUTE, RETENTION 7 DAYS)
    GROUP BY regionid
    EMIT CHANGES;

This is very helpful for applications that rely on ksqlDB’s query serving layer. For example, it’s now possible to materialize windowed metrics inside ksqlDB and store the data for weeks or months to directly support a dashboard served by ksqlDB.
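
As a rough mental model of retention, the Python sketch below drops windowed results once their window start falls further behind the latest observed event time than the retention period. This is an assumption-laden simplification: the real storage layer expires data in coarser units and may keep results somewhat longer. The function name and sample values are hypothetical.

```python
DAY_MS = 24 * 60 * 60 * 1000
RETENTION_MS = 7 * DAY_MS  # corresponds to RETENTION 7 DAYS


def purge_expired(windows, stream_time_ms, retention_ms=RETENTION_MS):
    """Keep only results whose window start is within the retention period.

    `windows` maps (regionid, window_start_ms) to a count, and
    `stream_time_ms` is the highest event timestamp seen so far.
    """
    cutoff = stream_time_ms - retention_ms
    return {key: count for key, count in windows.items() if key[1] >= cutoff}


# Hypothetical materialized results: one old window and one recent window.
windows = {
    ("us-east", 0): 2,
    ("us-east", 8 * DAY_MS): 5,
}
kept = purge_expired(windows, stream_time_ms=8 * DAY_MS + 30_000)
```

With a seven-day retention, the window that started eight days before the latest event is no longer queryable, while the recent one remains.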

When working with windowed aggregates, another parameter of interest is the grace period: the length of time each window continues to accept and process late-arriving events after the window ends. This matters because event streaming applications often have to handle events that arrive after their window has already ended.

With ksqlDB 0.8.0, you can configure the grace period for late-arriving events by adding a clause such as GRACE PERIOD 10 MINUTES to the window expression:

CREATE TABLE pageviews_per_region AS
    SELECT regionid, COUNT(*) FROM pageviews
    WINDOW TUMBLING (SIZE 1 MINUTE, RETENTION 7 DAYS, GRACE PERIOD 10 MINUTES)
    GROUP BY regionid
    EMIT CHANGES;
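
The effect of the grace period can be sketched in Python as follows. The model assumes that "now" is the highest event timestamp seen so far (often called stream time), and that a late event is still counted as long as stream time has not passed the end of its window plus the grace period. This is a simplified illustration rather than ksqlDB's actual code; the function name and sample events are hypothetical.

```python
WINDOW_SIZE_MS = 60_000   # corresponds to SIZE 1 MINUTE
GRACE_MS = 10 * 60_000    # corresponds to GRACE PERIOD 10 MINUTES


def count_with_grace(events, size_ms=WINDOW_SIZE_MS, grace_ms=GRACE_MS):
    """Count events per (regionid, window start), dropping events that are too late."""
    counts = {}
    dropped = []
    stream_time = 0  # highest event timestamp observed so far
    for regionid, timestamp_ms in events:
        stream_time = max(stream_time, timestamp_ms)
        start = timestamp_ms - (timestamp_ms % size_ms)
        window_end = start + size_ms
        if window_end + grace_ms <= stream_time:
            # The window closed before this event arrived.
            dropped.append((regionid, timestamp_ms))
            continue
        counts[(regionid, start)] = counts.get((regionid, start), 0) + 1
    return counts, dropped


# Hypothetical events in arrival order: (regionid, event timestamp in milliseconds).
events = [
    ("us-east", 10_000),
    ("us-east", 300_000),  # stream time advances to minute 5
    ("us-east", 30_000),   # late, but within the 10-minute grace period
    ("us-east", 720_000),  # stream time advances to minute 12
    ("us-east", 45_000),   # too late: the first window closed at minute 11
]
counts, dropped = count_with_grace(events)
```

The third event updates the first window even though it arrives out of order, while the last event is discarded because its window is already closed.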

For more details, see the documentation.

New aggregation function: LATEST_BY_OFFSET

With the release of ksqlDB 0.8.0, we’re happy to introduce one of the most highly requested built-in functions: LATEST_BY_OFFSET. The LATEST_BY_OFFSET aggregation function is used to track the latest value of a column when aggregating events from a stream into a table. As the function name suggests, latest is determined by offset order rather than timestamp order.

For example, consider a stream of IoT data:

CREATE STREAM sensor_readings (sensorID BIGINT, temp DOUBLE, quality INT)
    WITH (KAFKA_TOPIC='readings', VALUE_FORMAT='JSON', KEY='sensorID');

…with these example records:

{"sensorId": 1224, "temp": 56.4, "quality": 3},
{"sensorId": 2658, "temp": 56.4, "quality": 3},
{"sensorId": 1224, "temp": 96.4, "quality": 1},
...

A user interested in capturing the latest temperature and quality value for each sensor can do so with the following query:

CREATE TABLE aggregate_sensor_readings AS
    SELECT sensorId, LATEST_BY_OFFSET(temp), LATEST_BY_OFFSET(quality)
    FROM sensor_readings
    GROUP BY sensorId
    EMIT CHANGES;
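
The semantics of this query can be approximated with a short Python sketch: walk the records in offset order and let each record overwrite the stored values for its key. It is only an illustration of "latest by offset", not ksqlDB's implementation; the function name is hypothetical, the records are the three examples above, and null handling is ignored.

```python
def latest_by_offset(records, key_field, value_fields):
    """Build a table holding the most recently read values for each key.

    `records` is assumed to be ordered by offset, so later entries win
    regardless of any timestamp the records might carry.
    """
    table = {}
    for record in records:
        row = table.setdefault(record[key_field], {})
        for field in value_fields:
            row[field] = record[field]
    return table


# The example records from above, in offset order.
records = [
    {"sensorId": 1224, "temp": 56.4, "quality": 3},
    {"sensorId": 2658, "temp": 56.4, "quality": 3},
    {"sensorId": 1224, "temp": 96.4, "quality": 1},
]
table = latest_by_offset(records, "sensorId", ["temp", "quality"])
```

For these records, sensor 1224 ends up with temp 96.4 and quality 1 (from its second reading), and sensor 2658 keeps temp 56.4 and quality 3.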

And there’s more!

ksqlDB 0.8.0 also includes an early look at the new client and server API proposed in KLIP-15. This internal set of changes lays the groundwork for new features and enhancements to ksqlDB’s developer experience. We’re excited to share more of the KLIP-15 implementation in a future release and in other blog posts coming soon.

This release also made ksqlDB’s integration with Kafka Connect a little easier, as the ksqlDB server and CLI Docker images now ship with confluent-hub, an easy-to-use tool for installing connectors. With the inclusion of confluent-hub in ksqlDB, downloading connectors is as simple as a single command with a Docker image you already have. Check out the tutorial for running Kafka Connect embedded in ksqlDB for an example.

In addition, ksqlDB 0.8.0 introduces two more built-in functions, REGEXP_EXTRACT and ARRAY_LENGTH, as well as various bug fixes and other improvements. See the changelog for the complete list.

If you haven’t already, join us in our #ksqldb Confluent Community Slack channel and get started with ksqlDB today!

Victoria Xia joined the ksqlDB Team at Confluent in 2018 after completing her bachelor’s and master’s in electrical engineering and computer science at the Massachusetts Institute of Technology (MIT). Since then, she’s worked on a variety of projects spanning monitoring and alerting, performance benchmarking, security, and Confluent Cloud ksqlDB.
