We are excited to announce the ksqlDB 0.28.2 release as well as new cloud-specific improvements! This release simplifies the getting started experience, helps you run and monitor critical pipelines, and adds new core functionality to the query engine, such as support for pausing and resuming persistent queries and a new EMIT FINAL implementation. We also added new trigonometric functions. The sections below cover each change in more detail, and the full list of updates and improvements is described in the changelog.
We are pleased to announce the following improvements to ksqlDB in Confluent Cloud, aiming at simplifying the getting started experience as well as further supporting critical production pipelines.
The Confluent Cloud ksqlDB UI now includes a topic import wizard to make it easier for you to start writing SQL queries against your Kafka data. This wizard recognizes any existing topics in your environment that have a Schema Registry schema associated with them, and then uses that information to automatically generate and submit the appropriate CREATE STREAM statements. When the topic import wizard completes, you can immediately begin querying your new streams using SQL.
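As a rough illustration, the statements the wizard submits probably resemble the sketch below. The topic name `orders` and the Avro value format are assumptions made for this example; the wizard's exact output is not shown here. Because the topic has a schema registered in Schema Registry, ksqlDB can infer the columns, so none are listed.

```sql
-- Hypothetical topic `orders` with an Avro value schema in Schema Registry.
-- The column list is omitted because ksqlDB infers it from the registered schema.
CREATE STREAM orders WITH (
  KAFKA_TOPIC = 'orders',
  VALUE_FORMAT = 'AVRO'
);

-- Once the stream exists, it can be queried right away.
SELECT * FROM orders EMIT CHANGES LIMIT 5;
```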
With the new ksqlDB REST API and corresponding Terraform support, you can provision and manage ksqlDB clusters programmatically. This gives you new options for deployments, increases developer productivity, and lets you use ksqlDB as part of infrastructure-as-code pipelines. For more information on getting started with the CRUD API, see the documentation.
ksqlDB clusters are now provisioned from a pool of pre-warmed instances, which keeps provisioning time consistently below one minute. Note that this is only available for 1-CSU clusters at this time.
ksqlDB clusters have a new maximum size of 28 CSU (compared to 12 CSU previously), allowing support for more demanding workloads.
We added six new metrics to monitor ksqlDB in Confluent Cloud (in preview), helping to track progress and errors for ksqlDB queries and clusters. The new metrics are: bytes consumed, bytes produced, offsets processed, offset lag, number of processing errors, and number of restarts. More information is available in the documentation. Storage and query saturation metrics are also now generally available; see the documentation for details.
In addition to EMIT CHANGES, which emits every update to a window, we are introducing EMIT FINAL to ksqlDB, which lets you emit only a single output for each window. The EMIT FINAL keyword outputs the final aggregation result after a window closes. You can write queries like the example below to use it:
SELECT col1,COUNT(*) as COUNT FROM test WINDOW TUMBLING (SIZE 2 MILLISECONDS, GRACE PERIOD 1 MILLISECONDS) GROUP BY col1 EMIT FINAL;
Previously, EMIT FINAL was implemented in Kafka Streams using an in-memory store. Because windowed aggregation state can be large, this carried a high risk of out-of-memory (OOM) exceptions, so EMIT FINAL was not enabled in ksqlDB by default. In this release, ksqlDB adopts a new Kafka Streams implementation of EMIT FINAL that replaces the in-memory store with a disk-backed store while keeping performance high. With the disk-backed store, EMIT FINAL queries are no longer exposed to OOM issues caused by potentially large windowed aggregation state.
In this release, ksqlDB adds the ability to pause and resume persistent queries. There are a few motivations behind this new feature; one example is that users can pause a query and modify the downstream parts of a data pipeline. In addition to iterating on a data pipeline, there may be operational cases where pausing a query temporarily can help manage processing or disk resources used by a query. The syntax is similar to TERMINATE (which stops a query completely). The PAUSE and RESUME commands each take either a query_id or the ALL keyword.
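The commands described above can be sketched as follows. The query ID `CSAS_ORDERS_ENRICHED_5` is a hypothetical placeholder; `SHOW QUERIES` lists the actual IDs on your cluster.

```sql
-- List persistent queries to find the ID you want to pause.
SHOW QUERIES;

-- Pause a single persistent query (hypothetical ID).
PAUSE CSAS_ORDERS_ENRICHED_5;

-- ... change the downstream parts of the pipeline here ...

-- Resume the same query once the changes are in place.
RESUME CSAS_ORDERS_ENRICHED_5;

-- Alternatively, pause and later resume every persistent query at once.
PAUSE ALL;
RESUME ALL;
```

Unlike TERMINATE, which stops a query completely, a paused query can be picked up again with RESUME.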
In this release, ksqlDB adds fourteen new trigonometric scalar functions. There are four regular trigonometric functions, SIN(), COS(), TAN(), and COT(), as well as four inverse functions, ASIN(), ACOS(), ATAN(), and ATAN2(). In addition, there are three new hyperbolic functions, SINH(), COSH(), and TANH(). These new functions use radians, but RADIANS() and DEGREES() are available to convert units as necessary. PI() is also available to retrieve the constant π.
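Here is a minimal sketch of how these functions combine with the unit conversions. The stream `readings` and its DOUBLE columns `angle_deg`, `x`, and `y` are assumptions made for this example.

```sql
-- Hypothetical stream `readings` with DOUBLE columns angle_deg, x, and y.
SELECT
  angle_deg,
  -- Convert degrees to radians before calling the trigonometric functions.
  SIN(RADIANS(angle_deg)) AS sine,
  COS(RADIANS(angle_deg)) AS cosine,
  -- ATAN2 returns radians; convert the result back to degrees.
  DEGREES(ATAN2(y, x)) AS heading_deg
FROM readings
EMIT CHANGES;
```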
Again, thank you for using ksqlDB. Please do not hesitate to contact us with more feedback or comments! For more details about the changes, please refer to the changelog. Get started with ksqlDB today, via the standalone distribution or with Confluent, and join the community to ask questions and find new resources.