KSQL

Streaming SQL for Apache Kafka

KSQL is an open-source, Apache 2.0-licensed streaming SQL engine that enables stream processing against Apache Kafka®.

KSQL makes it easy to read, write, and process streaming data in real time, at scale, using SQL-like semantics. It lets you express stream processing transformations declaratively, as an alternative to writing an application in a programming language such as Java or Python.

Currently available as a developer preview, KSQL provides powerful stream processing capabilities such as joins, aggregations, event-time windowing, and more!


KSQL: Query your streams without writing code
Enjoy real-time, fault-tolerant stream processing against Kafka today.

Get up and running with these helpful resources

WATCH THE ONLINE TALK:
STREAMING SQL FOR APACHE KAFKA

Learn how to build real-time streaming applications with KSQL. This talk explains the KSQL engine architecture, and how to design and deploy interactive, continuous queries for streaming ETL and real-time analytics.


Use Cases and Examples

01

Streaming ETL

Apache Kafka is a popular choice for powering data pipelines. KSQL makes it simple to transform data within the pipeline, readying messages to cleanly land in another system.

CREATE STREAM vip_actions AS
SELECT
userid, page, action
FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
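
The query above presumes that a clickstream stream and a users table are already registered with KSQL. As a rough sketch only, they might be declared along these lines. The topic names, column types, and JSON value format are assumptions made for illustration and do not come from the original example, and the way a table's key is declared differs between KSQL versions.

```sql
-- Hypothetical source definitions for the vip_actions example.
-- Topic names, column types, and VALUE_FORMAT are illustrative assumptions.
CREATE STREAM clickstream (userid VARCHAR, page VARCHAR, action VARCHAR)
  WITH (KAFKA_TOPIC = 'clickstream', VALUE_FORMAT = 'JSON');

-- The KEY property reflects early KSQL releases; later versions declare
-- the key differently, so check the syntax for the version you run.
CREATE TABLE users (user_id VARCHAR, level VARCHAR)
  WITH (KAFKA_TOPIC = 'users', VALUE_FORMAT = 'JSON', KEY = 'user_id');
```

Declaring users as a table rather than a stream means each click is joined against the latest known record for that user instead of against every historical update.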

02

Anomaly Detection

KSQL is a good fit for identifying patterns or anomalies in real-time data. By processing the stream as data arrives, you can identify and surface out-of-the-ordinary events with millisecond latency.

CREATE TABLE possible_fraud AS
SELECT
card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY
card_number
HAVING count(*) > 3;

03

Monitoring

Kafka’s ability to provide scalable, ordered messages with stream processing makes it a common solution for log data monitoring and alerting. KSQL offers a familiar syntax for tracking, understanding, and managing alerts.

CREATE TABLE error_counts AS 
SELECT
error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE
type = 'ERROR'
GROUP BY error_code;