KSQL

Streaming SQL for Apache Kafka

KSQL is an open source, Apache 2.0 licensed streaming SQL engine that enables stream processing against Apache Kafka™.

KSQL makes it easy to read, write, and process streaming data in real time, at scale, using SQL-like semantics. It offers an easy way to express stream processing transformations as an alternative to writing an application in a programming language such as Java or Python.

Currently available as a developer preview, KSQL provides powerful stream processing capabilities such as joins, aggregations, event-time windowing, and more!


KSQL: Query your streams without writing code
Enjoy real-time, fault-tolerant stream processing against Kafka today.

Get up and running with these helpful resources

WATCH THE ONLINE TALK:
STREAMING SQL FOR APACHE KAFKA

Learn how to build real-time streaming applications with KSQL. This talk explains the KSQL engine architecture, and how to design and deploy interactive, continuous queries for streaming ETL and real-time analytics.


Use Cases and Examples

01

Streaming ETL

Apache Kafka is a popular choice for powering data pipelines. KSQL makes it simple to transform data within the pipeline, readying messages to cleanly land in another system.

CREATE STREAM vip_actions AS
  SELECT userid, page, action
  FROM clickstream c
  LEFT JOIN users u ON c.userid = u.user_id
  WHERE u.level = 'Platinum';
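
The query above reads from a clickstream stream and a users table that must already be registered with KSQL. A minimal sketch of how they might be declared follows. The topic names, column types, and JSON format are assumptions made for illustration, and the KEY property follows early KSQL releases, so the exact syntax may differ in your version.

```sql
-- Hypothetical source declarations: topic names, column types, and the
-- value format are assumptions, not taken from the original example.
CREATE STREAM clickstream (userid VARCHAR, page VARCHAR, action VARCHAR)
  WITH (KAFKA_TOPIC='clickstream', VALUE_FORMAT='JSON');

-- The table is keyed on user_id so it can be joined against the stream.
CREATE TABLE users (user_id VARCHAR, level VARCHAR)
  WITH (KAFKA_TOPIC='users', VALUE_FORMAT='JSON', KEY='user_id');
```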

02

Anomaly Detection

KSQL is a good fit for identifying patterns or anomalies in real-time data. By processing the stream as data arrives, you can identify and surface out-of-the-ordinary events with millisecond latency.

CREATE TABLE possible_fraud AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY card_number
  HAVING count(*) > 3;
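
A tumbling window splits time into fixed, non-overlapping intervals, so a burst of attempts that straddles a window boundary can be divided between two windows and stay under the threshold in each. One possible variation, sketched here, uses a hopping window, whose intervals overlap. The table name and the window sizes are illustrative assumptions, not part of the original example.

```sql
-- Hypothetical variant: 10-second windows that advance every 5 seconds,
-- so each attempt is counted in two overlapping windows.
CREATE TABLE possible_fraud_hopping AS
  SELECT card_number, count(*)
  FROM authorization_attempts
  WINDOW HOPPING (SIZE 10 SECONDS, ADVANCE BY 5 SECONDS)
  GROUP BY card_number
  HAVING count(*) > 3;
```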

03

Monitoring

Kafka’s ability to provide scalable, ordered messaging with stream processing makes it a common solution for log data monitoring and alerting. KSQL lends a familiar syntax for tracking, understanding, and managing alerts.

CREATE TABLE error_counts AS
  SELECT error_code, count(*)
  FROM monitoring_stream
  WINDOW TUMBLING (SIZE 1 MINUTE)
  WHERE type = 'ERROR'
  GROUP BY error_code;
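
To turn per-minute counts into alerts, one option is to keep only the error codes that cross a threshold, reusing the HAVING clause from the anomaly detection example. This is a sketch under assumptions: the table name, the error_count alias, and the threshold of 100 are hypothetical and should be tuned to your workload.

```sql
-- Hypothetical alerting table: retains only error codes seen more than
-- 100 times within a one-minute window.
CREATE TABLE error_alerts AS
  SELECT error_code, count(*) AS error_count
  FROM monitoring_stream
  WINDOW TUMBLING (SIZE 1 MINUTE)
  WHERE type = 'ERROR'
  GROUP BY error_code
  HAVING count(*) > 100;
```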