Online Talk

Stream Processing Fundamentals: A Confluent Online Talk Series

Available On-Demand

What is stream processing, and how does it work?

Stream processing is a data processing technology used to collect, store, and manage continuous streams of data as they are produced or received. Also referred to as event streaming or complex event processing (CEP), stream processing has grown rapidly in recent years because it can simplify data architectures, provide real-time insights and analytics, and react as it happens to time-sensitive data from sources such as IoT devices, multiplayer video games, and location-based applications.

Today, stream processing often runs behind the scenes of everything from billing, fulfillment, and fraud detection to Netflix recommendations and ride-share apps like Lyft. These backend processes may need to be decoupled from the frontend, where users expect instant results at the click of a button. Apache Kafka® has become a de facto standard for ingesting event-based data and is considered the central nervous system for data in many organizations.

In this three-part online talk series, we will cover everything you need to know about stream processing, including:

  • The benefits of event stream processing with Apache Kafka and how it works
  • Why stream processing is important in today's data-driven world
  • An introduction to the two most common ways to get started with stream processing in Kafka
    • The Kafka Streams API, an open source client library for building stream processing applications
    • Confluent ksqlDB, an event streaming database purpose-built to help developers create stream processing applications on top of Apache Kafka

Part 1: How Stream Processing Works: Basic Concepts of Streaming

The event-driven model provides many benefits: it decouples services from one another, makes the architecture more pluggable, and lets services evolve independently.

Such systems typically use Apache Kafka as the foundation. Kafka acts as a central data plane that holds shared events and keeps services in sync. Its distributed cluster technology provides availability, resiliency, and performance properties that strengthen the architecture, so programmers can focus on writing and deploying client applications that run load-balanced and highly available.

This session will cover the use of Apache Kafka as a platform for streaming data and how stream processing can make your data systems more flexible and less complex.

Register now to learn:

  • Advantages of event stream processing over batch processing
  • Stream processing use cases
  • How Kafka Streams and ksqlDB compare at a high level
  • Basic stream processing concepts

Part 2: Stream Processing with Kafka Streams

An event streaming platform would not be complete without the ability to process data as it arrives. The Streams API within Apache Kafka is a powerful, lightweight library for on-the-fly processing: it lets you aggregate data, group events into time windows, join streams of data, and more. Perhaps best of all, it runs as a library inside your own Java application on top of Kafka, keeping your workflow intact with no extra clusters to maintain.
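To make the idea of windowed aggregation concrete, here is a minimal sketch in plain Python. It is not the Kafka Streams API (which is a Java library) and uses no Kafka client; the function name `tumbling_window_counts` and the sample page-view events are assumptions made up for illustration. It simply counts events per key within fixed, non-overlapping time windows, which is the kind of operation a stream processing library performs continuously as events arrive.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (key, window start) using fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for key, timestamp_ms in events:
        # Align the timestamp down to the start of the window it falls into.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


# Hypothetical page-view events: (user id, event time in milliseconds).
page_views = [
    ("alice", 1_000),
    ("bob", 2_500),
    ("alice", 4_000),
    ("alice", 61_000),
]

# One-minute windows: alice has two views in the first window and one in the second.
counts = tumbling_window_counts(page_views, window_ms=60_000)
for (user, window_start), n in sorted(counts.items()):
    print(f"user={user} window_start={window_start} count={n}")
```

A real stream processor applies the same logic incrementally to an unbounded stream and also has to deal with late-arriving events and fault-tolerant state, which this sketch leaves out.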

Register now to learn:

  • The purpose of Kafka Streams
  • How Kafka Streams integrates with your applications
  • The features of Kafka Streams
  • How an application is built using the Streams DSL (Domain-Specific Language)

Part 3: Introduction to ksqlDB

You’ve got streams of data that you want to process and store? You’ve got events from which you’d like to derive state or build aggregates? And you want to do all of this in a scalable and fault-tolerant manner? It’s just as well that Kafka and ksqlDB exist!

ksqlDB enables you to build event streaming applications with the same ease and familiarity as building traditional applications on a relational database. It also simplifies the underlying architecture for these applications so you can build powerful, real-time systems with just a few SQL statements.

This talk will cover the concepts and capabilities of ksqlDB. We’ll show how you can apply transformations to a stream of events as it flows from one Kafka topic to another. We’ll also discuss using ksqlDB connectors to bring in data from other systems and using that data to join and enrich streams.
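As a rough illustration of what joining and enriching a stream means, the sketch below uses plain Python rather than ksqlDB, where the same logic would be written as SQL statements. The `enrich` function, the `orders` events, and the `customers` lookup data are all hypothetical. Each order event is joined with reference data by key, and the enriched stream is then filtered.

```python
def enrich(orders, customers):
    """Join each order event with its customer record; skip orders with no match."""
    for order in orders:
        customer = customers.get(order["customer_id"])
        if customer is None:
            continue  # inner-join behaviour: drop events without reference data
        yield {**order, "region": customer["region"]}


# Hypothetical reference data, e.g. brought in from another system.
customers = {"c1": {"region": "EU"}, "c2": {"region": "US"}}

# Hypothetical stream of order events.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 40},
    {"order_id": 2, "customer_id": "c3", "amount": 15},
    {"order_id": 3, "customer_id": "c2", "amount": 75},
]

enriched = list(enrich(orders, customers))

# A simple transformation on the enriched stream: keep only large orders.
large_orders = [o for o in enriched if o["amount"] >= 50]
print(large_orders)
```

In a streaming system the input is unbounded and the lookup data changes over time, so the join runs continuously instead of over fixed lists.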

Register now to learn:

  • What ksqlDB is and how it works
  • ksqlDB use cases, architecture and components
  • How to process streams of events
  • The semantics of streams and tables, and of push and pull queries
  • How to use the ksqlDB API to get state directly from the materialized store
  • What makes ksqlDB elastically scalable and fault-tolerant
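The distinction between streams and tables, and between push and pull queries, can be sketched in a few lines of plain Python. This is a conceptual model only, not the ksqlDB API; the class `MaterializedCount` and its methods are hypothetical names. A stream is a sequence of events, a table is the current state derived from them, a pull query reads that state once, and a push query is notified each time the state changes.

```python
class MaterializedCount:
    """Toy table that materializes a running count per key from a stream of events."""

    def __init__(self):
        self._state = {}        # the "table": latest count per key
        self._subscribers = []  # callbacks standing in for push queries

    def subscribe(self, callback):
        """Push-style query: be called with (key, count) on every update."""
        self._subscribers.append(callback)

    def apply(self, key):
        """Consume one event from the stream and update the table."""
        self._state[key] = self._state.get(key, 0) + 1
        for callback in self._subscribers:
            callback(key, self._state[key])

    def get(self, key):
        """Pull-style query: return the current value once."""
        return self._state.get(key, 0)


table = MaterializedCount()
updates = []
table.subscribe(lambda key, count: updates.append((key, count)))

# Hypothetical stream of events keyed by region.
for region in ["eu", "us", "eu"]:
    table.apply(region)

print(table.get("eu"))  # pull: current state for one key
print(updates)          # push: every change, in order
```

The stream keeps the full history of events, while the table keeps only the latest value per key, which is why the same data can answer both kinds of query.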