Kafka Summit London 2022

View sessions and slides from Kafka Summit London 2022.

Keynotes

Modern Data Flow: Data Pipelines Done Right

  • Jay Kreps, Confluent
  • Avi Perez, Wix.com
  • Amit Gupta, Confluent

Jay's keynote discussed how to harness real-time data to help power customer experiences and internal business needs. Also featured was Avi Perez, who shared how Wix.com is using event streaming to power a full 7% of the internet’s websites.

Breakout Sessions

6 Nines: How Stripe keeps Kafka highly-available across the globe

  • Donny Nadolny, Stripe

In this talk we’ll discuss our solution to this problem: an in-house proxy layer and multi-cluster topology which we’ve built and operated over the past 3 years. Our proxy layer enables multiple Kafka clusters to work in coordination across the globe.

10 Things We Learned Re-imagining Kafka for the Cloud

  • Addison Huddy, Confluent

In this talk, I’ll walk through the top 10 lessons learned re-imagining Kafka for the Cloud, from storage to data balancing, to scaling and security.

A Hitchhiker's Guide to Apache Kafka Geo-Replication

  • Sanjana Kaundinya, Confluent
  • Rajini Sivaram, Confluent

The journey from single-cluster deployments to multi-cluster deployments can be daunting, as you need to deal with networking configurations, security models and operational challenges. Geo-replication support for Kafka has come a long way, with both open-source and commercial solutions.

A Kafka-based platform to process medical prescriptions of Germany’s health insurance system

  • Torben Meyer, bakdata GmbH
  • Janne Austermann, spectrumK GmbH
  • Antja Buksch, spectrumK GmbH

In this session, we present different aspects of the platform. We highlight the benefits of our approach - converting the complex FHIR schemas to Protobuf - compared to working directly with data in the FHIR format. We further showcase how we use Kafka Streams to integrate a multitude of sources.

Apache Kafka as the Backbone for Cybersecurity

  • Kai Waehner, Confluent

This talk explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases and architectures including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments.

Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink in Apache Flink

  • Fabian Paul, Ververica

In this talk, we will start with a quick recap of Apache Kafka’s transactions and Flink’s checkpointing mechanism. Then, we describe the two-phase commit protocol implemented in KafkaSink in-depth and emphasize the difficulties we have overcome when applying Kafka’s transaction API.

Auditing your data and answering the life long question, is it the end of the day yet?

  • Simona Meriam, Aidoc

In this talk I’m going to present to you the design process behind our Data Auditing system, Life Line: from tracking and producing, to analysing and storing auditing information, using technologies such as Kafka, Avro, Spark, Lambda functions and complex SQL queries.

Bringing Kafka Without Zookeeper Into Production

  • Colin McCabe, Confluent

This talk will discuss our efforts to get KRaft mode production-ready. We will talk about the old and new architectures, and how we adapted features to work in both worlds. We will also talk about our experiences with testing and deploying the new software.

CI/CD with an Idempotent Kafka Producer & Consumer

  • Eden Ohana, Treeverse

In this session, you will learn about the idempotent Kafka Producer & Consumer architecture and how to automate the CI/CD process with open-source tools.

Developer’s guide to contributing code to Kafka

  • Mickael Maison, IBM
  • Tom Bentley, Red Hat

In this talk, we will cover in detail the process to contribute code to Apache Kafka, from setting up a development environment, to building the code, running tests and opening a PR. We will also look at the KIP process and describe what each section of the document is for.

Developing Kafka Streams Applications with Upgradability in Mind

  • Neil Buesing, Rill Data

Does your organization struggle with updating its Kafka Streams applications? Releasing a new version of a Kafka Streams application can be challenging, especially if its state has to be preserved between releases. Consider these best practices and architectural ideas to make this process smoother.

Disaster Recovery Options Running Apache Kafka in Kubernetes

  • Rema Subramanian, Confluent
  • Jennifer Snipes, Confluent

In this talk, we will cover how Active-Active/Active-Passive modes for disaster recovery have worked in the past and how the architecture evolves with deploying Apache Kafka on Kubernetes. We'll also look at how stretch clusters sitting on this architecture give a disaster recovery solution!

Distributed Tracing for Kafka with OpenTelemetry

  • Daniel Kim, New Relic

This talk will walk through how to use OpenTelemetry to tell the full story of a request as it travels through your Kafka producer, queue, and consumer. First, we will learn how context propagation works in OpenTelemetry with W3C and B3 protocols.

Enabling product personalisation using Apache Kafka, Apache Pinot and Trino

  • Stuart Coleman, 10x Banking

In this session, we describe how we overcome this problem to enable dynamic charging and rewards based on customer behaviour in a banking scenario.

Event-driven Microservices with Python and Apache Kafka

  • Dave Klein, Confluent

In this talk, we’ll work through such a transition, using Apache Kafka and Python. We’ll learn how to introduce Kafka into an architecture and then gradually use it to make our application more efficient, less coupled, and much easier to evolve.

Evergreen: Building Airbnb’s Merge Queue With Kafka Streams

  • Janusz Kudelka, Airbnb
  • Joel Snyder, Airbnb

In our talk we will explore Evergreen's architecture and share our learnings from utilizing Kafka Streams in a mission-critical system.

Geo-replicated Kafka Streams Apps

  • Ryanne Dolan, Twitter

This talk presents several strategies for dealing with geo-replicated Kafka topics in Kafka Streams applications. You'll see that it's easy to get started, but there are trade-offs to consider with each approach.

Getting up to speed with Kafka Connect: from the basics to the latest features

  • Kate Stanley, IBM
  • Mickael Maison, IBM

In this talk we will introduce the Connect components, from connectors to transformations to the runtime itself. We will also share some of the new capabilities and best practices that you should be aware of to help you run and manage connectors effectively.

Implementing a Data Mesh with Apache Kafka

  • Adam Bellemare, Confluent

In this talk, Adam covers implementing a self-service data mesh with event streams in Apache Kafka®. Event streams as a data product are an essential part of a real-world data mesh, as they enable both operational and analytical workloads from a common source of truth.

Improving fault tolerance and scaling out in Kafka Streams

  • Bill Bejeck, Confluent

This presentation will cover how standby tasks work and how they're enabled. Additionally, I'll cover the work done in KIP-441 that enables faster scaling out for stateful tasks and provides more balanced stateful assignments.

Interactive Query in Kafka Streams: The Next Generation

  • Vasiliki Papavasileiou, Confluent
  • John Roesler, Confluent

In this presentation, we unveil the next generation of Interactive Query (IQv2) that addresses all these shortcomings. We demonstrate the key benefits of the new query API.

JDBC Source Connector: What could go wrong?

  • Francesco Tisiot, Aiven

In this session we'll understand how the JDBC source connector works and explore the various modes it can operate in to load data in a bulk or incremental manner. Having covered the basics, we'll analyse the edge cases that cause things to go wrong, such as infrequent snapshot times and out-of-order events.

Kafka as a Platform: the Ecosystem from the Ground Up

  • Robin Moffatt, Confluent

In this talk, we’ll look at the entire streaming platform provided by Apache Kafka and the Confluent community components. Starting with a lonely key-value pair, we’ll build up topics, partitioning, replication, and low-level Producer and Consumer APIs.

Kafka based Global Data Mesh at Wix

  • Natan Silinitsky, Wix.com

This talk is about Wix's Kafka-based global data architecture and platform, and how we made it very easy for Wix's 2000 microservices to publish and subscribe to data, no matter where they are deployed in the world or what technology stack they use.

Keep Your Cache Always Fresh with Debezium!

  • Gunnar Morling, Red Hat

Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores through change data capture.

Know Your Topics – A Deep Dive on Topic IDs with KIP-516

  • Justine Olshan, Confluent

We'll be covering new features in Kafka versions 2.8, 3.0, and 3.1 and how to upgrade to using topic IDs. We'll see how topic IDs are used in KRaft mode and tiered storage, and take a tour through some of the internals and the thought processes around these changes.

Let’s Make Your CFO Happy; A Practical Guide for Kafka Cost Reduction

  • Elad Leev, AppsFlyer

In this talk, we will understand what we are paying for when running a self-hosted Kafka deployment, where we can cut costs, how to develop an economic mindset, and what we can proactively do to reduce our cloud infrastructure cost.

Loosely or lousily coupled? Understanding communication patterns in modern architectures

  • Thomas Heinrichs, Camunda

This talk will help you answer important questions for your project. You will better understand not only the architectural implications but also the effect on the productivity of your teams.

Monitoring Kafka without instrumentation using eBPF

  • Antón Rodríguez, New Relic

We’ll see eBPF in action applied to the Kafka world: identify Kafka consumers, producers, and brokers, see how they interact with each other and how many resources they consume. We'll even learn how to measure consumer lag without external components.

PICKUP DATA - A Kafka Adventure Game with Kris Jenkins

  • Kris Jenkins, Confluent

In this talk we'll revisit the quintessential video game, the Text-Based Adventure, and implement as much of it as we can in pure Kafka. We may not break the Steam sales records, but along the way we'll learn a lot about the building blocks of event systems and some interesting Kafka Streams tricks.

Practical Pipelines: A Houseplant Soil Alerting System with ksqlDB

  • Danica Fine, Confluent

In this session, I’ll talk about how I ingest the data, followed by a look at the tools, including ksqlDB and Kafka Connect, that will help transform the raw data into useful information.

Reigning in Protobuf

  • David Navalho, Marionete
  • Graham Stirling, Saxo Bank

During this talk we will tackle how we have used Protobuf successfully with Kafka: from clients to connectors; streams to schema registry; and gitops to governance. We will go over our learnings, including how we have improved the developer experience.

Scaling your Kafka streaming pipeline can be a pain - but it doesn’t have to be!!

  • Opher Dubrovsky, Nielsen
  • Ido Nadler, Nielsen

We’ll examine one of our multi-petabyte scale Kafka pipelines, and go over some of the pitfalls we’ve encountered. We’ll offer solutions that alleviate those problems, and go over comparisons between the before and after. We’ll then explain why some common-sense solutions do not work well.

Schema Registry 101

  • Bill Bejeck, Confluent

The discussion will cover working with Schema Registry from the command line, how to leverage it with Kafka clients, and the supported serialization formats. Some established build tools that make life easier for the Kafka developer will also be covered.

Securing Kafka Connect Pipelines with Client-Side Field-Level Cryptography

  • Hans-Peter Grahsl, NETCONOMY

During this demo-driven talk, you will learn how to benefit from a configurable single message transformation that lets you perform encryption and decryption operations in Kafka Connect worker nodes without any custom code.

Stateful Microservices with Apache Kafka and Spring Cloud Stream

  • Jan Svoboda, Confluent

This session is aimed at developers who are interested in learning event streaming practices. Demo application code will be available to participants.

Streaming Updates through Complex Operations in Kafka Streams at Scale

  • Victor Künstler, bakdata GmbH

This talk explores how we efficiently handle these stream updates and deletions in consecutive joins with Kafka Streams. Furthermore, we present an optimization for the aggregate operation in Kafka Streams, leveraging state stores to handle updates in complex aggregates.

Testing Kafka containers with Testcontainers: There and back again

  • Viktor Gamov, Kong

In this session, Viktor talks about Testcontainers, a library (initially created for the JVM, it now exists in many languages) that provides lightweight, disposable instances of shared databases, clusters, and anything else that can run in a Docker container!

The Age of the Clusters: Offering Kafka as a Service in Your Organisation

  • Sion Smith, OSO

This talk will dive into the journey you must take in order to reach your ultimate goal: making Kafka the commodity all your development teams run on.

The Details That Matter: Kafka in Production, at Scale

  • Or Arnon, ironSource
  • Elad Eldor, ironSource

We’ll tell the story of skews and anomalies in CPU and disk metrics - drawing graphs and conclusions. Understand how compacted topics, partitions distribution, and RAM can affect your cluster’s performance. Finally, look at how a small configuration drift can rattle your cluster.

Trials, Tribulations, and Triumphs: Migrating from Self-Managed Kafka to Managed Kafka in the Cloud

  • Adrian Sibilla, Compare the Market
  • Dewi Rees, Compare the Market

Attend this session to learn how we went about the migration, the issues we faced, and how this will power our next-generation data platform. We will discuss how we overcame the following challenges and more.

Using Modular Topologies in Kafka Streams to scale ksqlDB’s persistent queries

  • A. Sophie Blee-Goldman, Confluent
  • Walker Carlson, Confluent

Kafka Streams developers will take away from this talk an understanding of how to utilize ModularTopologies, and dynamically upgrade their Kafka Streams workload effectively.

Walking through the Spring Stack for Apache Kafka

  • Soby Chacko, VMware

This talk will explore all these various building blocks in Spring and show the differences between them. Along the journey, we will demonstrate how Spring makes it easier for developers to build powerful applications using Apache Kafka and Kafka Streams.

Lightning Talks

3 Kafka patterns to deliver Streaming Machine Learning models

  • Andrea Spina, Radicalbit srl

The presentation highlights the main technical challenges Radicalbit faced while building a real-time serving engine for streaming Machine Learning algorithms. The talk describes how Kafka has been used to fasten two ML technologies together.

Apache Kafka in the era of Java Microframeworks

  • Marcin Mergo, Consdata

In this lightning talk we'll compare approaches offered by each of the aforementioned frameworks, and see how they stack up against Spring Boot in common use cases like consumers, producers and streams.

Calculating the End-to-End Latency of Realtime Ingestion

  • Minakshi Korad, Twilio

This talk will dive into the details of calculating the end-to-end latency for our real-time ingestion pipeline.

Custom Metrics for Kafka Connectors

  • Sarah Story, Zapier

In this talk, we'll discuss techniques for augmenting Kafka Connect's built-in JMX metrics with your own custom metrics.

Did you know that there is a free REST administration server for Kafka?

  • Emma Humber, Confluent

This lightning talk introduces you to configuring and running the REST admin server, constructing administration commands, as well as best practices around producing and consuming messages using the REST API.

Expose your event-driven data to the outside world using webhooks powered by Kafka

  • Saud Alhelali, Arab National Bank

In simple terms, a webhook is an API request that sends data to a receiver in a unidirectional manner, without expecting any response. It is typically used to notify a system when one or more events have taken place.

Handling eventual consistency in a transactional world

  • Matteo Cimini, Quantyca
  • Andrea Gioia, Quantyca

In this talk we’ll see what eventual consistency is and where strong consistency is lost while moving data from a database to Kafka, and describe different solutions to preserve consistency working at the source level.

Increasing Kafka Connect Throughput

  • Catalin Pop, Confluent

In this presentation, we will use JDBC source and sink connectors as examples of how to tune source/sink connectors.

Kafka High Availability in multi data center setup with floating Observers

  • Dalibor Blazevic
  • Phong Pham, Sysco AS

In this presentation we will see how to use the Kafka Observer feature to address this challenge, with an additional tweak to distribute load evenly among Observers and ordinary Brokers and make them float between data centers.

Keeping configs in Kafka compact topic

  • Eli Shvartsman, Forto

Configuring the business logic of Kafka-based applications can be tricky. I've seen two solutions based on the elegant idea of putting configs into a compact topic. However, the devil is in the details, and I'd like to share some nuances that we learned while operationalizing this approach.
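As a rough illustration of why a compacted topic suits configuration (a sketch only, not the speaker's implementation): log compaction retains at least the latest value for each key, and a record with a null value acts as a tombstone that deletes the key. A consumer that replays the topic from the beginning can therefore rebuild the current configuration as a simple key-value map. The record keys and values below are hypothetical, and the topic is simulated with an in-memory list rather than a real Kafka client.

```python
from typing import Dict, Iterable, Optional, Tuple

# A record is (key, value); a value of None stands in for a Kafka tombstone.
Record = Tuple[str, Optional[str]]


def materialize_config(records: Iterable[Record]) -> Dict[str, str]:
    """Replay records in log order and keep only the latest value per key."""
    config: Dict[str, str] = {}
    for key, value in records:
        if value is None:
            # Tombstone: the key has been deleted from the configuration.
            config.pop(key, None)
        else:
            # A later record for the same key overrides the earlier one.
            config[key] = value
    return config


# Hypothetical config records, in the order they were written to the topic.
log = [
    ("retry.limit", "3"),
    ("feature.x", "on"),
    ("retry.limit", "5"),
    ("feature.x", None),
]

current = materialize_config(log)
print(current)
```

Replaying the full log and replaying a compacted version of it should yield the same map, which is what makes the approach attractive for configuration.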

Messaging for build observability

  • Jack Grahl, Deutsche Bank

I will talk about how we can make observability of the pipeline a reality, and how central message brokers fit into that design.

Monitoring User Session Statistics with Apache Kafka and ksqlDB

  • Cemalettin Kaya, Trendyol
  • Anıl Doğan, Trendyol

In this session, we explain how we at Trendyol Tech track user session information using Apache Kafka, ksqlDB and Debezium.

Monitoring Your Business Metrics With Kafka + Grafana

  • Eduardo Boccato, ABInBev (BEES)

In this Lightning Talk, we will discuss how Kafka can help you to gather data from different places and persist it to a database to be monitored in a Grafana dashboard.

Reliable Event Delivery in Apache Kafka Based on Retry Policy and Dead Letter Topics

  • Jacek Grobelny, Consdata

In this short presentation, I will talk about a pattern based on three topics: operational, retry, and DLQ, and how it can be handled programmatically.
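One way such a three-topic pattern might be handled programmatically is sketched below. This is an assumption-laden illustration, not the approach presented in the talk: the topic names, the `MAX_ATTEMPTS` limit, and the `Event` type are all hypothetical, and the routing decision is shown as a pure function without any Kafka client.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical topic names and retry limit, chosen only for illustration.
OPERATIONAL_TOPIC = "orders"
RETRY_TOPIC = "orders.retry"
DLQ_TOPIC = "orders.dlq"
MAX_ATTEMPTS = 3


@dataclass(frozen=True)
class Event:
    payload: str
    attempts: int = 0  # number of failed processing attempts so far


def route_after_failure(event: Event) -> Tuple[str, Event]:
    """Decide where a failed event goes next: the retry topic or the DLQ."""
    updated = Event(event.payload, event.attempts + 1)
    if updated.attempts >= MAX_ATTEMPTS:
        # Retry budget exhausted: park the event for manual inspection.
        return DLQ_TOPIC, updated
    # Otherwise schedule another attempt via the retry topic.
    return RETRY_TOPIC, updated


first_topic, first_event = route_after_failure(Event("order-42"))
last_topic, last_event = route_after_failure(Event("order-42", attempts=2))
print(first_topic, first_event.attempts)
print(last_topic, last_event.attempts)
```

In a real consumer, the attempt count would typically travel with the message (for example in a header) so that each failure can be routed without external state.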

Streampunk - The Difference Engine for Unlocking the Kafka Black Box

  • Ralph M. Debusmann, Forecasty.ai

This lightning talk starts with a demo of how you can conveniently fulfill basic tasks such as listing and consuming topics using Streampunk in the Python interpreter. After that, I'll lead you through real-life examples of increasingly difficult challenges.

Tips for Apache Flink on Kafka

  • Olena Babenko, Aiven

In 10 minutes you’ll learn all the basics of Flink over Kafka: starting by defining the types of connectors, we’ll explore how to work with various data formats, using pre-defined schemas when appropriate, and storing the pipeline output as a standard or compacted topic when needed.