View sessions and slides from Kafka Summit London 2022.
Jay's keynote discussed how to harness real-time data to power customer experiences and internal business needs. Also featured was Avi Perez, who shared how Wix.com is using event streaming to power a full 7% of the internet's websites.
In this talk we’ll discuss our solution to this problem: an in-house proxy layer and multi-cluster topology which we’ve built and operated over the past 3 years. Our proxy layer enables multiple Kafka clusters to work in coordination across the globe.
In this talk, I’ll walk through the top 10 lessons learned re-imagining Kafka for the Cloud, from storage to data balancing, to scaling and security.
The journey from single-cluster deployments to multi-cluster deployments can be daunting, as you need to deal with networking configurations, security models and operational challenges. Geo-replication support for Kafka has come a long way, with both open-source and commercial solutions.
In this session, we present different aspects of the platform. We highlight the benefits of our approach - converting the complex FHIR schemas to Protobuf - compared to working directly with data in the FHIR format. We further showcase how we use Kafka Streams to integrate a multitude of sources.
This talk explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases and architectures including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments.
In this talk, we will start with a quick recap of Apache Kafka’s transactions and Flink’s checkpointing mechanism. Then, we describe the two-phase commit protocol implemented in KafkaSink in-depth and emphasize the difficulties we have overcome when applying Kafka’s transaction API.
In this talk I’m going to present the design process behind our Data Auditing system, Life Line: from tracking and producing to analysing and storing auditing information, using technologies such as Kafka, Avro, Spark, Lambda functions and complex SQL queries.
This talk will discuss our efforts to get KRaft mode production-ready. We will talk about the old and new architectures, and how we adapted features to work in both worlds. We will also talk about our experiences with testing and deploying the new software.
In this session, you will learn about the idempotent Kafka Producer & Consumer architecture and how to automate the CI/CD process with open-source tools.
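The idempotent-producer idea behind this session can be sketched in a few lines. This is a simplified, hypothetical model (the `IdempotentPartition` class is ours, not broker code): Kafka's real implementation tracks a (producer id, sequence number) pair per partition so that a retried send is detected and dropped rather than written twice.

```python
# Toy model of broker-side idempotence: a retry carrying an already-seen
# sequence number is deduplicated instead of appended a second time.
class IdempotentPartition:
    def __init__(self):
        self.log = []          # records actually appended to the partition
        self.last_seq = {}     # producer_id -> highest sequence accepted

    def append(self, producer_id, seq, record):
        """Accept a record only if its sequence advances past the last one seen."""
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return False       # duplicate retry: silently dropped
        self.last_seq[producer_id] = seq
        self.log.append(record)
        return True

p = IdempotentPartition()
p.append("producer-1", 0, "a")
p.append("producer-1", 1, "b")
p.append("producer-1", 1, "b")   # network retry of seq 1 is deduplicated
assert p.log == ["a", "b"]
```

In a real client none of this is hand-written: setting `enable.idempotence=true` on the producer is enough; the model above only shows why that setting prevents duplicates.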
In this talk, we will cover in detail the process of contributing code to Apache Kafka, from setting up a development environment, to building the code, running tests and opening a PR. We will also look at the KIP process and describe what each section of the document is for.
Does your organization struggle with updating its Kafka Streams applications? Releasing a new version of a Kafka Streams application can be challenging, especially if its state has to be preserved between releases. Consider these best practices and architectural ideas to make this process smoother.
In this talk, we will cover how Active-Active/Active-Passive modes for disaster recovery have worked in the past and how the architecture evolves with deploying Apache Kafka on Kubernetes. We'll also look at how stretch clusters sitting on this architecture give a disaster recovery solution!
This talk will walk through how to use OpenTelemetry to tell the full story of a request as it travels through your Kafka producer, queue, and consumer. First, we will learn how context propagation works in OpenTelemetry with W3C and B3 protocols.
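The context propagation the talk mentions works by carrying a trace identifier inside the Kafka message headers. As a minimal sketch of the W3C format only (in practice the OpenTelemetry SDK's propagators do this for you; the `inject`/`extract` helpers here are hypothetical), a `traceparent` header has the shape `00-<trace-id>-<span-id>-<flags>`:

```python
# Minimal sketch of W3C trace-context propagation via message headers.
import re
import secrets

def inject(headers: dict) -> dict:
    """Attach a new traceparent header (version 00, sampled flag set)."""
    trace_id = secrets.token_hex(16)   # 32 lowercase hex chars
    span_id = secrets.token_hex(8)     # 16 lowercase hex chars
    headers["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return headers

def extract(headers: dict):
    """Parse a traceparent header back into (trace_id, span_id), or None."""
    value = headers.get("traceparent", "")
    m = re.fullmatch(r"00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}", value)
    return (m.group(1), m.group(2)) if m else None

msg_headers = inject({})               # producer side: stamp the outgoing message
ctx = extract(msg_headers)             # consumer side: recover the trace context
assert ctx is not None and len(ctx[0]) == 32
```

The consumer uses the recovered trace id to start its span as a child of the producer's, which is what stitches the producer, the topic, and the consumer into one end-to-end trace.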
In this session, we describe how we overcome this problem to enable dynamic charging and rewards based on customer behaviour in a banking scenario.
In this talk, we’ll work through such a transition, using Apache Kafka and Python. We’ll learn how to introduce Kafka into an architecture and then gradually use it to make our application more efficient, less coupled, and much easier to evolve.
In our talk we will explore Evergreen's architecture and share our learnings from utilizing Kafka Streams in a mission critical system.
This talk presents several strategies for dealing with geo-replicated Kafka topics in Kafka Streams applications. You'll see that it's easy to get started, but there are trade-offs to consider with each approach.
In this talk we will introduce the Connect components, from connectors to transformations to the runtime itself. We will also share some of the new capabilities and best practices that you should be aware of to help you run and manage connectors effectively.
In this talk, Adam covers implementing a self-service data mesh with event streams in Apache Kafka®. Event streams as a data product are an essential part of a real-world data mesh, as they enable both operational and analytical workloads from a common source of truth.
This presentation will cover how standby tasks work and how they're enabled. Additionally, I'll cover the work done in KIP-441 that enables faster scaling out for stateful tasks and provides more balanced stateful assignments.
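Enabling the standby tasks the presentation covers comes down to a single Kafka Streams setting; the value below is illustrative, not a recommendation:

```properties
# Keep one warm replica of each stateful task's state store on another instance,
# so failover does not have to rebuild the full state from the changelog topic.
num.standby.replicas=1
```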
In this presentation, we unveil the next generation of Interactive Query (IQv2) that addresses all these shortcomings. We demonstrate the key benefits of the new query API.
In this session we'll understand how the JDBC source connector works and explore the various modes in which it can operate to load data in a bulk or incremental manner. Having covered the basics, we'll analyse the edge cases that cause things to go wrong, like infrequent snapshot times and out-of-order events.
In this talk, we’ll look at the entire streaming platform provided by Apache Kafka and the Confluent community components. Starting with a lonely key-value pair, we’ll build up topics, partitioning, replication, and low-level Producer and Consumer APIs.
This talk is about Wix's Kafka-based global data architecture and platform, and how we made it very easy for Wix's 2,000 microservices to publish and subscribe to data, no matter where they are deployed in the world or what technology stack they use.
Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores using change data capture.
We'll be covering new features in Kafka versions 2.8, 3.0, and 3.1 and how to upgrade to using topic IDs. We'll see how topic IDs are used in KRaft mode and tiered storage, and take a tour through some of the internals and the thought processes around these changes.
In this talk, we will understand what we are paying for when running a self-hosted Kafka deployment, where we can cut costs, how to develop an economic mindset, and what we can proactively do to reduce our cloud infrastructure cost.
This talk will help you answer important questions for your project. You will better understand not only the architectural implications but also the effect on the productivity of your teams.
We’ll see eBPF in action applied to the Kafka world: identify Kafka consumers, producers, and brokers, see how they interact with each other and how many resources they consume. We'll even learn how to measure consumer lag without external components.
In this talk we'll revisit the quintessential video game, the Text-Based Adventure, and implement as much of it as we can in pure Kafka. We may not break the Steam sales records, but along the way we'll learn a lot about the building blocks of event systems, and some interesting Kafka Streams tricks.
In this session, I’ll talk about how I ingest the data, followed by a look at the tools, including ksqlDB and Kafka Connect, that will help transform the raw data into useful information.
During this talk we will tackle how we have used Protobuf successfully with Kafka: from clients to connectors; streams to schema registry; and gitops to governance. We will go over our learnings, including how we have improved the developer experience.
We’ll examine one of our multi-petabyte scale Kafka pipelines, and go over some of the pitfalls we’ve encountered. We’ll offer solutions that alleviate those problems, and go over comparisons between the before and after. We’ll then explain why some common sense solutions do not work well.
The discussion will cover working with Schema Registry from the command line, how to leverage it with Kafka clients, and the supported serialization formats. Some established build tools that make life easier for the Kafka developer will also be covered.
During this demo-driven talk, you will learn how to benefit from a configurable single message transformation that lets you perform encryption and decryption operations in Kafka Connect worker nodes without any custom code.
This session is targeted for developers who are interested in learning event streaming practices. Demo application code will be available to participants.
This talk explores how we efficiently handle these stream updates and deletions in consecutive joins with Kafka Streams. Furthermore, we present an optimization for the aggregate operation in Kafka Streams, leveraging state stores to handle updates in complex aggregates.
In this session, Viktor talks about Testcontainers, a library (initially created for the JVM, now available in many languages) that provides lightweight, disposable instances of shared databases, clusters, and anything else that can run in a Docker container!
This talk will dive into the journey you must take in order to reach your ultimate goal: making Kafka the commodity all your development teams run on.
We’ll tell the story of skews and anomalies in CPU and disk metrics - drawing graphs and conclusions. Understand how compacted topics, partitions distribution, and RAM can affect your cluster’s performance. Finally, look at how a small configuration drift can rattle your cluster.
Attend this session to learn how we went about the migration, the issues we faced, and how this will power our next-generation data platform. We will discuss how we overcame the following challenges and more.
Kafka Streams developers will take away from this talk an understanding of how to utilize ModularTopologies and dynamically upgrade their Kafka Streams workloads effectively.
This talk will explore all these various building blocks in Spring and show the differences between them. Along the journey, we will demonstrate how Spring makes it easier for developers to build powerful applications using Apache Kafka and Kafka Streams.
The presentation highlights the main technical challenges Radicalbit faced while building a real-time serving engine for streaming Machine Learning algorithms. The talk describes how Kafka has been used to tie two ML technologies together.
In this lightning talk we'll compare approaches offered by each of the aforementioned frameworks, and see how they stack up against Spring Boot in common use cases like consumers, producers and streams.
This talk will dive into the details of calculating the end-to-end latency for our real-time ingestion pipeline.
In this talk, we'll discuss techniques for augmenting Kafka Connect's built-in JMX metrics with your own custom metrics.
This lightning talk introduces you to configuring and running the REST admin server, constructing administration commands, as well as best practices around producing and consuming messages using the REST API.
In simple terms, a webhook is an API request that sends data to a receiver in a unidirectional manner, without expecting any response. It is typically used to notify a system when one or more events have taken place.
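That definition can be made concrete in a few lines of Python using only the standard library: a receiver that accepts the POST, and a sender that fires a notification and ignores the body. The endpoint path and payload are made up for illustration.

```python
# A minimal webhook: the sender POSTs JSON to the receiver and expects no body back.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []

class WebhookReceiver(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)        # 204 No Content: nothing for the sender to read
        self.end_headers()
    def log_message(self, *args):      # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), WebhookReceiver)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Fire-and-forget notification that an event has taken place.
event = {"event": "order.created", "id": 42}
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/hook",
    data=json.dumps(event).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
server.shutdown()
assert received == [event]
```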
In this talk we’ll see what eventual consistency is and where strong consistency is lost while moving data from a database to Kafka, and describe different solutions that preserve consistency by working at the source level.
In this presentation, we will use JDBC source and sink connectors as examples of how to tune source/sink connectors.
In this presentation we will see how to use Kafka's Observer feature to address this challenge, with an additional tweak to distribute load evenly among Observers and ordinary brokers and let them float between data centers.
Configuring the business logic of Kafka-based applications can be tricky.
I've seen two solutions based on the elegant idea of putting configs into a compacted topic. However, the devil is in the details, and I'd like to share some nuances we learned while operationalizing this approach.
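The configs-in-a-compacted-topic idea boils down to last-value-wins per key, with a null value (a tombstone) deleting the key. A toy materialization of such a topic into an in-memory map (function and key names are ours, for illustration) looks like:

```python
# Toy materialization of a compacted config topic: for each key, only the
# latest value survives, and a tombstone (None value) deletes the key.
def materialize(records):
    config = {}
    for key, value in records:       # records consumed in offset order
        if value is None:
            config.pop(key, None)    # tombstone: remove the config entry
        else:
            config[key] = value      # last write wins
    return config

topic = [
    ("batch.size", "100"),
    ("feature.x", "on"),
    ("batch.size", "500"),           # overrides the earlier value
    ("feature.x", None),             # tombstone deletes the flag
]
assert materialize(topic) == {"batch.size": "500"}
```

The nuances the talk promises live outside this happy path: consumers that start mid-stream, ordering during rebalances, and when the broker actually compacts.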
I will talk about how we can make observability of the pipeline a reality, and how central message brokers fit into that design.
In this session, we at Trendyol Tech explain how we track user session information using Apache Kafka, ksqlDB and Debezium.
In this Lightning Talk, we will discuss how Kafka can help you to gather data from different places and persist them to a database to be monitored in a Grafana dashboard.
In this short presentation, I will talk about a pattern based on three topics: operational, retry, and DLQ, and how it can be handled programmatically.
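The three-topic pattern can be sketched as follows. This is an in-memory model under assumed names (`MAX_ATTEMPTS`, `handle`, topics modelled as plain lists), not the speaker's implementation: a failed record is re-published to the retry topic with an attempt counter, and parks in the DLQ once retries are exhausted.

```python
# Sketch of the operational / retry / DLQ pattern with topics as plain lists.
MAX_ATTEMPTS = 3
retry_topic, dlq_topic, processed = [], [], []

def process(record):
    if record == "bad":
        raise ValueError("unprocessable record")

def handle(record, attempts=0):
    try:
        process(record)
        processed.append(record)
    except Exception:
        if attempts + 1 >= MAX_ATTEMPTS:
            dlq_topic.append(record)           # give up: park for inspection
        else:
            retry_topic.append((record, attempts + 1))

handle("ok")                                   # succeeds on the operational topic
handle("bad")                                  # fails, goes to the retry topic
while retry_topic:                             # drain the retry topic
    record, attempts = retry_topic.pop(0)
    handle(record, attempts)

assert processed == ["ok"]
assert dlq_topic == ["bad"]
```

Keeping the attempt count on the retried record is what lets the same handler serve both the operational and the retry topic.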
This lightning talk starts with a demo of how you can conveniently fulfill basic tasks such as listing and consuming topics using Streampunk in the Python interpreter. After that, I'll lead you through real-life examples of increasingly difficult challenges.
In 10 minutes you’ll learn all the basics of Flink over Kafka: starting by defining the types of connectors, we’ll explore how to work with various data formats, using pre-defined schemas when appropriate, and storing the pipeline output as a standard or compacted topic when needed.