View sessions and slides from Kafka Summit London 2019
The rise of Apache Kafka as the de facto standard for event streaming has coincided with the rise of Kubernetes for cloud-native applications. While Kubernetes is a great choice for any distributed system, that doesn’t mean it is easy to deploy and maintain a Kafka cluster running on it. At IBM we have hands-on experience with running Kafka in Kubernetes and in this session I will share our top tips for a smooth ride. I will show an example deployment of Kafka on Kubernetes and step through the system to explain the common pitfalls and how to avoid them. This will include the Kubernetes objects to use, resource considerations and connecting applications to the cluster. Finally, I will discuss useful Kafka metrics to include in Kubernetes liveness and readiness probes.
Kafka is often just a piece of the stack that lives in production and that no one wants to touch – because it just works. At AppsFlyer, a mobile attribution and analytics platform that generates a constant “storm” of 70B+ events (HTTP requests) daily, Kafka sits at the core of our infrastructure. Recently I inherited the daunting task of managing our Kafka operation and discovered a lot of technical debt we needed to recover from if we wanted to be able to sustain our next phase of growth. This talk will dive into how to safely migrate from outdated versions, how to gain the trust of developers to migrate their production services, how to manage and monitor the right metrics and build resiliency into the architecture, and how to plan for continued improvements through paradigms such as sleep-driven design, and much more.
Running Apache Kafka sometimes presents interesting challenges, especially when operating at scale. In this talk we share some of our experiences operating Apache Kafka as a service across a large company. What happens when you create a lot of partitions and then need to restart brokers? What if you find yourself with a need to reassign almost all partitions in all of your clusters? How do you track progress on large-scale reassignments? How do you make sure that moving data between nodes in a cluster does not impact producers and consumers connected to the cluster? We invite you to dive into a few of the issues we have encountered and share debugging and mitigation strategies.
Apache Kafka® is a scalable streaming platform with built-in dynamic client scaling. The elastic scale-in/scale-out feature leverages Kafka’s “rebalance protocol” that was designed in the 0.9 release and has been improved ever since. The original design aimed at on-prem deployments of stateless clients. However, it does not always align with modern deployment tools like Kubernetes or with stateful stream processing clients like Kafka Streams. These shortcomings led to two major recent improvement proposals, namely static group membership and incremental rebalancing (which will hopefully be available in version 2.3). This talk provides a deep dive into the details of the rebalance protocol, starting from its original design in version 0.9 up to the latest improvements and future work. We discuss internal technical details, the pros and cons of the existing approaches, and explain how to configure your client correctly for your use case. Additionally, we discuss configuration trade-offs for stateless, stateful, on-prem, and containerized deployments.
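Static group membership (KIP-345) boils down to a small amount of client configuration. A minimal sketch, assuming a broker and client at version 2.3 or later; the bootstrap address, group name, and instance-id naming scheme here are placeholders:

```python
# Sketch: consumer configuration enabling static group membership
# (KIP-345). Property names follow the standard Kafka consumer
# configuration; values are illustrative placeholders.

def static_member_config(instance_id: str) -> dict:
    """Build a consumer config whose group membership survives short
    restarts (e.g. a Kubernetes pod bounce) without a full rebalance."""
    return {
        "bootstrap.servers": "localhost:9092",  # placeholder address
        "group.id": "orders-processor",         # hypothetical group name
        # The stable id is what makes membership "static": reuse the
        # same value across restarts (e.g. derive it from a pod ordinal).
        "group.instance.id": instance_id,
        # Must exceed the expected restart time, or the broker evicts
        # the member and triggers the rebalance you tried to avoid.
        "session.timeout.ms": 60000,
    }

cfg = static_member_config("orders-processor-pod-0")
print(cfg["group.instance.id"])  # → orders-processor-pod-0
```

The key trade-off, as the talk's framing suggests, is that a generous `session.timeout.ms` delays detection of genuinely dead members.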
Kafka is notoriously tricky for multi-DC use cases. The log abstraction and client failover break down when you cannot at least guarantee offset consistency. In this talk, we describe the current state of Kafka for multi-DC usage, examine how different approaches provide different guarantees, and look at the remaining gaps and how the community is addressing them.
Apache Kafka is now nearly ubiquitous in modern data pipelines and use cases. While the Kafka development model is elegantly simple, operating Kafka clusters in production environments is a challenge. It’s hard to troubleshoot misbehaving Kafka clusters, especially when there are potentially hundreds or thousands of topics, producers and consumers and billions of messages.
The root cause of a lagging real-time application may be an application problem – like poor data partitioning or load imbalance – or a Kafka problem – like resource exhaustion or suboptimal configuration. This makes getting the best performance, predictability, and reliability for Kafka-based applications difficult. In the end, the operation of your Kafka-powered analytics pipelines could itself benefit from machine learning (ML).
We recently learned about “Fault Tree Analysis” and decided to apply the technique to bulletproof our Apache Kafka deployments. In this talk, learn about fault tree analysis and what you should focus on to make your Apache Kafka clusters resilient.
This talk provides a framework for answering the following common questions a Kafka operator or user might have:
What guarantees can I promise my users?
What should my replication factor be?
What should the ISR setting be?
Should I use RAID or not?
Should I use external storage such as EBS or local disks?
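As one illustration of how the replication-factor and ISR questions interact (an example baseline, not the speaker's recommendation): with a replication factor of 3, `min.insync.replicas=2`, and producers using `acks=all`, no acknowledged write is lost while at most one broker is down.

```python
# Sketch of a common durability baseline. The helper and its defaults
# are illustrative; the property names are standard Kafka topic and
# producer configuration keys.

def durability_settings(replication_factor: int = 3) -> dict:
    """Topic + producer settings for 'no acknowledged write is lost
    while at most one broker is down'."""
    # Leave headroom for exactly one failed replica: the partition
    # stays writable with one broker down, but a write is never
    # acknowledged by a lone surviving replica.
    min_isr = replication_factor - 1
    return {
        "topic": {
            "replication.factor": replication_factor,
            "min.insync.replicas": min_isr,
        },
        "producer": {
            "acks": "all",  # wait for the full in-sync replica set
        },
    }

s = durability_settings()
print(s["topic"]["min.insync.replicas"])  # → 2
```

With `acks=1` instead, the same topic settings no longer give that guarantee: the leader can acknowledge and then fail before replication.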
Increasingly, organizations are relying on Kafka for mission critical use-cases where high availability and fast recovery times are essential. In particular, enterprise operators need the ability to quickly migrate applications between clusters in order to maintain business continuity during outages. In many cases, out-of-order or missing records are entirely unacceptable. MirrorMaker is a popular tool for replicating topics between clusters, but it has proven inadequate for these enterprise multi-cluster environments. Here we present MirrorMaker 2.0, an upcoming all-new replication engine designed specifically to provide disaster recovery and high availability for Kafka. We describe various replication topologies and recovery strategies using MirrorMaker 2.0 and associated tooling.
Deploying stateful production applications such as Kafka in Kubernetes is often seen as ill-advised. The arguments are that it’s easy to get wrong, requires learning new skills, is too risky for unclear gains, or that Kubernetes is simply too young a project. This does not have to be true, and we will explain why. When Datadog made the choice to migrate its entire infrastructure to Kubernetes, my team was tasked with deploying reliable, production-ready Kafka clusters.
This talk will go over our deployment strategy, lessons learned, describe the challenges we faced along the way, as well as the reliability benefits we have observed.
This presentation will go through:
– an introduction to the tools and practices established by Datadog
– a brief introduction of Kubernetes and associated concepts
– a deep dive into the deployment and bootstrap strategy of a production-bearing Kafka cluster in Kubernetes
– a walkthrough of some routine operations in a Kubernetes-based Kafka cluster
In the Apache Kafka world, there is such a great diversity of open source tools available (I counted over 50!) that it’s easy to get lost. Over the years I have dealt with Kafka, I have learned to particularly enjoy a few of them that save me a tremendous amount of time over performing manual tasks. I will be sharing my experience and doing live demos of my favorite Kafka tools, so that you too can hopefully increase your productivity and efficiency when managing and administering Kafka. Come learn about the latest and greatest tools for CLI, UI, Replication, Management, Security, Monitoring, and more!
At Babylon, we believe it is possible to put an accessible and affordable health service in the hands of every person on earth. One of the key components enabling this health revolution is our near real-time streaming platform, built on Kafka Streams, which transforms all events generated by Babylon’s Apache Kafka-based micro-services, with each data transformation handled in near real-time. Kafka keys are key to the efficient operation of Kafka: they affect partitioning, ordering and, indirectly, the ability to alter or remove messages. In simpler domains where a simple partition key works, the process is quite straightforward. However, at Babylon we now live in a post-GDPR world and have the requirement that messages must:
– be deletable
– be able to be corrected or amended
– be consumed in the correct order according to the subject of the event
We find that we need a more sophisticated strategy to handle this. At Babylon, we’ve been heavily using the partition strategy functionality of Kafka, along with protobuf and a hierarchical key strategy, to fulfil these requirements, and have gained some unexpected benefits from this approach.
Key takeaways:
– The partition strategy is a powerful tool and should not be shied away from
– The challenges and benefits of such an approach
– Sometimes seemingly crazy ideas work out well
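A hierarchical key strategy of the kind described could look like the following sketch: the key carries several levels of detail, but the custom partition strategy hashes only the subject prefix, so all events about one subject land on one partition and stay ordered. The key format and helper names here are invented for illustration, not Babylon's actual scheme:

```python
import hashlib

# Hypothetical hierarchical key: "subject-id/event-type/version".
# Only the subject component determines the partition, preserving
# per-subject ordering while the rest of the key stays available
# for correction/deletion logic downstream.

def partition_for(key: str, num_partitions: int) -> int:
    """Custom partition strategy: hash only the subject prefix."""
    subject = key.split("/", 1)[0]
    # md5 is used here only as a stable, deterministic hash for the
    # sketch; any consistent hash works.
    digest = hashlib.md5(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Both events concern subject "user-42", so they map to one partition:
p1 = partition_for("user-42/appointment-booked/v1", 12)
p2 = partition_for("user-42/profile-updated/v2", 12)
assert p1 == p2
```

Note the real implementation would live in the producer's configured partitioner, not in application code; the point is that partitioning on a key *prefix* decouples ordering scope from key uniqueness.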
Is your Kafka cluster keeping you up all night? URP driving you crazy? Do you want all the benefits of running Kafka as a shared data hub, but the thought of supporting that many clients on the same infrastructure has you spooked? We felt the same way, but it turned out Kafka can indeed be run at scale in a multi-tenant environment with the right mix of hardware, configuration, and filesystem sorcery. As we all know, a well rested administrator is a happy (less cranky) administrator.
Integrating Apache Kafka with other systems in a reliable and scalable way is often a key part of a streaming platform. Fortunately, Apache Kafka includes the Connect API that enables streaming integration both in and out of Kafka. Like any technology, understanding its architecture and deployment patterns is key to successful use, as is knowing where to go looking when things aren’t working. This talk will discuss the key design concepts within Kafka Connect and the pros and cons of standalone vs distributed deployment modes. We’ll do a live demo of building pipelines with Kafka Connect for streaming data in from databases, and out to targets including Elasticsearch. With some gremlins along the way, we’ll go hands-on in methodically diagnosing and resolving common issues encountered with Kafka Connect. The talk will finish off by discussing more advanced topics including Single Message Transforms, and deployment of Kafka Connect in containers.
We all know Kafka is designed to allow applications to produce and consume data with high throughput and low latency, right? In practice, achieving this goal requires some tuning. Starting with a consideration of design principles and best practices for distributed applications, we’ll explore various practical tips to improve your client application’s performance. We’ll look first at the most important producer and consumer configuration options because these apply even when you have limited control of the brokers’ cluster setup. Secondly, we’ll look at how the brokers’ cluster deployment, topology and configuration could help further. Finally a word on some gotchas of performance analysis, that apply to Kafka too!
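The producer side of the throughput-versus-latency trade-off mentioned above can be made concrete with two illustrative configuration profiles. These values are starting points to benchmark under your own workload, not universal answers; the property names are standard Kafka producer settings:

```python
# Two hedged starting points for producer tuning. Larger batches and
# a linger window trade a little latency for much higher throughput;
# the low-latency profile sends each record as soon as possible.

THROUGHPUT_PRODUCER = {
    "linger.ms": 20,            # wait up to 20 ms to fill bigger batches
    "batch.size": 131072,       # 128 KiB batches (default is 16 KiB)
    "compression.type": "lz4",  # cheap CPU cost, large network savings
    "acks": "all",              # durability; costs some latency
}

LOW_LATENCY_PRODUCER = {
    "linger.ms": 0,             # send immediately, no batching delay
    "batch.size": 16384,        # default batch size
    "compression.type": "none", # skip compression CPU on the hot path
    "acks": "1",                # leader-only ack; weaker durability
}

for name, cfg in (("throughput", THROUGHPUT_PRODUCER),
                  ("low-latency", LOW_LATENCY_PRODUCER)):
    print(name, cfg["linger.ms"], cfg["compression.type"])
```

As the abstract notes, these client-side knobs matter precisely because they work even when you cannot change the brokers' deployment.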
Serverless (also known as function-as-a-service) is fast emerging as an effective architecture for event-driven applications. Apache OpenWhisk is one of the more popular open-source cloud serverless platforms, and has first-class support for Kafka as a source of events. Come to this session for an introduction to building microservices without servers using OpenWhisk. I’ll describe the challenges of building applications using serverless stacks, and the serverless design patterns to help you get started. I’ll give a demonstration of how you can use Kafka Connect to invoke serverless actions, and how serverless can be an effective way to host event-processing logic.
I will demonstrate how the new generation of lightweight and highly-scalable state machines ease the implementation of long running services. Based on my real-life experiences, I will share how to handle complex logic and flows which require proper reactions on failures, timeouts and compensating actions and provide guidance backed by code examples to illustrate alternative approaches.
With CQRS rising and more formal event-sourcing solutions increasing in adoption, event storming has been demonstrated as a powerful technique for event-driven development of microservices in the enterprise. With such an approach, the elusive but powerful promise of the ubiquitous language from DDD finally emerges from EDD (event-driven development). In this talk, the audience will journey through a “hello-streams” github project. This journey embarks with a specific emphasis on the stream-first mindset. Problems with current solutions are introduced, a high-level overview of event storming is presented, then the talk transitions through an opinionated version of event storming via a classic, simple project (e.g. hello-streams). The project is a simple UI built on a stream-based coffee service. The stream-first mindset entails event sourcing, command events as first-class citizens, storing those events in an event store, aggregating those command events into state called domain events, and then further enriching those domain events into business events used to report on and monitor the overall health of the business – all in real time.
Github Open Source project here: https://github.com/homeaway/hello-streams
Eventing and streaming open a world of compelling new possibilities to our software and platform designs. They can reduce time to decision and action while lowering total platform cost. But they are not a panacea. Understanding the edges and limits of these architectures can help you avoid painful missteps. This talk will focus on event driven and streaming architectures and how Apache Kafka can help you implement these. It will also discuss key tradeoffs you will face along the way from partitioning schemes to the impact of availability vs. consistency (CAP Theorem). Finally we’ll discuss some challenges of scale for patterns like Event Sourcing and how you can use other tools and even features of Kafka to work around them. This talk assumes a basic understanding of Kafka and distributed computing, but will include brief refresher sections.
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
Today’s microservice-based architectures rely on caches to hide latency and provide more responsive experiences for users. Unfortunately, these caches can be tricky to get right. Who places the data in the cache? When should the cached data expire? What happens when the cache or its datastore fail? We found that by using an event-based push model, we could avoid most of the pitfalls associated with traditional caches. This talk will cover the basic concept of push-based caches and their implications. It will delve into how you might build such a cache and what to do when your dataset is large, as well as look at our experience using these kinds of caches in production at Bloomberg.
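The push model described above can be sketched in a few lines: rather than read-through lookups with TTLs, a consumer applies every change event from a (typically compacted) topic to a local map, so reads always see the latest pushed state and there is no expiry policy to get wrong. The event shape here is an assumption for illustration:

```python
# Minimal sketch of a push-based cache. Upstream publishes change
# events (key, value); a None value is a tombstone meaning the key
# was deleted at the source. The cache never expires entries: it is
# kept current by the event stream itself.

class PushCache:
    def __init__(self):
        self._data = {}

    def apply(self, event: dict) -> None:
        """Apply one change event from the topic, in offset order."""
        if event["value"] is None:             # tombstone: delete key
            self._data.pop(event["key"], None)
        else:
            self._data[event["key"]] = event["value"]

    def get(self, key):
        return self._data.get(key)

cache = PushCache()
cache.apply({"key": "AAPL", "value": 191.2})
cache.apply({"key": "AAPL", "value": 191.5})   # newer event wins
cache.apply({"key": "MSFT", "value": None})    # upstream deletion
print(cache.get("AAPL"))  # → 191.5
```

The implication the talk hints at: the hard problems shift from invalidation to bootstrap (replaying the topic on startup) and to handling large datasets, which is exactly where the discussion goes next.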
Have you ever imagined what it would be like to build a massively scalable streaming application on Kafka, the challenges, the patterns and the thought process involved? How much of the application can be reused? What patterns will you discover? How does it all fit together? Depending upon your use case and business, this can mean many things. Starting out with a data pipeline is one thing, but evolving into a company-wide real-time application that is business critical and entirely dependent upon a streaming platform is a giant leap. Large-scale streaming applications are also called event streaming applications. They are classically different from other data systems; event streaming applications are viewed as a series of interconnected streams that are topologically defined using stream processors; they hold state that models your use case as events. Almost like a deconstructed real-time database.
In this talk, I step through the origins of event streaming systems, understanding how they are developed from raw events to evolve into something that can be adopted at an organizational scale. I start with event-first thinking and Domain Driven Design to build data models that work with the fundamentals of Streams, Kafka Streams, KSQL and Serverless (FaaS).
Building upon this, I explain how to build common business functionality by stepping through the patterns for:
– Scalable payment processing
– Run it on rails: instrumentation and monitoring
– Control flow patterns
Finally, all of these concepts are combined in a solution architecture that can be used at an enterprise scale. I will introduce enterprise patterns such as events-as-a-backbone, events as APIs and methods for governance and self-service. You will leave the talk with an understanding of how to model events with event-first thinking, how to work towards reusable streaming patterns and, most importantly, how it all fits together at scale.
With increasing data volumes typically comes a corresponding increase in (non-windowed) batch processing times, and many companies have looked to streaming as a way of delivering data processing results faster and more reliably. Event Driven Architectures further enhance these offerings by breaking centralised Data Platforms into loosely coupled and distributed solutions, supported by linearly scalable technologies such as Apache Kafka(TM) and Apache Kafka Streams(TM).
However, there remains a problem of how to handle changes to operational systems: if a record is the result of business logic, and that business logic changes, what do we do? Do we recalculate everything on the fly, adding in additional latencies for all data requests and potentially breaching non-functional requirements? Or do we run a batch job, risking that incorrect data will be served whilst the job is running?
This talk covers how 6point6 leveraged Kafka and Kafka Streams to transition a customer from a traditional business flow onto an Event Driven Architecture, with business logic triggered directly by real-time events across over 3000 loosely coupled business services, whilst ensuring that the active development of these services (and their containing logic and models) would not affect components which relied on data served by the platform.
Learn how we:
– Used versioning of topics, data and business logic to facilitate iterative development, ensuring that the reprocessing of large volumes of data would not result in incorrect or stale data being delivered.
– Handled distributed versioning of JSON event messages between separate teams/services, using discovery, automated contract negotiation and version porting.
– Developed technical patterns and libraries to allow rapid development and deployment of new event driven services using Kafka Streams.
– Developed functionality and approaches for deploying defensive services, including strategies for event retry and failure.
In this talk we’ll see how to leverage Apache Kafka not only as an event log, a role for which Kafka is well known (our beloved ‘source of truth’), but also as a persistence engine to implement the CQRS and Event Sourcing patterns. We’ll see how it is possible to use the Kafka Streams API to allow fast access to the current state of a DDD aggregate object and how to make it queryable (by the service itself and by other external actors) in order to check whether a command can be applied and produce one or more events. We’ll then see how to create a Kafka Streams topology that uses the produced events to update the state of the aggregate itself, to let the cycle begin again. The goal of this talk is to show that Kafka can be the unique source of truth, not only for what happened (the events) but also for what is now (the current state of an aggregate). The solution shown is completely Kafka-based, which allows us to be consistent and avoid the well-known problem we face when we have to update a repository and publish an event on the bus, two operations that, in most cases, cannot be considered atomic.
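The event-sourcing half of that cycle is, at its core, a left fold: an aggregate's current state is computed by replaying its events in order. A minimal sketch, with invented event and state shapes (a Kafka Streams topology would run the same fold as an aggregation over the event topic):

```python
# Sketch: the aggregate's state is a pure function of its event log,
# so the log can be the single source of truth for both "what
# happened" and "what is". Event/state shapes are illustrative.

def apply_event(state: dict, event: dict) -> dict:
    kind = event["type"]
    if kind == "AccountOpened":
        return {"id": event["id"], "balance": 0, "open": True}
    if kind == "MoneyDeposited":
        return {**state, "balance": state["balance"] + event["amount"]}
    if kind == "MoneyWithdrawn":
        return {**state, "balance": state["balance"] - event["amount"]}
    return state  # unknown events are ignored, enabling schema evolution

def rehydrate(events: list) -> dict:
    """Left-fold the event history into current state."""
    state = {}
    for e in events:
        state = apply_event(state, e)
    return state

state = rehydrate([
    {"type": "AccountOpened", "id": "acc-1"},
    {"type": "MoneyDeposited", "amount": 100},
    {"type": "MoneyWithdrawn", "amount": 30},
])
print(state["balance"])  # → 70
```

Command validation then becomes a check against this rehydrated state (e.g. reject a withdrawal that would overdraw) before appending new events, which is exactly the queryable-state role the talk assigns to Kafka Streams.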
A company’s core business processes nearly always span more than one microservice. In an e-commerce company, for example, a “customer order” might involve different services for payments, inventory, shipping, and more. Implementing long-running, asynchronous, and complex collaboration of distributed microservices is challenging. How can we ensure visibility of cross-microservice flows and provide status and error monitoring? How do we guarantee that overall flows always complete, even if single services fail? Or how do we at least recognize stuck flows so that we can fix them?
In this talk, we’ll demonstrate an approach based on real-life projects using the open-source workflow engine Zeebe (zeebe.io) to orchestrate microservices. Zeebe can connect to Kafka to coordinate workflows that span many microservices, providing end-to-end process visibility without violating the principles of loose coupling and service independence. Once an orchestration flow starts, Zeebe ensures that it is eventually carried out, retrying steps upon failure. In a Kafka architecture, Zeebe can easily produce events (or commands) and subscribe to events that will be correlated to workflows. Along the way, Zeebe provides monitoring and visibility into the progress and status of orchestration flows. Internally, Zeebe works as a distributed, event-driven, and event-sourced system, making it not only very fast but horizontally scalable and fault tolerant–and able to handle the throughput required to operate alongside Kafka in a microservices architecture. And because Zeebe uses gRPC for client-server communication, it’s possible to generate clients in one of ten gRPC-supported programming languages–making Zeebe accessible to a wide range of language communities. Expect not only slides but also live hacking sessions and user stories.
SolarWinds MSP collects and aggregates information from millions of agents via hundreds of intermediate services deployed across the globe. It provides business intelligence, reporting and analytical capabilities to both internal and external clients. Having gone through a massive expansion in the past few years, the traditional Extract Transform Load (ETL) pipelines cannot cope with the agility the business demands in order to deliver world-class features with minimal engineering friction. The fabric of our data has evolved from cold storage independent silos into distributed interconnected continuous flows of information that demand high resilience and configurable delivery semantics at near real-time.
This talk presents DaVinci EventBus: the fully cloud-native eventing backbone of SolarWinds MSP. Built for scalability, it connects millions of agents through hundreds of micro-services that exchange tens of billions of messages per day deployed in four geographical regions. It exposes a unified gRPC interface that allows clients in different programming languages to seamlessly interact with topics across multiple Kafka clusters. DaVinci EventBus uses Akka to implement self-service topic management, provide high-throughput batch publication, coordinate consumption groups and replicate data while guaranteeing sequential consistency across multiple Kafka clusters.
We dive deep into the design of the DaVinci EventBus and show how Akka can be used to implement an external coordination mechanism that federates multiple Kafka clusters. We discuss our journey of breaking monolithic legacy systems into a set of resilient event-driven micro-services. We show how our event-driven approach massively reduced the data propagation network traffic and simplified the data manipulation and analysis in order to drive new features such as automated anomaly detection to our end users. Further, we expand on our future plans to provide multiple consumption mechanisms on a single event firehose, on-demand automated Kafka cluster deployment, and asynchronous workflow management across multiple micro-service boundaries.
In this talk we’ll look at the relationship between three of the most disruptive software engineering paradigms: event sourcing, stream processing and serverless. We’ll debunk some of the myths around event sourcing. We’ll look at the inevitability of event-driven programming in the serverless space and we’ll see how stream processing links these two concepts together with a single ‘database for events’. As the story unfolds we’ll dive into some use cases, examine the practicalities of each approach – particularly the stateful elements – and finally extrapolate how their future relationship is likely to unfold. Key takeaways include:
– The different flavors of event sourcing and where their value lies
– The difference between stream processing at application and infrastructure levels
– The relationship between stream processors and serverless functions
– The practical limits of storing data in Kafka and stream processors like KSQL
Developers have long employed message queues to decouple subsystems and provide an approximation of asynchronous processing. However, these queuing systems don’t adequately deliver on the promise of event-driven architectures and often lead to usage anti-patterns. Events carry both notification and state, which allows developers and data engineers to build event-driven systems. Developers benefit from the asynchronous communication that events enable between services, and data engineers benefit from the integration capabilities. In this talk, Viktor will discuss the concepts of events, their relevance to software and data engineers, and their power for effectively unifying architectures. You’ll learn how stream processing makes sense in microservices and data integration projects. The talk concludes with a hands-on demonstration of these concepts in practice, using a modern toolchain – Kotlin, Spring Boot and Apache Kafka!
Do you wonder how to cope with the right to be forgotten? Do you wonder how to only process the events of individuals who have given their consent for processing their data? Do you wonder how to protect PII data of your users? Or do you wonder how to implement these across all your heterogeneous languages, clients and processing frameworks without having to re-implement all your streaming services? This talk is for you!
In this talk, we will answer these questions and show you
how transparent end-to-end encryption can be implemented on top of Apache Kafka;
how crypto-shredding can be used to forget individuals; and
how record based access control can be implemented on top of Apache Kafka.
Above all, we will show how this can be done without touching any applications by using an out-of-process architecture (à la service-mesh).
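The crypto-shredding idea mentioned above can be illustrated in miniature: give every data subject their own encryption key, encrypt that subject's payloads before they enter Kafka, and "forget" the subject by destroying the key, which makes the records unreadable even in immutable, infinitely-retained topics. This is a toy, NOT production cryptography (a real system would use an AEAD cipher and a proper key-management service); the class and key format are invented:

```python
import os
import hashlib

class Shredder:
    """Toy per-subject encryption to illustrate crypto-shredding."""

    def __init__(self):
        self._keys = {}  # subject id -> random key (a real KMS in practice)

    def _keystream(self, subject: str, n: int) -> bytes:
        key = self._keys.setdefault(subject, os.urandom(32))
        out = b""
        counter = 0
        while len(out) < n:  # expand key into a deterministic keystream
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:n]

    def encrypt(self, subject: str, plaintext: bytes) -> bytes:
        ks = self._keystream(subject, len(plaintext))
        return bytes(a ^ b for a, b in zip(plaintext, ks))

    def decrypt(self, subject: str, ciphertext: bytes) -> bytes:
        if subject not in self._keys:
            raise KeyError("key shredded: subject has been forgotten")
        return self.encrypt(subject, ciphertext)  # XOR is its own inverse

    def forget(self, subject: str) -> None:
        del self._keys[subject]  # the 'right to be forgotten'

s = Shredder()
ct = s.encrypt("user-42", b"PII payload")
assert s.decrypt("user-42", ct) == b"PII payload"
s.forget("user-42")  # from now on, ct is permanently unreadable
```

The out-of-process angle from the talk is that this encrypt/decrypt step sits in a proxy layer rather than in each application.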
Microservices are seen as the way to simplify complex systems, until you need to coordinate a transaction across services, and in that instant, the dream ends. Transactions involving multiple services can lead to a spaghetti web of interactions. Protocols such as two-phase commit come with complexity and performance bottlenecks. The Saga pattern offers a simplified transactional model. In sagas, a sequence of actions is executed, and if any action fails, a compensating action is executed for each of the actions that have already succeeded. This is particularly well suited to long-running and cross-microservice transactions. In this talk we introduce the new Simple Sagas library (https://github.com/simplesourcing/simplesagas). Built using Kafka Streams, it provides a scalable, fault-tolerant, event-based transaction processing engine. We walk through a use case of coordinating a sequence of complex financial transactions. We demonstrate the easy-to-use DSL, show how the system copes with failure, and discuss this overall approach to building scalable transactional systems in an event-driven streaming context.
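The core control flow of the saga pattern is small enough to sketch directly: run steps in order; on the first failure, run the compensations of the steps that already succeeded, in reverse. (The real Simple Sagas library is Kafka Streams-based and event-driven; this synchronous sketch shows only the compensation logic, with invented step names.)

```python
# Minimal saga runner: each step is (action, compensation). If a step
# fails, previously-completed steps are compensated in reverse order.

def run_saga(steps):
    done = []                        # compensations for completed steps
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):  # undo newest-first
                comp()
            return False             # saga rolled back
    return True                      # saga committed

log = []

def failing_payment():
    raise RuntimeError("payment failed")

ok = run_saga([
    (lambda: log.append("reserve-funds"), lambda: log.append("release-funds")),
    (lambda: log.append("ship-order"),    lambda: log.append("cancel-shipment")),
    (failing_payment,                     lambda: None),
])
print(ok, log)
# → False ['reserve-funds', 'ship-order', 'cancel-shipment', 'release-funds']
```

Note compensations are semantic undos (release, cancel), not rollbacks: the original actions remain in the event history, which is what makes the pattern fit an event-sourced Kafka architecture.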
MQ, ETL and ESB middleware are often used as integration backbone between legacy applications, modern microservices and cloud services. This introduces several challenges and complexities like point-to-point integration or non-scalable architectures. This session discusses how to build a completely event-driven streaming platform leveraging Apache Kafka’s open source messaging, integration and streaming components to leverage distributed processing, fault-tolerance, rolling upgrades and the ability to reprocess events. Learn the differences between an event-driven streaming platform leveraging Apache Kafka and middleware like MQ, ETL and ESBs – including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.
Walmart.com generates millions of events per second. At WalmartLabs, I’m working in a team called the Customer Backbone (CBB), where we wanted to upgrade to a platform capable of processing this event volume in real time and storing the state/knowledge of possibly all Walmart customers generated by the processing. Kafka Streams’ event-driven architecture seemed like the only obvious choice. However, there are a few challenges at Walmart’s scale:
– the clusters need to be large, with the problems thereof
– infinite retention of changelog topics, wasting valuable disk
– slow stand-by task recovery in case of a node failure (changelog topics have GBs of data)
– no repartitioning in Kafka Streams
As part of the event-driven development and addressing the challenges above, I’m going to talk about some bold new ideas we developed as features/patches to Kafka Streams to deal with the scale required at Walmart.
– Cold Bootstrap: in case of a Kafka Streams node failure, instead of recovering from the changelog topic, we bootstrap the standby from the active’s RocksDB using JSch, with zero event loss thanks to careful offset management.
– Dynamic Repartitioning: We added support for repartitioning in Kafka Streams where state is distributed among the new partitions. We can now elastically scale to any number of partitions and any number of nodes.
– Cloud/Rack/AZ aware task assignment: No active and standby tasks of the same partition are assigned to the same rack.
– Decreased Partition Assignment Size: with large clusters like ours (>400 nodes and 3 stream threads per node), the partition assignment of the Kafka Streams cluster reaches a few hundred MBs, and rebalances take a long time to settle.
Key Takeaways:
– Basic understanding of Kafka Streams.
– Productionizing Kafka Streams at scale.
– Using Kafka Streams as Distributed NoSQL DB.
Cryptocurrency exchanges like Coinbase, Binance, and Kraken enable investors to buy, sell and trade cryptocurrencies, including Bitcoin, Litecoin, Ethereum and many more. Depending on the exchange, trades can be made using fiat currencies (legal government tender like U.S. dollars or Euros) or other cryptocurrencies. Most exchanges also allow investors to purchase one type of cryptocurrency with another (for example, buy Bitcoin with Ethereum.) Given the high velocity and high volatility of cryptocurrency valuations, monitoring and analyzing trading activity and the performance of trading algorithms is daunting. Kafka Streams provides a perfect infrastructure to support visibility into the market and participant behavior with a very high degree of temporal accuracy, which is critical when trading such volatile instruments.
In particular, cryptocurrency traders need several ways to visualize trading activity and rebuild and view their order books at full depth. They need tools that can make large numbers of complex real-time calculations, including:
– Best bids and offers
– Cumulative sizes for bids and offers
– Spread to the median
– Deltas between price events
– Time-weighted averages
– Message rates (new, cancel, trade, replace)
– Cumulative trade flow
All calculations must be done for multiple pairs of fiat currencies and cryptocurrencies in real time throughout the trading day. Traders must have visibility into all aspects of every order through to execution. New tools that leverage the power of Kafka Streams enable traders themselves to build directed graphs on screen, without writing any code. These directed graphs control data flows, calculations, and statistical analysis of cryptocurrency trading data, and can output the results to the screen for in-depth monitoring and analysis of real-time data as well as historical trading data stored in in-memory time series databases. This talk describes practical approaches to building and deploying Kafka Streams to support cryptocurrency trading.
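One of the calculations listed above, the time-weighted average, can be sketched directly: each observed price is weighted by how long it remained the current price. The function below assumes input as `(timestamp_seconds, price)` pairs in event-time order, a simplification of what a windowed Kafka Streams aggregation would maintain incrementally:

```python
# Sketch: time-weighted average price (TWAP) over a list of ticks.
# Each price is weighted by the interval until the next tick.

def time_weighted_average(ticks):
    total = 0.0
    duration = 0.0
    for (t0, price), (t1, _) in zip(ticks, ticks[1:]):
        total += price * (t1 - t0)   # price held for (t1 - t0) seconds
        duration += t1 - t0
    # A single tick has no duration; fall back to its price.
    return total / duration if duration else ticks[-1][1]

# Price was 100 for 10 s, then 200 for 30 s:
twap = time_weighted_average([(0, 100), (10, 200), (40, 200)])
print(twap)  # → 175.0
```

In a streaming setting the same arithmetic runs as a running sum per window, so the result is available continuously rather than at window close, which is what gives the temporal accuracy the abstract emphasizes.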
Do you think that writing simple, expressive code to react to event streams in real time can sometimes look just a little too easy? Or perhaps you’re already a seasoned stream processing expert? Either way, come prepared to have some fun and guess the answers to our streaming brain-teasers as we highlight some misconceptions and things that may surprise you in the brave new world of continuous stream processing!
Kafka Streams performance monitoring and tuning is important for many reasons, including identifying bottlenecks, achieving greater throughput, and capacity planning. In this talk we’ll share the techniques we used to achieve greater performance and save on compute, storage, and cost. We’ll cover: identifying design bottlenecks by reviewing logs, metrics, and serdes; state store access patterns, design, and optimization; using profiling tools such as JMX and YourKit; performance tuning of Kafka and Kafka Streams configuration and properties; JVM optimization for correct heap size and garbage collection strategies; and the trade-offs between functional and imperative programming.
Kafka Streams and the addition of KSQL have provided opportunities to do stateful processing of data. Sometimes, the biggest challenge is determining how you can join that data. Keying and windowing are core concepts that need to be understood in order to properly and efficiently stream data. In this presentation, Neil will utilize geospatial data to showcase non-trivial joining; particularly, but not limited to, distance comparisons. The stream processing will be written in the Kafka Streams DSL and in KSQL, with the resulting topologies compared. KSQL 2.0 concepts of User Defined Functions (UDFs), nested Avro structures, and the ‘insert into’ functionality of KSQL will be showcased.
The presentation will show a custom OpenSky connector for obtaining real-time aircraft data, a Streams application for processing that data, a D3 topojson application to visualize the data, and an additional KSQL implementation of the Streams application for comparison. Expect a deep dive into the Streams DSL and KSQL implementations that will provide the basis for a discussion around Apache Kafka and stream processing.
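As a hint of the arithmetic behind the distance comparisons mentioned above, here is a small stand-alone Python sketch of the haversine great-circle distance, the kind of calculation a geospatial join UDF might perform; this is illustrative only and is not the code from the talk:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points,
    using the haversine formula and a mean Earth radius of 6371 km."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))
```

In a streaming join, a function like this would run per record pair to decide, for example, whether an aircraft position falls within a given radius of a point of interest.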
KSQL is a streaming SQL engine for Apache Kafka. The focus of this talk is to educate users on how to build, deploy, operate, and maintain KSQL applications. It is meant for developers and teams looking to leverage KSQL to build production data pipelines. The audience will get an overview of how KSQL works, how to test their KSQL applications in development environments, the deployment options in production, and some common troubleshooting techniques for when things go wrong. The talk will cover the latest best practices for running KSQL in production, as well as look forward to what we plan to do to improve the KSQL operational experience.
Microservices, events, containers, and orchestrators are dominating our vernacular today. As operations teams adapt to support these technologies in production, cloud-native platforms like Cloud Foundry and Kubernetes have quickly risen to serve as force multipliers of automation, productivity and value. Kafka is providing developers a critically important component as they build and modernize applications to cloud-native architecture. This talk will explore:
• Why cloud-native platforms and why run Kafka on Kubernetes?
• What kind of workloads are best suited for this combination?
• Tips to determine the path forward for legacy monoliths in your application portfolio
• Running Kafka as a Streaming Platform on Container Orchestration
High-speed, low-footprint data stream processing is in high demand for Kafka Streams applications. However, how to write an efficient streaming application using the Streams DSL is a question many users have asked in the past, since it requires some deep knowledge about Kafka Streams internals. In this talk, I will discuss how to analyze your Kafka Streams applications, target performance bottlenecks and unnecessary storage costs, and optimize your application code accordingly using the Streams DSL.
In addition, I will talk about the new optimization framework that we have been developing inside Kafka Streams since the 2.1 release, which replaces the in-place translation of the Streams DSL with a comprehensive process composed of streams topology compilation and rewriting phases, with a focus on reducing various storage footprints of Streams applications, such as state stores and internal topics.
This talk is aimed at developers who are interested in writing their first or second Streams application using the Streams DSL, as well as those who have already launched several services written in Kafka Streams in production but want to further optimize these applications. It will give them a better understanding of how different Streams operators in the DSL are translated into the streams topology during runtime. And through a deep dive into the newly added optimization framework for the Streams DSL, the audience will gain more insight into the kinds of optimization opportunities that are possible for Kafka Streams, and which of them the library is trying to tackle right now and in the near future.
Kafka Streams is a flexible and powerful framework. The Domain Specific Language (DSL) is an obvious place from which to start, but not all requirements fit the DSL model. Many people are unaware of the Processor API (PAPI) – or are intimidated by it because of sinks, sources, edges and stores – oh my! But most of the power of the PAPI can be leveraged simply through the DSL’s “#process” method, which lets you attach the general building block “Processor” interface to your easy-to-use DSL topology, combining the best of both worlds.
In this talk you’ll get a look at the flexibility of the DSL’s process method and the possibilities it opens up. We’ll use real-world use cases, borne from extensive experience in the field with multiple customers, to explore the power of direct write access to the state stores and how to perform range sub-selects. We’ll also see the options that punctuators bring to the table, as well as opportunities for major latency optimisations.
* Understanding of how to combine DSL and Processors
* Capabilities and benefits of Processors
* Real-world uses of Processors
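For readers unfamiliar with what a “range sub-select” over a state store looks like, here is a minimal Python sketch of the idea behind sorted key-value store range scans. The store class and keys are hypothetical stand-ins; the real Processor API state stores are Java objects:

```python
import bisect

class SortedStore:
    """Tiny in-memory stand-in for a sorted key-value state store."""
    def __init__(self):
        self._keys = []   # kept sorted so range scans are cheap
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def range(self, from_key, to_key):
        # inclusive scan over [from_key, to_key], analogous to a
        # KeyValueStore range(from, to) call in Kafka Streams
        lo = bisect.bisect_left(self._keys, from_key)
        hi = bisect.bisect_right(self._keys, to_key)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]

store = SortedStore()
store.put("user:1", "alice")
store.put("user:2", "bob")
store.put("order:9", "widget")
users = store.range("user:", "user:~")  # every key with the "user:" prefix
```

Prefix scans like this are why key design matters so much in Processor API applications: choosing keys that sort together makes range sub-selects a single cheap store call.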
Debezium (noun | de·be·zi·um | /dɪ:ˈbɪ:ziːəm/) – Secret Sauce for Change Data Capture. Apache Kafka has become the de facto standard for asynchronous event propagation between microservices. Things get challenging though when adding a service’s database to the picture: How can you avoid inconsistencies between Kafka and the database? Enter change data capture (CDC) and Debezium. By capturing changes from the log files of the database, Debezium gives you both reliable and consistent inter-service messaging via Kafka as well as instant read-your-own-write semantics for services themselves. Join this session to learn how to leverage CDC for reliable microservices integration and for solving typical challenges such as gradually extracting microservices from existing monoliths, maintaining different read models in CQRS-style architectures, and updating caches as well as full-text indexes. You’ll find out how Debezium streams all the changes from datastores such as MySQL, PostgreSQL, SQL Server and MongoDB into Kafka, and how Debezium is designed not to compromise on data correctness and completeness even when things go wrong. In a live demo we’ll show how to use Debezium to set up a change data stream out of your application’s database, without any code changes needed. You’ll see how to consume change events in other services, how to gain real-time insight into your changing data using Kafka Streams, and much more.
We’ve built a real-time streaming platform that enables prediction based on user behavior, with events occurring in virtual and augmented reality environments. The solution enables organizations to train people in an extended reality environment, where real-life training may be costly and dangerous. Kafka Streams enables analyzing spatial and event data to detect gestural features and analyze user behavior in real time, so we can predict any future mistake the user might make. Kafka is the backbone of our real-time analytics and extended reality communication platform, with our cluster and applications being deployed on Kubernetes.
In this talk, we will mainly focus on the following: 1. Why Extended Reality with Kafka is a step in the right direction. 2. The architecture and power of Schema Registry in building a generic platform for pluggable XR apps and analytics models. 3. How KSQL and Kafka Streams fit into the Kafka ecosystem to help analyze human motion data and detect features for real-time prediction. 4. A demo of a VR application with real-time analytics feedback, which assists people in being trained to work with chemical laboratory equipment.
Kafka Streams is a library for developing applications that process records from topics in Apache Kafka. It provides a high-level Streams DSL and a low-level Processor API for describing fault-tolerant, distributed streaming pipelines in the Java or Scala programming languages. Kafka Streams also offers an elaborate API for stateless and stateful stream processing. That’s a high-level view of Kafka Streams. Have you ever wondered how Kafka Streams does all this and what its relationship with Apache Kafka (the brokers) is? That’s among the topics of the talk.
During this talk we will look under the covers of Kafka Streams and deep dive into its fault-tolerant, distributed stream processing engine. You will learn the roles of StreamThreads, TaskManager, StreamTasks, StandbyTasks, StreamsPartitionAssignor, RebalanceListener and a few others. The aim of this talk is to equip you with knowledge about the internals of Kafka Streams that should help you fine-tune your stream processing pipelines for better performance.
KSQL allows users to build their own custom functions for processing data in Kafka. Most of the examples that can be found online are helpful, but are often simplistic. This discussion will dive deeper than most tutorials on the subject by showing examples of more advanced UD(A)Fs, and will inspire listeners to venture beyond the hello world tutorials into more exciting territory. First, we will learn how to remove most of the boilerplate that is involved when building even simple KSQL functions. Then, building on this foundation, we will build UDFs that leverage both cloud-based Machine Learning / AI services and also embedded predictive models. Finally, we will venture into more experimental territory and leverage GraalVM + polyglot programming to build multilingual UDFs (UDFs that are written in languages other than Java).
In this all too fabulous talk we will be addressing the wonderful and new wonders of KSQL vs. KStreams. If you are new-ish to Kafka…you may ask yourself, “What is a large Kafka deployment?” And you may tell yourself, “This is not my beautiful KSQL use case!” And you may tell yourself, “This is not my beautiful KStreams use case!” And you may ask yourself, “What is a beautiful Kafka use case?” And you may ask yourself, “Where does that stream process go to?” And you may ask yourself, “Am I right about this architecture? Am I wrong?” And you may say to yourself, “My God! What have I done?”
In this talk, we will discuss the following concepts:
1. KSQL Architecture
2. KSQL Use Cases
3. Performance Considerations
4. When to KSQL and When to Not
5. Introduce KStreams
What this talk is: You will understand the architecture and the power of the KSQL continuous query engine and when to use it successfully.
What this talk is not: An intensive KStreams talk – but you will get enough under your belt to go forth and learn more about Stream Processing overall.
Serverless promises the potential to programmatically autoscale cloud resources for genuine pay-as-you-go compute. However, in our experience at HomeAway, we found critical gaps between what cloud providers offered in their core Function as a Service (FaaS) products and the operational necessities we required to effectively run at scale. These gaps include not being data-centric, a lack of operational support services, an inability to employ heterogeneous compute, and a lack of support for running a wider array of workloads. This led us to ask a simple question – given a function, how can a developer easily deploy to any cloud provider, in any region, with scalability, observability, routing, and all the other support services expected by any good developer? More importantly, how can one reconcile FaaS with the existing trends in the data-centric, custom hardware demands of stream analytics and machine learning workloads? This experience at HomeAway led us to build our own FaaS with Kafka as a central backbone. This solution provides us with a clear roadmap to leverage streaming SQL (KSQL) as well as User Defined Scalar Functions (UDFs) and User Defined Aggregate Functions (UDAFs). We would like to share MultiFaaS, a composable serverless platform offering highly available, scalable compute to the widest possible audience of customers. It enables developers and non-developers to access compute and be close to data with zero barrier to entry. MultiFaaS supports varying types of workloads – not only traditional microservices, but also streaming analytics and machine learning. It does this while providing seamless integration for scaling, observability, communication, and data. It increases development velocity by removing overhead and allowing users to focus on solving problems.
Do you know who is knocking on your network’s door? Have new regulations left you scratching your head on how to handle what is happening in your network? Network flow data helps answer many questions across a multitude of use cases including network security, performance, capacity planning, routing, operational troubleshooting and more. Today’s streaming data pipelines need to include tools that can scale to meet the demands of these service providers while continuing to provide responsive answers to difficult questions. In addition to stream processing, data needs to be stored in a redundant, operationally focused database to provide fast, reliable answers to critical questions. Kafka and Druid work together to create such a pipeline.
In this talk Eric Graham and Rachel Pedreschi will discuss these pipelines and cover the following topics: network flow use cases and why this data is important; reference architectures from production systems at a major international bank; why Kafka, Druid, and other OSS tools for network flows; and a demo of one such system.
At Flipkart, the leading e-commerce company in India, our team processes the financial data for millions of transactions every day. Financial data processing requires exactly-once semantics because of customer sensitivity and financial regulatory requirements. The financial data entities have a long lifecycle for updates, and we maintain idempotency for many months and sometimes years. In addition, we provide a near real-time view to our stakeholders on the ‘state’ of the processing. Let’s consider a specific example of a data processing request where we aggregate invoices into payments for our sellers. We have to keep the number of payments constant to cope with the varying and increasing number of invoices. In addition, the system has to maintain an audit trail of paid invoices versus the corresponding payment indents, to comply with auditing procedures. The need for scalability and real-time Service Level Agreements (SLAs) steered us towards adopting a stream processing solution such as Kafka.
In this talk, we explain our efforts towards a generic framework using Kafka for all use cases of financial data processing at Flipkart and how we are faring on that challenge. We also explain: the techniques used to achieve exactly-once semantics within the constraints specific to financial data processing at Flipkart, and the various challenges faced in the process, including the use of technologies such as Storm Trident and MongoDB to overcome them. In addition, we describe how this success has emboldened us to rethink our entire architecture on top of streaming infrastructure.
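As a toy illustration of the invoice-to-payment aggregation described above, the Python sketch below folds invoices into a fixed number of payment buckets, ignores replayed invoice ids, and keeps an invoice-to-payment audit trail. The bucketing scheme and all names here are assumptions made for the example, not Flipkart’s actual design:

```python
import zlib

NUM_PAYMENTS = 4  # the number of payment buckets is held constant

def settle(invoices, seen_ids, payments, audit_trail):
    """Fold (invoice_id, amount) pairs into a fixed number of payment
    buckets, skipping ids already processed (effectively-once on replay)."""
    for inv_id, amount in invoices:
        if inv_id in seen_ids:
            continue  # duplicate delivery: ignore, keeping totals idempotent
        seen_ids.add(inv_id)
        # stable hash so the same invoice always maps to the same payment
        bucket = zlib.crc32(inv_id.encode()) % NUM_PAYMENTS
        payments[bucket] = payments.get(bucket, 0.0) + amount
        audit_trail[inv_id] = bucket  # which payment covers this invoice
```

In a real stream processing deployment the `seen_ids` set and `payments` map would live in durable, partitioned state rather than in memory, but the dedupe-then-aggregate shape is the same.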
We all know real-time data has a value. But how do you quantify that value in order to create a business case for becoming more data, or event, driven? The first half of this talk will explore the value of data across a variety of organizations, starting with the five most valuable companies in the world: Apple, Alphabet (Google), Microsoft, Amazon, and Facebook. We will go on to discuss other digital natives (Uber, eBay, Netflix and LinkedIn) before exploring more traditional companies across retail, finance and automotive. Whether organizations are using data to create new business products and services, improve user experiences, increase productivity, manage risk or influence global power, we’ll see that fast and interconnected data, or ‘event streaming’, is increasingly important. After showing that data value can be quantified, the second half of this talk will explain the five steps to creating a business case around Kafka use cases.
Most businesses focus on:
1. Making more money, or conferring competitive advantage to make more money
2. Increasing efficiency to save money, and / or
3. Mitigating risk to the business, to protect money.
We’ll walk through examples of real business cases, discuss how business cases have evolved over the years and show the power of a sound business case. If you’re interested in Big Money and Big Business, as well as Big Data, this talk is for you.
Betfair is the largest online betting operation in the world, busier than the London Stock Exchange at times. At Betfair, our biggest customers trade at ultra-high frequencies, pouring millions of dollars into our trading systems. As such, low latency and reliability are key to everything we build. Our customers need to be able to view the current positions on offer on a market, place their orders accordingly and see them fulfilled reliably, all of which needs to happen in a few milliseconds. As a global business, we also have steady growth in the number of jurisdictions we operate in, so the number of customers operating at such frequencies goes up every single day. We need our exchange trading platform to be resilient, reliable, fast and easily scalable. On a busy Saturday afternoon when there is popular football going on, we see in excess of 200k transactions per second across our estate, 99.9% of them served within an SLA of 10ms. This used to be about 40k transactions per second a few years ago. To get from that point to the present day, we needed to fundamentally re-engineer our backend systems to be largely event driven, and Kafka was the perfect tool to help us solve this problem. On the back of the success of that platform, we are now rebuilding the core of our exchange platform that accepts and matches up to 25,000 orders per second. Ordering of events is key to achieving this reliably, and again Kafka is at the centre of our solution here.
This presentation details some of the key scalability, reliability and resiliency challenges that we faced in this migration and how we overcame them, to rebuild our entire exchange with Kafka at its core.
Natural Language Processing (NLP) focuses on the analysis and understanding of textual information in either written or spoken form. In recent years, NLP technologies have become business critical due to the overwhelming and ever-increasing amount of textual information, both to gain customer and business process insights and to power novel user experiences by creating dialogue-based digital assistants that understand customer language. In our talk, we will explore how NLP use cases significantly benefit from stream processing and event-driven architectures. We will present the NLP Service Framework, a stream processing framework using Kafka in which NLP tasks run as microservices orchestrated in pipelines to perform complex end-to-end services. In the NLP Service Framework, Kafka is used to orchestrate data flows containing all kinds of textual information in different topics related to specific use cases. Different Kafka Streams based processors subsequently call NLP services to analyze and annotate the textual information within the data flows. Various applications, such as search-based applications built on Elasticsearch and Kibana, or analytical databases, eventually consume the textual information augmented with the annotations and inferred results of the NLP services. Two important requirements of the NLP Service Framework are efficient communication between different services using REST interfaces and interoperability among services implemented in different languages such as Java or Python. We use the gRPC framework and Protobuf as the data format to meet both requirements. This Kafka-based architecture enables us to specify domain-specific but isolated end-to-end NLP services and guarantees highly scalable and robust handling of high volumes of textual data from different BMW domains along the value chain, including customer, process, and vehicle data.
Facing Open Banking regulation, rapidly increasing transaction volumes and increasing customer expectations, Nationwide took the decision to take load off their back-end systems through real-time streaming of data changes into Kafka. Hear about how Nationwide started their journey with Kafka, from their initial use case of creating a real-time data cache using Change Data Capture, Kafka and Microservices to how Kafka allowed them to build a stream processing backbone used to reengineer the entire banking experience including online banking, payment processing and mortgage applications. See a working demo of the system and what happens to the system when the underlying infrastructure breaks. Technologies covered include: Change Data Capture, Kafka (Avro, partitioning and replication) and using KSQL and Kafka Streams Framework to join topics and process data.
In the “Talking Traffic Partnership” (https://www.talking-traffic.com/en) the Dutch Ministry of Infrastructure and Environment collaborates with several public and private parties to deliver up-to-date traffic information from a wide variety of data sources to road users via smartphones and personal or onboard navigation systems. KPN was selected as IT partner for Talking Traffic, and Klarrio was commissioned by KPN to build a platform that could: – Act as a secure streaming information exchange between the Talking Traffic partners. – Deliver personalized subsets of selected data streams to millions of connected client devices and applications in real time.
In this talk we will walk you through the production platform, describing how our partners run containerized Kafka applications against a secured multi-tenant Kafka setup to amass a wealth of traffic information.
We will show:
– How we manage tenants, data streams, and access to streams.
– How we protect the platform from rogue applications.
– How data provenance is handled in the platform.
– How tenants are given insight into the operation and performance of their Kafka applications.
We then show how we created a scalable messaging layer on top of this information backbone, enabling us to disseminate relevant traffic information towards millions of connected road users over MQTT.
We focus on:
– How we deliver millions of individualized subsets of the data on Kafka with minimal data amplification.
– How we implement MQTT features like wildcard subscriptions and retained messages.
By the end of the session you will have learned a way to deploy Kafka in large-scale, multi-tenant environments, and how to quickly and securely stream data from shared internal Kafka topics to consumers outside of the platform.
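To give a flavour of what implementing MQTT wildcard subscriptions involves, here is a simplified Python sketch of topic filter matching with the ‘+’ (single-level) and ‘#’ (multi-level) wildcards. It deliberately ignores MQTT edge cases (such as topics beginning with ‘$’, or ‘#’ matching the parent level) and assumes ‘#’ only appears as the final segment:

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Simplified MQTT topic filter matching: '+' matches exactly one
    topic level, '#' matches the remainder of the topic (must be last)."""
    p_segs = pattern.split("/")
    t_segs = topic.split("/")
    for i, seg in enumerate(p_segs):
        if seg == "#":
            return True          # multi-level wildcard swallows the rest
        if i >= len(t_segs):
            return False         # topic is shorter than the filter
        if seg != "+" and seg != t_segs[i]:
            return False         # literal segment mismatch
    return len(p_segs) == len(t_segs)
```

In a platform like the one described above, a matcher of this shape would sit between the shared Kafka topics and each connected device’s set of MQTT subscriptions.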
In this talk, I would like to share the successful experience our team is having implementing Kafka within a complex data architecture. Although I have the blessing of leading a team of incredibly talented engineers, none of us had experience working with Kafka at the scale we face at Mimecast, where hundreds of microservices generate millions of events per second to communicate asynchronously and achieve different goals. The talk will explain how Kafka is helping us to decouple microservices and make data available for teams and services that were not in communication before. I will highlight the challenges we encountered and how we overcame them, such as having one Kafka cluster per region spanning our double data center architecture while still avoiding a split-brain scenario, and serving thousands of producers and consumers, explaining in plain language the main Kafka components and how they are used to solve problems. I would like to share how Kafka is allowing our data scientists to explore the data, since we are able to replay the input data as many times as we need, discovering new features and, more importantly, being able to reproduce exactly the same scenario over and over. Last but not least, the talk will emphasize that, as in our case, newcomers do not have to pay a steep learning curve to make the intimidating Kafka platform part of their solution: the documentation is fantastic, the community is amazing and examples can be found all over the internet.
The privacy and security risks associated with using sensitive data can mean that long approval processes counter the ease of engineering and time-saving benefits that Kafka provides. In this talk I will cover privacy attacks that Kafka data streams are particularly vulnerable to, and general techniques that can help thwart these attacks. I will go on to propose an architectural pattern where innovation around streaming sensitive data is performed in dedicated safe-zones that protect the privacy of customer data and data subjects. Finally, I will discuss how a collaboration between Privitar and Confluent has introduced cutting-edge Privacy Engineering techniques to the world of Kafka streaming data through our Privitar Publisher Kafka Connector. This Connector creates safe-zones for data, called Protected Data Domains, which enable separate teams to work on data streams made safe by applying easily-customizable privacy policies that are managed for each Kafka Topic. These managed data releases are watermarked for traceability and retain referential integrity but cannot be linked to each other (a significant privacy and security risk).
NAV handles a lot of applications for our benefits (sickness, unemployment, etc.). We are rewriting the systems handling these applications to a stream-based approach, placing Kafka Streams at the centre of our architecture. This talk will go through our approach to building these systems. We are mostly using Kotlin as our programming language, and are running our applications on Kubernetes. We use change data capture to extract data from our legacy systems, to ease the pain of migration. By treating our data and our applications as streams we get a lot of benefits. We can rerun our applications with different rules, to simulate the effect of policy changes. We can do real-time evaluation of applications, to improve the user experience. We also get better robustness, as we remove the runtime dependency on our core systems, and can receive applications even when other parts of the systems are down. The talk will also present the architectural rules we are implementing for NAV, using streams as the main communication pattern between organization units, and some learnings from our current work on a data platform to support ‘Life is a stream of events’.
Climate change is one of mankind’s greatest problems. We’ve made substantial steps forward to increase the amount of renewable generation on our power grids, but we’re reaching a critical stage that determines our ability to make renewable energy truly ubiquitous. Power grids are built on the concept of equal supply and demand. In order to achieve this, generation needs to be constant and demand predictable. As we move to more renewable sources we uncover substantial issues with the consistency and control with which we can generate energy. Currently this prevents us from building truly 100% renewable grids. Using Kafka and IoT-connected devices (wall-mounted batteries, EVs, storage heaters) we’re building a distributed network of power storage in customers’ homes that drives down the cost of power for customers and unlocks the potential for a truly carbon-free power grid.
In this talk I’ll discuss some of the issues we face trying to create 100% renewable grids and how Kafka helps our various systems process huge amounts of telemetry data from our connected devices, the grid and other systems to make decisions about when to charge, discharge or do nothing to thousands of connected assets. This talk covers Kafka concepts but focuses on climate change and how breakthroughs in technology can help us solve meaningful problems.
The cruise industry has a unique set of challenges when it comes to deploying products that directly interact with guests across ships. No matter what the challenges are, applications that run on ships always need to provide the same functionality as on shore. To name a few: each ship is almost a separate datacenter, and there are tens of ships. Second, a ship’s compute capacity is much lower than what an AWS data center can provide, and hence anything that is deployed on a ship needs to be lean. Third, bandwidth, though improving, is still a big constraint in transferring data between ship and shore. Fourth, the company’s legacy systems power the core booking functionality, and that information is still vital to every feature or application across the enterprise, including ships.
Hence the cruise industry traditionally shied away from real-time integration and mostly resorted to a ‘rollover’ process, wherein the data gets transferred at the beginning or end of the cruise. But this can’t be the solution in the digital era. Mobile devices need to allow a stream of changes in real time, as guests go from ship to shore to ship, ship to ship, or shore to ship. Our customer, a leading cruise company, instituted a team to solve these synchronization issues and also enable event-driven architecture. This presentation will address the evolution, idea and implementation of data synchronization, and of enabling an event-driven organization, using Kafka and streams technologies at its core.
This talk will focus on the move from a monolithic solution to an event-driven microservices architecture that allows each of our partners to offer their clients a customizable and localized Vitality offering based on a consistent global experience. It will include details on how we manage client demands and legal and regulatory issues specific to each market, how we can rapidly implement Vitality in a country within the space of a few months, and then how we manage and support each market. It will describe how we moved off proprietary technologies to a cloud-agnostic, open source, horizontally scalable hosted solution that takes advantage of technologies such as Kafka and Kubernetes, and the challenges faced in doing this.
In addition it will provide detail as to how Vitality streams exercise activity data in real time from a multitude of device manufacturers (such as Fitbit, Garmin, Suunto and Apple) and routes it to the correct Vitality instance to analyze and allocate points to the member. Globally we process on average 50-60 million member workouts a week. We will also cover how we integrate with a number of rewards partners to ensure members can access and utilize their rewards through local partners (gyms, airlines and cinemas) as well as global partners such as Starbucks, Hotels.com and Amazon.