View sessions and slides from Kafka Summit Europe 2021
A talk discussing the rise of Apache Kafka and data in motion plus the impact of cloud native data systems. This talk will cover how Kafka needs to evolve to keep up with the future of cloud, what this means for distributed systems engineers, and what work is being done to truly make Kafka Cloud...
Today’s data is in motion. Enterprises need to see their business and their data in near real time; they need to respond in milliseconds not hours; and they need to integrate, aggregate, curate, and disseminate data within and across production environments.
Organizations have been chasing the dream of data democratization, unlocking and accessing data at scale to serve their customers and business, for over half a century, since the early days of data warehousing. They have been trying to reach this dream through multiple generations of architectures...
More at kafkasummit.io
Do you want to know what streaming ETL actually looks like in practice? Or what you can REALLY do with Apache Kafka once you get going—using config & SQL alone? This project integrates live data from the UK rail network via ActiveMQ and data from other sources to build a fully-functioning platform.
Data mesh is a relatively recent term that describes a set of principles that good modern data systems uphold. A kind of “microservices” for the data-centric world. While the data mesh is not technology-specific as a pattern, the building of systems that adopt and implement data mesh principles...
KIP-500 set the vision for ZooKeeper-free Kafka. However, even without ZooKeeper, the need for consensus never went away. In this talk, we will discuss one of the core community’s initiatives, a native Raft-like protocol used to ensure different brokers can agree on critical pieces of metadata...
Microservices are one of the big trends in software engineering of the last few years. In this session we'll discuss and showcase how open-source change data capture (CDC) with Debezium can help developers with typical challenges they often face when working on microservices.
At Stripe, we operate a general ledger modeled as double-entry bookkeeping for all financial transactions. Warehousing such data is challenging due to its high volume and high cardinality of unique accounts. Apache Pinot works well in synergy with Kafka to provide an excellent solution.
While Kafka has guarantees around the number of server failures a cluster can tolerate, to avoid service interruptions, or even data loss, it is prudent to have infrastructure in place for when an environment becomes unavailable during a planned or unplanned outage.
Bayer selected Apache Kafka as the primary layer for a variety of document streams flowing through several text processing and enrichment steps. Every day, Bayer analyzes numerous documents including clinical trials, patents, reports, news, literature, etc.
A data analytics project for a food processing factory revealed that business problems could be solved and processes improved by implementing streaming applications.
As cyber threats continuously grow in sophistication and frequency, companies need to adapt quickly to effectively detect and respond to threats and protect their environments. We’ll discuss the details described in the IT@Intel white paper of the same title, published in November 2020.
Developing cloud-native microservices introduced us to many new challenges. One of the most difficult is building reliable microservice integrations and their data exchange patterns. In this session I will share my 10 years of experience with building microservices...
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka? How can I query and process data in Kafka? This session explains the idea behind databases and different features like storage, queries, transactions, and processing to evaluate when Kafka is a good ...
There's little talk about capacity planning for Kafka clusters; it's very much learn-as-you-go, and every cluster is different. In this talk, Kafka DevOps engineer Jason Bell takes you through the things that will help you: broker capacity, thinking about topics, and how the other Confluent components...
The Apache Kafka ecosystem is very rich with components and pieces that make for designing and implementing secure, efficient, fault-tolerant and scalable event stream processing (ESP) systems.
In our projects, we often have to query the content of Kafka topics. To that end, we expose REST-APIs based on Kafka Streams’ interactive queries. However, this approach has some shortcomings.
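For readers unfamiliar with the approach the talk critiques, here is a minimal sketch of a Kafka Streams interactive-query lookup; the store name "counts-store" and the surrounding HTTP layer are assumptions, not from the talk.

```java
// Hypothetical sketch: serving a key lookup from a Kafka Streams state store
// via interactive queries. A REST framework would wrap this method.
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class InteractiveQueryExample {
    public static Long lookup(KafkaStreams streams, String key) {
        ReadOnlyKeyValueStore<String, Long> store = streams.store(
            StoreQueryParameters.fromNameAndType(
                "counts-store", QueryableStoreTypes.keyValueStore()));
        // Returns null if the key is absent; in a multi-instance deployment
        // the key may live on another instance (see queryMetadataForKey),
        // which is one of the shortcomings the talk alludes to.
        return store.get(key);
    }
}
```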
One challenge of widespread adoption of any technology within an organization is balancing organic growth and maintaining standards and best practice. At AO.com - one of the UK's largest online electrical retailers...
Getting data between systems, particularly at scale, is a common challenge faced by data engineers. We will give a short intro to Kafka Connect and container technologies before proceeding to a deep dive into practical applications.
Here's the challenge: we've got a Kafka topic, where services publish messages to be delivered to browser-based clients through web sockets. Sounds simple? It might, but we're faced with an increasing number of messages, as well as a growing count of web socket clients.
Event-Driven Architectures (EDA) are perceived as mythical objects that instantly transform your systems into "real-time" ones! BUT, come to think of it, aren't they already "real-time"? I mean, adding an item to the cart is pretty much instant in (most) webshops.
Kafka has become more than a simple message bus: with a full stack of tooling and new concepts, it’s easy to start deploying complex service meshes, communicating through Kafka, enabling decoupled microservices, stable performance, high scalability and reusability.
Event-driven systems come in different shapes and sizes, and the rules for payload construction are: there are no rules (but there are guidelines). Flexible payloads are both the best and worst thing about event streaming - you never quite know what to expect from each system's payloads.
FREE NOW's business is growing rapidly, as is the ride-hailing industry in general, which creates a fair number of technical challenges related to real-time data aggregation and processing.
In this talk I’ll go over how we built an opinionated Kafka client to easily enable Data Scientists to deploy and own production Kafka consumers, by focusing on writing Python functions rather than fighting pitfalls with Kafka.
Whilst Kafka can encrypt data in transit, it does not provide out-of-the-box functionality to encrypt data at rest. This places the responsibility for encrypting data placed on message queues on developers. Implementing cryptography correctly in our applications is challenging...
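To make the problem concrete, here is a minimal sketch of client-side payload encryption before producing, using AES-GCM from the JDK; key management (how the key is obtained and rotated) is deliberately out of scope and is an assumption here.

```java
// Encrypt a record value before handing it to the producer, so the broker
// only ever stores ciphertext. The consumer reverses this with the same key.
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class PayloadEncryptor {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];              // 96-bit nonce; never reuse per key
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        // Prepend the IV so the consumer can decrypt the message.
        return ByteBuffer.allocate(iv.length + ciphertext.length)
                .put(iv).put(ciphertext).array();
    }
}
```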
Micronaut is an application framework that provides dependency injection, developer productivity features, and excellent support for Apache Kafka. In this session, we'll explore the ways that Apache Kafka and Micronaut work together to enable us to build fast, efficient, event-driven applications.
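As a flavor of what that looks like, here is a small sketch of Micronaut's declarative Kafka support; the topic name "orders" and the record types are assumptions.

```java
// Micronaut generates the producer implementation for the @KafkaClient
// interface at compile time, and wires the @KafkaListener into a consumer.
import io.micronaut.configuration.kafka.annotation.KafkaClient;
import io.micronaut.configuration.kafka.annotation.KafkaKey;
import io.micronaut.configuration.kafka.annotation.KafkaListener;
import io.micronaut.configuration.kafka.annotation.Topic;

@KafkaClient
interface OrderProducer {
    @Topic("orders")
    void publish(@KafkaKey String orderId, String payload);
}

@KafkaListener(groupId = "order-service")
class OrderConsumer {
    @Topic("orders")
    void receive(@KafkaKey String orderId, String payload) {
        System.out.println("Received order " + orderId);
    }
}
```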
Kubernetes became the de-facto standard for running cloud-native applications. And many users turn to it also to run stateful applications such as Apache Kafka. You can use different tools to deploy Kafka on Kubernetes - write your own YAML files, use Helm Charts, or ...
This is a talk about debugging Stream–Table joins – based on my first-hand experience of stumbling into various pitfalls.
AsyncAPI is an open source initiative that makes working with event-driven architectures as easy as working with REST APIs.
MirrorMaker 2, released recently as part of Kafka 2.4.0, allows you to mirror multiple clusters and create many replication topologies. Learn all about this awesome new tool and how to reliably and easily mirror clusters.
ksqlDB, the event streaming database, is becoming one of the most popular ways to work with Apache Kafka®. Every day, there are many questions about the project, but here’s a question with an answer that we are always trying to improve: How does ksqlDB work?
Is your data pipeline under development and you simply want to iterate quickly? Immutability is one of Kafka's key and most desirable features. However, when mistakes happen and you are paged at night, you sometimes wish there was an “easy button” to change the log.
A journey into the building of a modern distributed real-time Wi-Fi spying system with the sole intention to have fun and play around.
As we modernize and scale, the demands of hybrid cloud, multiple domains, polyglot computing and Data Mesh require us to also modernize our approach to security.
As an AWS shop, Zillow engineering teams have been using various messaging and streaming services for years. As Zillow 2.0 progressed, new requirements and pain points made us rethink our streaming stack.
Consuming messages in parallel is what Apache Kafka® is all about, so you may well wonder, why would we want anything else? It turns out that, in practice, there are a number of situations where Kafka’s partition-level parallelism gets in the way of optimal design.
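One illustrative workaround (a hypothetical sketch, not the talk's library) is to fan records out to a thread pool while keeping per-key ordering by routing each key to a fixed worker, so parallelism is no longer capped by the partition count.

```java
// Offsets are committed only after every record from the poll has finished,
// so a crash can cause re-delivery but never silent loss.
import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.*;
import java.util.concurrent.*;

public class KeyOrderedWorkerPool {
    private final ExecutorService[] workers = new ExecutorService[8];

    public KeyOrderedWorkerPool() {
        for (int i = 0; i < workers.length; i++)
            workers[i] = Executors.newSingleThreadExecutor();
    }

    public void run(KafkaConsumer<String, String> consumer) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            List<Future<?>> inFlight = new ArrayList<>();
            for (ConsumerRecord<String, String> rec : records) {
                // Same key -> same single-threaded worker -> per-key ordering.
                int worker = Math.floorMod(Objects.hashCode(rec.key()), workers.length);
                inFlight.add(workers[worker].submit(() -> process(rec)));
            }
            for (Future<?> f : inFlight) {
                try { f.get(); } catch (Exception e) { /* real code needs retry/DLQ */ }
            }
            consumer.commitSync();  // safe: all records in this poll are done
        }
    }

    private void process(ConsumerRecord<String, String> rec) { /* business logic */ }
}
```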
A fast and efficient integration of end device data into data processing systems is becoming increasingly important in the Internet of Things. Factors such as secure and reliable data transmission, real-time data processing and the analysis of huge amounts of data afterwards play a major role.
Transaction Banking from Goldman Sachs is a high-volume, latency-sensitive digital banking platform offering. We have chosen an event-driven architecture to build highly decoupled and independent microservices in a cloud-native manner, designed to meet the objectives of Security...
Apache Kafka is the de facto standard for real-time event streaming, but what do you do if you want to perform user-facing, ad-hoc, real-time analytics too? That's a hard problem.
With Apache Kafka, it's typical to place different events in their own topic. But different event types can be related. Consider customer interactions with an online retailer. The customer searches through the site and clicks on various items before deciding on a final purchase.
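A minimal sketch of the alternative the talk weighs: keying every interaction event by customer ID so searches, clicks, and purchases for one customer land in the same partition, in order. The topic name "customer-interactions" and the JSON shapes are assumptions.

```java
// Three related event types share one topic; the shared key preserves
// per-customer ordering across all of them.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class InteractionProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String customerId = "c-42";
            producer.send(new ProducerRecord<>("customer-interactions", customerId,
                "{\"type\":\"search\",\"query\":\"headphones\"}"));
            producer.send(new ProducerRecord<>("customer-interactions", customerId,
                "{\"type\":\"click\",\"item\":\"sku-123\"}"));
            producer.send(new ProducerRecord<>("customer-interactions", customerId,
                "{\"type\":\"purchase\",\"item\":\"sku-123\"}"));
        }
    }
}
```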
You cannot operate what you cannot measure. In this talk, I am going to present the built-in metrics framework of Kafka Streams that supports monitoring Kafka Streams applications.
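As a taste of that framework, here is a minimal sketch of reading the built-in metrics of a running Kafka Streams application programmatically (the same values are also exposed via JMX).

```java
// Dump every metric the Streams instance currently reports.
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;
import java.util.Map;

public class MetricsDump {
    public static void dump(KafkaStreams streams) {
        for (Map.Entry<MetricName, ? extends Metric> e : streams.metrics().entrySet()) {
            MetricName name = e.getKey();
            System.out.printf("%s / %s = %s%n",
                name.group(), name.name(), e.getValue().metricValue());
        }
    }
}
```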
Building systems around an event-driven architecture is a powerful pattern for creating awesome data intensive applications. Apache Kafka simplifies scalability and provides an event-driven backbone for service architectures.
We wanted to embed a Kafka producer/consumer in C++ and decided to use "librdkafka", a robust C/C++ library that is open source, well-maintained, and widely used.
Data such as the location of partitions and the configuration of topics are stored outside of Kafka itself, in a separate ZooKeeper cluster. In 2019, we outlined a plan to break this dependency and bring metadata management into Kafka itself through a dynamic service that runs inside the Kafka...
In our payments platform at Goldman Sachs Transaction Banking, Apache Kafka plays a critical role as the messaging bus in our micro-services architecture. Being a part of the financial service industry we need to ensure high-availability of our platform and quick response time during failures.
When all your stores are closed, e-commerce becomes your biggest store, and the most challenging one.
Confluent Cloud runs a modified version of Apache Kafka - redesigned to be cloud-native and deliver a serverless user experience.
Cut application delivery time by reusing Kafka data structures between projects! Expecting boundaries and data definitions to remain consistent between source and consuming projects can be a constant source of surprise - a Kafka spiderweb.
SSE can be used in apps such as live stock tickers that need one-way data communication; it also replaces long polling by maintaining a single connection and keeping a continuous event stream flowing through it.
Representations of data, e.g., describing news, persons or places, differ. Therefore, we need to identify duplicates, for example, if we want to stream deduplicated news from different sources into a sentiment classifier.
Describing the convenience of building an event-driven application using stream processing and leveraging the power of KSQL. Events model our lives and actions, be they machine- or human-generated.
Using a publish-subscribe messaging system like Apache Kafka is a great way to minimise coupling between your applications. The stream history that Kafka provides allows consumers to come and go, without the producers ever being aware.
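A minimal sketch of that decoupling: because Kafka retains the stream history, a brand-new consumer group can start from the earliest offset and catch up without the producers knowing it exists. The topic and group names below are assumptions.

```java
// A late-joining consumer that replays the retained history of a topic.
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class LateJoiner {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "new-team-service");   // a group no one has seen before
        props.put("auto.offset.reset", "earliest");  // replay retained history
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> rec : records)
                    System.out.println(rec.offset() + ": " + rec.value());
            }
        }
    }
}
```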
Joins in Kafka Streams and ksqlDB are a killer-feature for data processing and basic join semantics are well understood. However, in a streaming world records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of temporal join semantics.
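A minimal sketch of a windowed stream-stream join in Kafka Streams; the topic names and the five-minute window are assumptions. The join fires only for pairs of records whose timestamps fall within the window of each other, which is exactly the temporal condition the talk explores.

```java
// Join clicks with views that occurred within five minutes of each other.
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import java.time.Duration;

public class TemporalJoinTopology {
    public static void build(StreamsBuilder builder) {
        KStream<String, String> clicks = builder.stream("clicks");
        KStream<String, String> views = builder.stream("views");
        KStream<String, String> joined = clicks.join(
            views,
            (click, view) -> click + "/" + view,     // combine matching records
            JoinWindows.of(Duration.ofMinutes(5)));  // the temporal condition
        joined.to("clicks-with-views");
    }
}
```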
Until recently, the Messaging team at Twitter had been running an in-house-built Pub/Sub system, namely EventBus (built on top of Apache DistributedLog and Apache BookKeeper, and similar in architecture to Apache Pulsar), to cater to our pub/sub needs.
The data team at Cloudflare uses Kafka to process tens of petabytes a day. All this data is moved using the two foundational Kafka API calls: Produce (API key 0) and Fetch (API key 1).
Rust is a fast, memory-efficient language with a rich type system and a set of tools to make you productive.
Newton Investment Management, a traditional investment management house, used to have its IT systems communicate with each other by connecting to a centralised database.
Just as the Apache Kafka Brokers provide JMX metrics to monitor your cluster's health, Kafka Streams provides a rich set of metrics for monitoring your application's health and performance.
Some people see their cars just as a means to get them from point A to point B without breaking down halfway, but most of us want it also to be comfortable, performant, easy to drive, and of course - to look good. We can think of Kafka Connect connectors in a similar way.
Studying the "how" of Kafka makes you better at using Kafka, but studying its "whys" makes you better at so much more.
If you have already worked on various Kafka Streams applications before, then you have probably found yourself in the situation of rewriting the same piece of code again and again.
Confluent Cloud makes DevOps engineers' lives a lot easier. Yet moving 1,500 microservices, 10K topics, and 100K partitions to a multi-cluster Confluent Cloud can be a challenge.
Our backend system (ERP, CRM, Billing) is completely cloud-based, built on asynchronous microservices and Kafka. No databases at all.
I'd like to talk about how we manage our Confluent Cloud Kafka clusters.
An introduction to the networking options available in Confluent Cloud and to self-serve provisioning of Confluent Kafka clusters.
The first question that arises when you start a new EDA project is how to govern the system. An entire ecosystem of applications, backends, events, and APIs must co-exist under the same architecture.
Applying some measure of governance over how schemas are managed helps ensure good quality data, as well as better lineage tracking and governance.
Handling failures is important, but it’s a must have when a product handles sensitive data. It also becomes exponentially harder in the world of microservices, since a failure can happen in any of the services and even in their dependencies.
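One common pattern in this space (an illustrative sketch, not necessarily the talk's approach) is routing a record that repeatedly fails processing to a dead-letter topic with the error attached as a header; the topic names are assumptions.

```java
// Park a poison record on a dead-letter topic, preserving enough context
// (error message, source topic) for later inspection or replay.
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.nio.charset.StandardCharsets;

public class DeadLetterHandler {
    private final KafkaProducer<String, String> producer;

    public DeadLetterHandler(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    public void handle(ConsumerRecord<String, String> rec, Exception error) {
        ProducerRecord<String, String> dead =
            new ProducerRecord<>("payments.dlq", rec.key(), rec.value());
        dead.headers().add("error.message",
            String.valueOf(error.getMessage()).getBytes(StandardCharsets.UTF_8));
        dead.headers().add("source.topic",
            rec.topic().getBytes(StandardCharsets.UTF_8));
        producer.send(dead);
    }
}
```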
Legacy systems are the kings of our IT architecture. They rule the evolution of the technology ecosystem that hosts them, thanks to the control over core data and key business processes they have acquired over time.
We all love to play with the shiny toys, but an event stream with no events is a sorry sight. In this session you’ll see how to create your own streaming dataset for Apache Kafka using Python and the Faker library.
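The talk uses Python and the Faker library; as an analogous sketch in Java, here is the same idea with the javafaker port (com.github.javafaker). The topic name and JSON shape are assumptions.

```java
// Produce a steady trickle of synthetic events so your shiny toys have
// something to stream.
import com.github.javafaker.Faker;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class FakeEventGenerator {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Faker faker = new Faker();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            while (true) {
                String event = String.format(
                    "{\"name\":\"%s\",\"city\":\"%s\"}",
                    faker.name().fullName(), faker.address().city());
                producer.send(new ProducerRecord<>("fake-events", event));
                Thread.sleep(500);  // one synthetic event every half second
            }
        }
    }
}
```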
The planetary boundary layer (PBL) is the lowest part of the atmosphere, ranging anywhere between 100 and 2000 m above the surface of the ground. The planetary boundary layer height (PBLH) plays a vital role in the environment-related study of air pollutants.
Being a pioneer in the interactive gaming industry, SONY PlayStation has played a vital role in implementing technological advancements, helping bring the global video gaming community together.
Schema management is a key component of every big event streaming platform. The Schema Registry solution has several advantages: better data quality, better performance, data evolvability, etc.
Trendyol was established in 2010 to provide a seamless e-commerce experience to our customers and vendors.
In this session I will talk about our experience with the Java Garbage collector and Kafka in production.
Kafka moves blobs of data from one place to another. That's its job. Kafka doesn't care what the blob is or what it looks like. This can be a boon because it's simple and it allows for a multitude of use cases.
Developers creating Apache Kafka client applications to be used with "locked down" Kafka service endpoints (for example, in a cloud deployment) have a unique problem: these developers do not have access to the Kafka server backends!
The auto industry is in the midst of a data revolution that is transforming how companies do business. Once a scarce resource, data has now become abundant and cheap.
Cloud-native data infrastructure, such as Confluent and Kubernetes, combine well together to enable teams to use declarative spec based automation (GitOps) for deployment and management.
While SQL is a simple declarative language, it can be used in very advanced ways when querying streams of data on Kafka.
The data that organizations are required to analyze in order to make informed decisions is growing at an unprecedented rate.
Hermes, Germany's largest post-independent logistics service provider for deliveries, had one main goal—make faster and smarter data-driven business decisions.
APIs have become ubiquitous as a way of exposing the capabilities of the enterprise both internally and externally.
Customers often face challenges while building and managing distributed systems like Apache Kafka, as it is a complex and resource-intensive process.
We'll look in depth at how Qlik Replicate can be used to continuously stream changes from a source database into Apache Kafka. From there, we'll explore how a purpose-built consumer can be used to provide the bridge between Apache Kafka and an analytics application such as Qlik Sense.
Ever wish you had a way to view and visualize graphically the relationships between schemas, topics and applications? In this talk we will show you how to do that and get more value from your Kafka Streaming infrastructure using an event portal.
We will show how easy it is to build, deploy and run distributed, highly available event streaming applications that analyze data from hundreds of millions of sources - petabytes per day. The architecture is intuitively appealing and blazingly fast.
In this talk we want to deep dive on the different types of joins, with a focus of their temporal aspect. Furthermore, we relate the individual join operators to the overall "time engine" of the Kafka Streams query runtime and explain its relationship to operator semantics.
Learn how Kong Konnect Enterprise can complement Kafka Event Streaming, exposing it to new and external consumers while applying specific and critical policies to control its consumption, including API key, OAuth/OIDC and others for authentication, rate limiting, caching, log processing, etc.
In this session you will learn how to setup and configure the Confluent Cloud with MongoDB Atlas.
In this session, you will learn how Kafka and SingleStore enable a modern yet simple data architecture to analyze both fast-paced incoming data and large historical datasets. In particular, you will understand why SingleStore is well suited to process data streams coming from Kafka.
In this session you will learn about patterns, best practices, and learnings compiled from running MirrorMaker2 in production at every scale.
Apicurio Registry is an end-to-end solution to store API definitions and schemas for Kafka applications. The project includes serializers, deserializers, and additional tooling.
Using Kafka to stream data into TigerGraph, a distributed graph database, is a common pattern in our customers’ data architecture. In this session, we will present the high-level architecture in three different approaches and demo the data streaming process.
Time series data is everywhere -- connected IoT devices, application monitoring & observability platforms, and more. Once this session is complete, you’ll be able to connect a Kafka queue to an InfluxDB instance as the beginning of your own time series data pipeline.
Join a candid conversation between Confluent product managers, Dan Rosanova and Addison Huddy, as they reflect on the top 10 lessons learned building Confluent Cloud.
In this session we will present a new declarative approach to unlock Kafka Streams, called KSML. After this session you will be able to write streaming applications yourself, using only a few simple basic rules and Python snippets.
We'll explore how making changes to the JVM design can eliminate the problems of garbage collection pauses and raise the throughput of applications. For cloud-based Kafka applications, this can deliver both lower latency and reduced infrastructure costs. All without changing a line of code!
Join experts from Confluent and AWS to learn how to build Apache Kafka®-based streaming applications backed by machine learning models. Adopting the recommendations will help you establish repeatable patterns for high performing event-based apps.
Scaling an Event-Driven Architecture with IBM and Confluent | Antony Amanse and Anton McConville, IBM
Azure Labs: Confluent on Azure Container Services & Real-time Search with Redis | Alicia Moniz, Confluent and Ramya Orunganti, Microsoft