
Current 2022

View Current 2022 Sessions and Presentations

Keynotes

Apache Kafka: Past, Present, and Future

  • Jun Rao, Confluent
  • Pritha Mehra, The United States Postal Service
  • Chunyan Wang, Pinterest

In this keynote, Jun Rao will focus on the community and ecosystem that power Kafka and the current state of the project, and will recognize recent contributions. We’ll hear how devs and organizations are using Kafka in their businesses and dive deep into what’s coming.

Keynote: Reinventing Data Pipelines for the Streaming Era

  • Chad Verbowski, Confluent
  • Erica Schultz, Confluent
  • Greg DeMichillie, Confluent
  • Andrew Hartnett, New Relic

Join Confluent executives in this keynote to learn more about the fundamental principles to reinvent data pipelines, so you can rapidly access high-quality, ready-to-use data for your real-time use cases. Hear about the launch of Stream Designer, an innovation in Confluent Cloud.

The Beginning of the Streaming Era

  • Jay Kreps, Confluent
  • Gian Merlino, Imply
  • Anush Kumar, Expedia Group

In this keynote, CEO & Co-founder of Confluent, Jay Kreps, joined by fellow industry leaders, will dive into the emergence of data streaming as a full category that, while still having Kafka at its core, has expanded into a broad and growing ecosystem of data movement and real-time technologies.

Breakout Sessions

Better Connecting Kafka and Kubernetes

  • Stefan Sprenger, DataCater
  • Hakan Lofcali, DataCater

This talk proposes a novel, cloud-native deployment model for Kafka Connect, which uses the different concepts of Kubernetes for executing, scaling, and isolating single Kafka Connect connectors. In a nutshell, we build unique container images for each Kafka Connect connector type.

A Crash Course in Messaging API Design

  • Jack Vanlightly, Confluent

In this talk we're going to look at a variety of different messaging APIs, contrasting their features and guarantees with their "heaviness".

Improving Apache NiFi Framework Security

  • David Handermann, Cloudera

This presentation covers the implementation details involved with automatic certificate generation, password-based key derivation, JSON Web Token signing, repository encryption, and sensitive property management using external services.

An Analytics Engineer's Guide to Streaming

  • Amy Chen, dbt Labs

In this talk, we will explore what streaming in a batch-based analytics world should look like. How does that change your thoughts about implementing testing and performance optimization in your data pipelines? Do you still need dbt?

Spark Structured Streaming and Apache Kafka

  • Emma Liu, Databricks
  • Nitin Saksena, Albertsons Companies
  • Ram Dhakne, Confluent

In this talk, you will learn:

  • The built-in streaming capabilities of a lakehouse
  • Best practices for integrating Kafka with Spark Structured Streaming (a brief illustrative sketch follows this list)
  • How Albertsons architected their data platform for real-time data processing and real-time analytics
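
As a rough illustration of the Kafka-to-Spark wiring covered in this session, the following Java sketch reads a Kafka topic with Spark Structured Streaming and prints the decoded records to the console. The broker address and the topic name "orders" are assumptions made for the example; it does not reflect Albertsons' actual platform.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToSparkSketch {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-structured-streaming-sketch")
                .getOrCreate();

        // Read the Kafka topic as an unbounded streaming DataFrame.
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
                .option("subscribe", "orders")                       // assumed topic
                .load();

        // Kafka records arrive as binary key/value columns; cast them to strings.
        Dataset<Row> decoded = events.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

        // Write to the console sink just to show the wiring end to end.
        StreamingQuery query = decoded.writeStream()
                .format("console")
                .outputMode("append")
                .start();
        query.awaitTermination();
    }
}
```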

Azure Event Hubs - Behind the Scenes

  • Kasun Indrasiri, Microsoft

This session is a look behind the curtain where we dive deep into the architecture of Event Hubs and look at the Event Hubs cluster model, resource isolation, and storage strategies and also review some performance figures.

Bootiful Kafka: Get the Message!

  • Josh Long, VMWare

Spring Boot and Apache Kafka are leaders in their respective fields and it's no surprise that they work well together. Join me, Spring Developer Advocate Josh Long and we'll look at how to use Spring Boot and Apache Kafka to build better, scalable systems and services.

Breathe In, Breathe Out: Configuring Kafka Connect the Right Way

  • Francesco Tisiot, Aiven

We'll talk about streaming data into topics, the data formats to use, and what to look out for when Kafka Connect is plugging data from another platform into your setup. Since we don't live in a perfect world, we'll also cover configurations such as error tolerance and dead letter queues.
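
To make the error-tolerance and dead-letter-queue settings mentioned above concrete, here is a hedged sketch of a sink connector configuration built as a Java map, the kind of payload you might serialize to JSON and submit to the Kafka Connect REST API. The connector class and topic names are placeholders; only the errors.* keys are standard Kafka Connect options.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConnectorErrorHandlingSketch {
    // Sketch of a sink connector config; serialize to JSON and POST to the Connect REST API.
    public static Map<String, String> sinkConnectorConfig() {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.class", "com.example.SomeSinkConnector"); // placeholder connector
        config.put("topics", "orders");                                 // placeholder topic

        // Keep the connector running when individual records fail to convert or transform.
        config.put("errors.tolerance", "all");
        // Route failed records to a dead letter queue topic instead of dropping them.
        config.put("errors.deadletterqueue.topic.name", "orders-dlq");
        // Attach failure context (original topic/partition, error message) as record headers.
        config.put("errors.deadletterqueue.context.headers.enable", "true");
        // Also log failures so operators can spot systematic problems.
        config.put("errors.log.enable", "true");
        return config;
    }
}
```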

Get Ready! Field Notes on Converting Daily Batch Jobs to a Real-Time Architecture

  • Valerie Burchby, Netflix
  • Xinran Waibel, Netflix

In this session, we will demystify operational complexity of event streaming in the real data engineering world and share best practices learned from developing and maintaining web-scale data systems at Netflix.

Building a Data-Driven Culture and the AI Revolution

  • Gregory Little, Department of Defense

In this session, Greg will discuss what it will take to guide the evolution of technology and culture in parallel: leadership, technology that enables rapid scale and a complete & reliable data flow, and a data driven culture.

Building a Data Streaming Center of Excellence

  • Steve Gonzalez, Confluent
  • Derek Kane, Confluent

This talk explores a solution to overcome common roadblocks and delays to realizing value at your organization - building a Data Streaming Center of Excellence (CoE). We will discuss the keys to success including workstreams and services required of a CoE, repeatable standards and guidance and more.

Building an Interactive Query Service with Kafka Streams

  • Bill Bejeck, Confluent

In this talk, I'll discuss and demonstrate what's needed to build an RPC mechanism between Kafka Streams instances, including:

  • The background of Interactive Queries
  • Using Spring Boot to expose your Interactive Query Service
  • How to route queries between app instances (a minimal routing sketch follows below).
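
As a minimal sketch of the routing idea above (not the talk's actual implementation), the following Java snippet asks Kafka Streams which instance owns a key, serves the lookup from the local state store when possible, and otherwise points at the owning host. The store name, key and value types, and the surrounding HTTP layer are assumptions.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyQueryMetadata;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.HostInfo;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class InteractiveQuerySketch {
    private final KafkaStreams streams;
    private final HostInfo thisInstance; // taken from the application.server config

    public InteractiveQuerySketch(KafkaStreams streams, HostInfo thisInstance) {
        this.streams = streams;
        this.thisInstance = thisInstance;
    }

    // Serve the key locally when this instance owns it, otherwise report the owning host.
    public String lookup(String storeName, String key) {
        KeyQueryMetadata metadata =
                streams.queryMetadataForKey(storeName, key, Serdes.String().serializer());

        if (thisInstance.equals(metadata.activeHost())) {
            ReadOnlyKeyValueStore<String, Long> store = streams.store(
                    StoreQueryParameters.fromNameAndType(storeName, QueryableStoreTypes.keyValueStore()));
            Long count = store.get(key);
            return count == null ? "not found" : count.toString();
        }
        // In a real service this branch would make an HTTP call to the owning instance.
        return "redirect to " + metadata.activeHost().host() + ":" + metadata.activeHost().port();
    }
}
```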

Building Real-Time Serverless Data Applications

  • Joseph Morais, Confluent
  • Adam Wagner, Amazon Web Services

Join this session to see firsthand how developers are pairing Confluent's cloud native, serverless Apache Kafka offering with AWS's serverless services to build data apps and platforms that scale.

CDC Stream Processing with Apache Flink

  • Timo Walther, Immerok

In this talk, we highlight what it means for Apache Flink to be a general data processor that acts as a data integration hub. Looking under the hood, we demonstrate Flink's SQL engine as a changelog processor that ships with an ecosystem tailored to processing CDC data and maintaining materialized views.

Challenges, Objections, and the Future of Streaming

  • Eric Sammer, Decodable

This talk explores the current state of streaming, the most common objections and the reasons behind them, the massive technical and financial drag this has created, and what needs to change before streaming becomes the default way we process continuous data.

Chaos Engineering and How to Manage Data Stages

  • Adi Polak, Treeverse

A complex data flow is a set of operations that extracts information from multiple sources and copies it into multiple data targets, using extractions, transformations, joins, filters, and sorts to refine the results.

Choosing the Right Streaming Protocol

  • Sami Ahmed, Confluent
  • Amanda Gilbert, Confluent

In this session, we will set the stage by talking about the strengths and weaknesses of each protocol, and then dive into how Kafka can be leveraged with these different protocols. We will demo different approaches you might take.

Considerations for Abstracting the Complexity of Real-Time ML Platforms

  • Zhenzhong Xu, Claypot AI

In this talk, we’ll discuss why ML platforms can benefit from a simple and "invisible" abstraction. We’ll offer some evidence on why you should consider leveraging streaming technologies even if your use cases are not real-time yet.

Running a Dashboard: Live-Coding a Kafka App

  • Kris Jenkins, Confluent

We'll start with an empty directory and by the end, you'll have all the foundational pieces of a dashboard that could serve KPIs to everyone in your organisation, or just form the basis of your next lunchtime hacking session.

Data Governance as a Service

  • Vanessa Burckard, Social Security Administration

Learn how our approach of Data Governance as a Service to our customers will help us get ahead of the curve, helping streamline Kafka adoption for new use cases and build a reliable Enterprise Data Mesh as we go.

A Deep Dive into Kafka Tiered Storage

  • Satish Duggana, Uber

This talk dives into the internals of tiered storage and how we achieve those semantics, covering scenarios such as newly bootstrapped brokers, brokers experiencing hard failures, and other out-of-sync brokers becoming leaders.

Designing Apache Hudi for Incremental Processing

  • Vinoth Chandar, Apache Software Foundation
  • Ethan Guo, Onehouse

In this session, we first introduce Apache Hudi and the key technology gaps it fills in the modern data architecture. Bridging traditional data lakes and warehouses, Hudi helps realize the Lakehouse vision by bringing transactions and optimized table metadata to data lakes.

Growing from the Past: Optimizing Apache Druid Performance

  • Neil Buesing, Kinetic Edge

Let’s start with how to run Apache Druid locally in your container-based development environment. While streaming real-time events from Kafka into Druid, an S3-compliant store captures messages via Kafka Connect for historical processing.

Event-Driven Infrastructure as Software

  • Lee Briggs, Pulumi

In this talk, we'll take a high-level look at how infrastructure management has evolved, examine some insights from both sides of the DevOps divide and look at how your organisation could look if you want to create an event-driven infrastructure that was also managed like software.

Event Streaming in Academia

  • John DesJardins, Hazelcast

The talk will cover the systematic review workflow and obtained results from the academic literature. It will demonstrate best practices of event streaming and real-time applications in academia and research communities using Google Scholar for scholarly literature search.

Schemas That Evolve Without Schemas

  • Andreas Evers, KOR Financial

Upcaster chains allow you to read an old version of a message and bring it to what your logic needs today. The upcasters in the chain describe how to jump from one version to the next. They describe what your logic expects instead of covering all the possible variations that were ever published.

Extending the Apache Kafka® Replication Protocol Across Clusters

  • Sanjana Kaundinya, Confluent

In this talk, we will go over how you can use the existing replication protocol across clusters. You will learn how to use Cluster Linking to run a multi-region data streaming deployment without the burden and operational overhead of running yet another data system.

Stoking the Fire: Scaling Kafka to Millions of Producers

  • Ryanne Dolan, LinkedIn

This talk discusses a few real-world applications where high fan-in becomes a problem, and presents a few strategies for dealing with it.

From Monolith to Microservices - A Journey with Confluent

  • Gayathri Veale, Indeed

If you’re in discussions surrounding engineering platforms at your organization, then this talk is for you. If you are a data-driven engineering organization with solid leadership and sound decisions behind it, join us for this talk and let’s have a discussion.

Getting More Out of Your Data

  • Kal Yella, Microsoft
  • Luciano Moreira, Microsoft
  • Jacob Bogie, Confluent

Join Microsoft’s Kal Yella, Luciano Moreira, and Confluent’s Jacob Bogie to learn how you can connect multi-cloud and hybrid data to Azure cloud, reducing the complexity and cost associated with building real-time applications and analytics in the cloud.

Getting Started with Spark Structured Streaming

  • Dustin Vannoy, Dustin Vannoy Consulting

This session shares techniques for data engineers who are new to building streaming pipelines with Spark Structured Streaming. It covers how to implement real-time stream processes with Apache Spark and Apache Kafka.

GitOps for Event-Driven Architectures -- Kube Style!

  • Duncan Doyle, Red Hat

In this session, we will show how KCP can be used to transform the way you deploy, manage and maintain your event streaming application architecture, topology and deployments.

Going Multiplayer with Kafka

  • Ben Gamble, Aiven

Today we’ll walk through building multi-user and multiplayer spaces for games, collaboration, and for creation, leveraging Apache Kafka® for state management, and stream processing to handle conflicts and atomic edits.

High-Performance Multi-Resource Transactions

  • Kallol Duttagupta, Morgan Stanley
  • Arun Maroli, Morgan Stanley

In this talk we will describe how we addressed each one of these challenges to deliver a modernized, real time trade settlement solution giving attendees the information they need to tackle event driven architecture in the financial data space.

How Kafka Powers a Popular Vector Database System

  • Charles Xie, Zilliz
  • Frank Liu, Zilliz

We will walk through the challenges of unified streaming and batching in vector data processing, as well as the design choices and the Kafka-based data architecture.

How Netflix Manages Its $18 Billion Content Spend

  • Brian Orth, Netflix
  • David Johnson, Netflix

As a business, how does Netflix ensure that our forecasted spend is accurate? How do we enable systems and business processes to be able to move in a highly aligned, loosely coupled way that is so critical to the Netflix Culture?

How to Design Kafka Architectures That Are Resilient to Cloud Outages

  • Julie Wiederhold, Confluent

In this talk, we’ll discuss these in-depth, along with questions you should ask yourself to guide you to the architecture that solves your business needs.

HTTP/2 Streaming APIs for Full-Stack Real-Time Applications

  • Chris Sachs, Swim.inc

We’ll demonstrate real-time maps that dynamically stream the live state of thousands of real-world entities, while only streaming what’s actually visible on screen at any given time. And we’ll close with a whirlwind tour of UX design patterns that showcase how streaming APIs can create live windows.

If Streaming Is the Answer, Why Are You Still Doing Batch?

  • Adi Polak, Treeverse
  • Tyler Akidau, Snowflake
  • Amy Chen, dbt Labs
  • Eric Sammer, Decodable

This panel brings together industry experts with decades of experience building and implementing data systems—both batch and streaming. In a pragmatic look at the landscape, they'll discuss the state of streaming adoption today and whether streaming will ever fully replace batch.

Implementing End-to-End Tracing

  • Roman Kolesnev, Confluent
  • Antony Stubbs, Confluent

This talk will walk through how to use and extend OpenTelemetry Java agent auto instrumentation to achieve full end-to-end traceability in Kafka event streaming architectures involving multi-cluster deployments, the Connect platform, stateful KStream applications and ksqlDB workloads.

Improving the Reliability of Market Data Subscription Feeds

  • Ruchir Vani, Nasdaq

In this talk we will discuss those challenges and introduce the Nasdaq Cloud Data Service SDK, an Open Source library for Kafka Consumers that tackles these issues and allows for uniform resilience, performance and operations among varied client configurations.

Introducing KRaft: Kafka Without ZooKeeper

  • Colin McCabe, Confluent

Apache Kafka without Zookeeper is now production ready! This talk is about how you can run without ZooKeeper, and why you should.

Introduction to Apache Pinot

  • Tim Berglund, StarTree

In this talk, you'll learn how Pinot is put together and why it performs the way it does. You'll leave knowing its architecture, how to query it, and why it's a critical infrastructure component in the modern data stack, particularly in combination with architecture based on Kafka.

Kafka Client-Broker Interactions – What You Don't See

  • Tom Bentley, Red Hat

Following this talk you’ll know how the Kafka client protocols work in detail and be able to tell your leaders from coordinators! The next time you have a problem you will not only be able to debug it more easily but also understand how to best utilize the Kafka protocol for your applications.

Keep Your Caches Fresh with Debezium!

  • Gunnar Morling, Red Hat

Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores using change data capture.

Taking Action in Real Time

  • Nadine Farah, Rockset

In this tech talk, we’ll cover these aforementioned considerations in detail. We’ll show you how to build a SQL-based, real-time recommendation engine and customer 360 data application using Kafka, Rockset, and Retool.

Key Metrics for Uncovering the Root Causes of Kafka Performance Anomalies

  • Daniel Kim, New Relic
  • Antón Rodríguez, New Relic

In this talk, we will take a close look at Kafka’s architecture as well as the key infrastructure, JVM, and system metrics you should monitor for each of its components. Then, we will walk through how to diagnose common Kafka performance anomalies through observing patterns in the metrics.

Knock Knock! Who's There?

  • Justin Chen, Shopify
  • Dhruv Jauhar, Shopify

Previously at Shopify, a single SSL certificate was used by nearly all clients to connect to our Kafka clusters. As Kafka distinguishes users based on their certificate’s subject, all clients were masked as the same user, and thus we were unable to identify who was connecting.

Counting Koalas with Kafka

  • Simon Aubury, ThoughtWorks

This project is a demonstration of using a Raspberry Pi and camera, Apache Kafka, Kafka Connect to identify and classify animals. Stream transformation performed using ksqlDB processes the individual animal observations to generate dashboards to understand population trends over time.

Monitoring Conditions at the Conference

  • Timothy Spann, StreamNative
  • David Kjerrumgaard, StreamNative

Let's bring this to the different spots around the conference including lunch tables, vendor booths, hotel rooms, and more. I need to know about these readings now, not when I get back home from the conference.

Many Sources, Many Sinks, One Stream

  • Joel Eaton, Red Hat

In this session we’ll introduce the concept of the Canonical Stream, an ordered, declarative event stream of information about a thing that exists in the real world, with its own context and governance. The Canon is technology agnostic, and data context agnostic.

Mitigating a Million Security Threats with Kafka and Spark

  • Arun Janarthnam, Citrix

In this session, we will talk about how, in the last 6 months, 7M risk indicators were triggered and 1M threat mitigating actions were taken, and the integral role Kafka played in achieving it. We would also like to share some interesting ways Kafka is used at Citrix.

Modern Data Flow: A Better Way to Build Data Pipelines

  • Andrew Sellers, Confluent

In this session we'll review the Modern Data Flow principles, and discuss them in the context of trends in the data landscape and modern software engineering practices.

Navigating the Data Landscape

  • Siddharth Desai, Google Cloud
  • Elena Cuevas, Confluent

In this session, learn how organizations can unlock data value using best-in-class, cloud native products on Google Cloud and its partners such as Confluent.

Next-Generation Data Modeling on an Open Data Platform

  • Doron Porat, Yotpo
  • Liran Yogev, Ziprecruiter

In this talk, we'll share from our journey redesigning the data lake, and how to best address organizational needs, without having to give up on high-end tooling and technology. We are taking this to the next level.

Off the Chain: Scaling Blockchain Data with Kafka

  • Jan Svoboda, Confluent
  • Alex Stuart, Confluent

This session will explain how slow data on the blockchain can be joined together with fast data in Kafka and published out to other systems. Jan and Alex (two of Confluent’s resident crypto fans) will walk through a prototype of a distributed blockchain application.

OH: That Microservice Should Have Been a SQL Query

  • Seth Wiesman, Materialize Inc

This talk will provide a hands-on look at Materialize and show how it can be used to simplify your application development.

One Year Later – Lessons Learned and Plans for the Future

  • Robert Ezekiel, Booz Allen Hamilton

To improve the speed of benefits and services delivered at the Veterans Affairs (VA), we implemented Kafka last year with a few products in production. In our talk, we will walk through some of the challenges and lessons learned from adopting an event-driven architecture.

Optimizing for Lower Latency and Higher Throughput

  • Artem Livshits, Confluent

In this talk I'll cover a simple, but effective algorithm for auto-tuning effective batch size for low latency and high throughput, adaptive partitioning logic to direct more data to faster brokers, and go through benchmark results that illustrate effectiveness of the new Sticky Partitioner.
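
The auto-tuning and adaptive partitioning described here live inside the producer itself; for reference, this minimal Java sketch shows the standard batching knobs (batch.size and linger.ms) that such tuning interacts with. The broker address, topic, and chosen values are illustrative only.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerBatchingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // The two knobs batching works against: an upper bound on batch bytes,
        // and how long the producer is willing to wait to fill a batch.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 5);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keyless records are spread by the default (sticky) partitioner,
            // which fills a batch for one partition before switching to another.
            producer.send(new ProducerRecord<>("events", "hello")); // assumed topic
        }
    }
}
```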

Practical Pipelines: A Plant Soil Alert System Using ksqlDB

  • Danica Fine, Confluent

In this session, I’ll talk about how I ingest the data, followed by a look at the tools, including ksqlDB and Kafka Connect, that will help transform the raw data into useful information.

Processing Kafka Data in Real Time with ksqlDB

  • Michael Drogalis, Confluent

In this talk, we’ll step through the basics of stream processing through ksqlDB, a Kafka-native, SQL-based stream processor. You’ll learn about its core abstractions, how it works, and how you can use it to build modern data pipelines.

Add More Topics

  • Mitch Gitman, T-Mobile

In this talk, I'll explain what we call inbound and outbound Kafka topics and use those concepts as the launching pad to discuss:

  • The importance of separating data capture from data processing.
  • The power of Kafka as a circuit breaker.

Real-Time Inter-Agency Data Sharing with Kafka

  • Rob Brown, US Citizenship and Immigration Services

US Government agencies are required to share large volumes of data to enable them to execute on their critical missions. Sharing data across agencies is required for implementing US immigration and naturalization processes, issuing passports and Visas.

Real-Time Processing of Spatial Data with Kafka Streams

  • Ian Feeney, Confluent
  • Roman Kolesnev, Confluent

In this talk, we will first set the scene with a geospatial 101. Then, using a simplified taxi hailing use case, we will look at two approaches for processing spatial data with Kafka Streams.

Reinventing the Customer Experience with Confluent

  • Phani Bhattiprolu, Slower
  • Ram Dhakne, Confluent

In this session we will showcase how Confluent and Slower partner together to help customers overcome challenges and realize the true value of Confluent Cloud.

Rethinking State Management in Cloud-Native Streaming Systems

  • Yingjun Wu, RisingWave Labs

Stream processing is becoming increasingly essential for extracting business value from data in real-time. To achieve strict user-defined SLAs under constantly changing workloads, modern streaming systems have started taking advantage of the cloud for scalable and resilient resources.

Running Production CDC Ingestion Pipelines at Scale at Robinhood

  • Balaji Varadarajan, Robinhood
  • Pritam K Dey, Robinhood

In this talk, we will describe the evolution of change data capture based ingestion in Robinhood not only in terms of the scale of data stored and queries made, but also the use cases that it supports. We will go in-depth into the CDC architecture built around our Kafka ecosystem.

Running Thousands of Kafka Clusters on AWS

  • Mehari Beyene, Amazon Web Services
  • Tom Schutte, Amazon Web Services

We’ll talk about several topics including (a) monitoring Kafka health, (b) optimizing Kafka to address compute, storage and networking bottlenecks, (c) automating detection and mitigation of infrastructure failures related to compute, storage and networking and (d) continuous software patching.

Speeding Up Kubernetes Upgrades for Kafka Clusters

  • Vanessa Vuibert, Shopify

I will go over how to stretch a Kafka cluster across the old and new Kubernetes clusters without adding any extra brokers. Finally, I will discuss how the Kafka brokers in the new Kubernetes cluster get scaled up while the old one gets decommissioned.

Extending SQL to Support Streaming Data

  • Fabian Hueske, Snowflake

This talk will look at:

  • Why is this happening?
  • Who is involved?
  • How does the process work?
  • What progress has been made?
  • When can we expect to see a standard?

Streaming 101 Revisited: A Fresh Take

  • Tyler Akidau, Snowflake
  • Dan Sotolongo, Snowflake

This talk will cover the key concepts of stream processing theory as we understand them today. It is simultaneously an introductory talk as well as an advanced survey on the breadth of stream processing theory. Anyone with an interest in streaming should find something engaging within.

Streaming Data into the Lakehouse

  • Frank Munz, Databricks

This talk is for data architects who are not afraid of some code and for data engineers who love open source and cloud services.

Streaming SQL for Data Engineers: The Next Big Thing?

  • Yaroslav Tkachenko, Goldsky

In this presentation, I hope to share the discoveries I made over the years in this area, as well as working practices and patterns I’ve seen.

Streaming Time Series Data

  • Kenny Gorman, MongoDB
  • Elena Cuevas, Confluent

In this talk, Kenny Gorman and Elena Cuevas will present how Apache Kafka on Confluent Cloud can stream massive amounts of data to Time Series Collections via the MongoDB Connector for Apache Kafka.

Team Collaboration on Kafka Clusters

  • Maria Berinde-Tampanariu, Confluent

What are the options offered by the Kafka built-in Authorizer, how can the Authorizer be customized and how are integrations with external systems built in order to provide group or role-based access control?

Testing Kafka Containers with Testcontainers: A Long Journey

  • Viktor Gamov, Kong

In this session, Viktor talks about Testcontainers, a library (that was initially created for JVM, now exists in many languages) that provides lightweight, disposable instances of shared databases, clusters, and anything else that can run in a Docker container!
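
As a small, hedged illustration of the library in question, the sketch below starts a disposable Kafka broker with Testcontainers and produces one record to it; the image tag and topic name are assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.utility.DockerImageName;

public class KafkaContainerSketch {
    public static void main(String[] args) {
        // Start a disposable single-node Kafka broker in Docker for the duration of the block.
        try (KafkaContainer kafka =
                     new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.3.0"))) {
            kafka.start();

            Properties props = new Properties();
            props.put("bootstrap.servers", kafka.getBootstrapServers());
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("test-topic", "hello from a test"));
                producer.flush();
            }
        } // the container and its data disappear when the block exits
    }
}
```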

The End of Big Data

  • Benn Stancil, Mode

In this talk, I’ll share why the next wave of successful data companies will follow the same pattern. Rather than trying to change how we work, they’ll find ways to unambiguously improve it.

The Transformation of Database Changes

  • Tim Steinbach, Shopify

This talk describes our journey of ingesting multiple Kafka data streams from thousands of topics and about half a million partitions, storing them as Apache Iceberg datasets, and the issues encountered along the way.

The Next-Generation Consumer Rebalance Protocol

  • David Jacot, Confluent

This talk will unveil the next generation of the consumer rebalance protocol for Apache Kafka (KIP-848) that addresses the shortcomings of the current protocol. We will go through the evolution of the current rebalance protocol, discuss its shortcomings, and present the new rebalance protocol.

An AI That Posts Pointless Content

  • Thomas Endres, TNG Technology Consulting GmbH
  • Jonas Mayer, TNG Technology Consulting GmbH

In this talk, we will give an introduction to NLP, focussing on the concepts of STT, Text Generation and TTS. Using live demos, we will guide you through the process of scraping social media comments, training a text generation model, synthesizing millions of voices and building IoT robot heads.

Toward Client-Side Field-Level Encryption

  • Hans-Peter Grahsl, Red Hat

During this demo-driven talk, you will experience how to benefit from

  • a configurable single message transformation (SMT) that lets you perform encryption and decryption operations in Kafka Connect worker nodes without any additional code

Unbundling the Modern Streaming Stack

  • Dunith Dhanushka, Redpanda

This talk first explores the "classic streaming stack," based on the Lambda architecture, its origin, and why it didn't pick up amongst data-driven organizations. The modern streaming stack (MSS) is a lean, cloud-native, and economical alternative to classic streaming architectures.

Leveraging Point-in-Time Queries in Event-Driven Systems

  • Bobby Calderwood, Evident Systems

In this talk, we'll discuss how the oNote team implemented a point-in-time queryable Event Model repository using Kafka, Git, and CRDTs. We'll also discuss some other technologies that facilitate this pattern.

A Web-Scale Workflow Engine Using Kafka

  • Andrey Falko, Salesforce

In this talk, we introduce a workflow engine concept that only uses Kafka to persist state transitions and execution results. The system banks on Kafka’s high reliability, transactionality, and high scale to keep setup and operating costs low.

Welcome to Kafka; We're Glad You're Here!

  • Dave Klein, Confluent

I’ll take you through the basics of Kafka—the brokers, the partitions, the topics—and then on and up into the different APIs and tools available to work with it. Consider it a Kafka 101, if you will. We’ll stay at a high level, but we’ll cover a lot of ground.
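
To make the 101 a bit more concrete, here is a minimal Java consumer that subscribes to a topic and prints each record's partition, offset, and value. The broker address, consumer group, and the topic name "greetings" are assumptions for the example, not part of the talk.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class Kafka101ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "kafka-101");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // A topic is split into partitions; the consumer group decides which
            // instance reads which partition, and offsets track progress per partition.
            consumer.subscribe(List.of("greetings")); // assumed topic
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```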

What Is Slowing Down Your Kafka Pipeline?

  • Ruizhe Cheng, New Relic
  • Pete Stevenson, New Relic

In a live demo, we will introduce an eBPF-based, always-on, CPU profiler to visualize what your Kafka applications are spending time on. We will analyze how much time the Kafka broker spends on handling different requests and responding to polling.

Availability in Kafka

  • Justine Olshan, Confluent

Using Apache Kafka and Confluent Cloud as a case study, we will dig deeper into how to define good SLOs and SLAs for distributed systems. From there we will discuss ways to improve availability and the changes we made to Confluent Cloud to improve on Kafka's availability story.

When Kafka Is Your Source of Information

  • Ricardo Ferreira, Amazon Web Services

In this session, we will get into the weeds of data serialization with schemas. We will discuss the differences between formats like JSON, Avro, Thrift, and Protocol Buffers, and how your code must use each one of them to serialize data.
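
As a rough sketch of the trade-off being discussed, the snippet below contrasts two Java producer configurations: plain strings (for example, hand-rolled JSON) versus Confluent's Avro serializer backed by a schema registry. The broker and registry addresses are assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class SerializationChoicesSketch {
    // Plain strings (e.g. hand-rolled JSON): the schema lives only in application code and docs.
    public static Properties jsonAsStringConfig() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    // Avro with a schema registry: each record carries a schema reference,
    // so consumers can validate data and schemas can evolve in a controlled way.
    public static Properties avroWithRegistryConfig() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // assumed registry address
        return props;
    }
}
```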

When Streaming Needs Batch

  • Konstantin Knauf, Immerok

In this talk, I'll introduce Apache Flink's approach to unified stream and batch processing and discuss - by example - how these scenarios can already be addressed today and what might be possible in the future.

Why Your Business Shouldn't Fall Behind on Streaming

  • Becky Gandillon, Centric Consulting

During this session, you'll learn about how to communicate the value of technology decisions to non-technical co-workers or stakeholders. And we'll talk about some very specific buy-in, enablement, and adoption activities and suggestions for supporting streaming implementations.

No Reason to Hesitate: Real-Time Ingestion

  • Heng Zhang, Pinterest
  • Chen Qin, Pinterest

In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We pick ANSI SQL as the common currency to minimize the "lambda architecture" learning curve for teams adopting near-real-time data.

Wikipedia's Event Data Platform, or JSON Is Okay Too

  • Andrew Otto, Wikimedia Foundation

This session will describe how and why we built Wikimedia's Event Data Platform using Kafka, JSON and JSONSchemas, and how we make our event data available to the world.

*What* Do You Put in the Stream? Patterns and Practices for Event Design

  • Adam Bellemare, Confluent

In this talk, Adam covers the main considerations of modeling and implementing events. Data is often modeled as a Fact or a Delta, though the distinction isn't always clear.

Preparing for Surging Traffic

  • Ravindra Bhanot, Twilio

This talk elaborates on the challenges that Twilio faced when building such a monitoring platform, which can aggregate customer data and send alerts in a timely manner under SLA.

Moving from Apache Kafka to Confluent Without Downtime

  • Justin Dempsey, SAS Institute

This session details the journey for moving standalone Kafka to Kafka on K8S. During the session, scope of the journey including Total Cost of Ownership (TCO), technical architecture, and the migration itself will be discussed.

Lightning Talks

Apache Flink Adoption at Shopify

  • Kevin Lam, Shopify

In this talk, we go over the history and future of Apache Flink adoption at Shopify.

We’ll talk about how and why we went from choosing Apache Flink as the replacement for our existing streaming technologies in 2021, to a year later with a flourishing streaming community.

Balancing Data Across Apache Kafka Partitions

  • Olena Kutsenko, Aiven

In this talk we'll discuss mechanisms you can use to balance your data, such as keys, composite message keys, the role of hashing, custom partitioners, and other things you need to keep in mind when splitting data across partitions.
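
As one hedged illustration of the custom-partitioner idea mentioned above (not the speaker's code), this small Java class partitions by the first segment of a composite key such as "customerId:eventType", so that all records for the same customer land in the same partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Illustrative only: route records by the first segment of a composite key
// such as "customerId:eventType", keeping each customer's events in one partition.
public class CompositeKeyPartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (key == null) {
            return 0; // a real implementation would spread keyless records instead
        }
        String prefix = key.toString().split(":", 2)[0];
        // Same murmur2 hash the default partitioner uses, applied to the prefix only.
        return Utils.toPositive(Utils.murmur2(prefix.getBytes(StandardCharsets.UTF_8))) % numPartitions;
    }

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public void close() { }
}
```

A producer would opt into a class like this through the standard partitioner.class configuration.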

Building Streaming Graph Pipelines on Kafka with Quine

  • Ryan Wright, thatDot

In this live-coding lightning talk, we'll start from scratch and build a streaming graph data pipeline from start to finish. With our data in Kafka, Quine plugs in and requires just a graph query written in the Cypher graph query language.

Building Highly Reliable Enterprise Infrastructure

  • Grace Zhang, Citigroup

We will share how we:

  • Drive data streaming readiness by standardizing Kafka clusters across divergent payment application demands.
  • Overcome the challenges of designing and implementing Kafka enterprise infrastructure to meet business requirements.

Dead Letter Queues for Kafka Consumers at Robinhood

  • Sreeram Ramji, Robinhood
  • Wenlong Xiong, Robinhood

This talk discusses how we built libraries, templated microservices, and tooling that leverage Postgres and Kafka for safely dealing with dead letters, inspecting and querying them, and republishing them to retry Kafka topics for safe reprocessing at a later time.

Designing Feedback Loops for Event-Driven Data Sharing

  • Teresa Wang, Jet Propulsion Laboratory

In this talk, we will discuss how we overcame these challenges and delivered a fully automated and robust data exchange solution by extending Kafka Connect, leveraging ksqlDB streams/tables and aggregations, and developing custom microservices.

Designing Topic Structures for Data Resiliency and Disaster Recovery

  • Justin Lee, Confluent

In this talk, we'll discuss the actual implementation details for the clients and topics that live in multi-cluster environments, including: What naming conventions and patterns should be followed for topics in a multi-cluster architecture? How does this differ between applications?

Super Glue: Simplifying System Testing in Distributed Environments

  • Ian McDonald, Confluent

In this talk, we will go over how Ducktape solves the problem of multi-service distributed testing, what type of testing it is designed for, and how it simplifies the testing experience for complex real time systems. Get ready to get your hands dirty and learn how to write a test and a service.

Securing Advertiser Budgets in Real Time at Reddit

  • Sundeep Yedida, Reddit
  • Nagalakshmi Ramasubramanian

In this talk, we will learn how we leveraged Kafka and Druid to provide real-time aggregations of spend against both daily and lifetime budgets. This led to significant decreases in overdelivery compared to the previous batch system, and savings of $LARGE_NUMBER_OF_DOLLARS

Handling Failures in Kafka Streams

  • Walker Carlson, Confluent

In this talk, we will cover the changes to the threading model that made more dynamic error handling possible. We will also introduce the Streams handler, which unlocked options to react immediately in cases that would previously cause cascading thread death.

How PubSub Helped Build Vox Media's Data Applications

  • Movses Musaelian, Vox Media

This talk will discuss practical tips for architecting and productionizing scalable, low-latency data applications that leverage the PubSub model. Attendees will learn about common data messaging capabilities found in the PubSub model and how to leverage PubSub to optimize performance.

How the Snowflake Sink Connector Uses Snowpipe's Streaming Ingestion

  • Jay Patel, Snowflake

We’ll discuss streaming ingestion into Snowflake with Snowpipe Streaming and how we utilized it with the Snowflake Sink Connector for Kafka. We will talk about the improvements and then jump into a demo that uses Docker containers to spin up a Kafka and Kafka Connect environment to load data.

Lowering the Barrier to Stream Processing

  • Alex Morley, Babylon Health

We found that by using the "agent" concept in faust we could provide our engineers with a "Function as a Service"-like experience specifically for processing events on Kafka streams.

Monitoring an Exascale Supercomputer

  • Tim Osborne, Oak Ridge National Lab

In this talk we will discuss scaling and planning a system to meet the streaming demands of the world’s only exascale and most energy efficient supercomputer. Tune in to learn more about HPC and how streaming fits in to monitoring large-scale systems.

Online Machine Learning on Streaming Data with River and Bytewax

  • Zander Matheson, Bytewax

In this session we will look at how to leverage the Python libraries River and Bytewax to build streaming applications on Kafka that use online machine learning techniques.

Running Kafka on Pi4/ARM

  • Jeffrey Needham, Confluent

This talk provides a work-in-progress update of deploying Kafka on aarch64 Linux. Although the new Apple M1 is ARMv8 based, it has a distinct flavor, or ELF format - arm64. Since much of Kafka consists of noarch rpms, or simply, a bag-o-jars, both Linux and macOS have native implementations of Java

Successfully Detecting Fraud in a Real-Time MMORPG

  • Abbey Kwak, Kakao

By centralizing the logs generated by a live game, especially an MMORPG, detecting game and operational anomalies through more than 300 patterns with ksqlDB, and sharing the know-how gained from game operations.

Tactical Virtual Assistance (TVA)

  • Jubal Biggs, SAIC

How can the DoD manage military battlefield assets and integrate signals from a diverse and dynamic set of sensors, including static ground sensors and soldier-worn sensors, to provide predictive and operational analytics?

A Closer Look: Segments in Apache Kafka

  • Kirill Kulikov, Confluent

In this presentation, we are going to deep dive into the internals of Kafka log mechanisms. We will look in detail at the structure of the commit-log and segments, topic partitions arrangement on disk, log retention for compact and delete policies.

Validating Apache Kafka-Based Data Pipelines

  • Subhangi Agarwala, Bloomberg

In this talk, we aim to highlight the importance of integration testing, a critical verification method for stable and reliable large-scale distributed streaming applications. We will also provide a high level overview of our system, challenges faced in moving to a streaming infrastructure

When NOT to Use Apache Kafka?

  • Kai Waehner, Confluent

When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How to qualify Kafka out as it is not the right tool for the job? This session explores the DOs and DONTs.