Kafka Connect: Real-time Data Integration at Scale with Apache Kafka

On-demand recording

Kafka Summit 2016 | Systems Track

Many companies, including LinkedIn, Netflix, and Airbnb, are adopting Apache Kafka to power their data pipelines. Kafka’s ability to handle high-throughput, real-time data makes it a natural fit for the data integration problem: it acts as the common buffer for all your data and bridges the gap between streaming and batch systems.

However, building a data pipeline around Kafka today can be challenging because it requires combining a wide variety of tools to collect data from disparate data systems. One tool streams updates from your database to Kafka, another imports logs, and yet another exports to HDFS. As a result, building a data pipeline can take significant engineering effort and carries high operational overhead, because all of these tools require ongoing monitoring and maintenance. Additionally, some of the tools are simply a poor fit for the job: the fragmented nature of the data integration tool ecosystem leads to creative but misguided solutions, such as misusing stream processing frameworks for data integration.

We describe the design and implementation of Kafka Connect, Kafka’s new tool for scalable, fault-tolerant data import and export. First, we’ll discuss some existing tools in the space and why they fall short when applied to data integration at large scale. Next, we’ll explore Kafka Connect’s design and how it compares to systems with similar goals, discussing key design decisions that trade off between ease of use for connector developers, operational complexity, and reuse of existing connectors. Finally, we’ll discuss how standardizing on Kafka Connect can ultimately simplify your entire data pipeline, making ETL into your data warehouse and enabling stream processing applications as simple as adding another Kafka connector.

Speakers:

Ewen Cheslack-Postava, Engineer, Confluent


