Kafka Connect HDFS

By Confluent, Inc.

confluentinc/kafka-connect-hdfs:5.0.0
Version

5.0.0

Features

  • Single Message Transform
  • Control Center Integration
  • Kafka Connect API

The HDFS connector allows you to export data from Kafka topics to HDFS files in a variety of formats and integrates with Hive to make data immediately available for querying with HiveQL.

The connector periodically polls data from Kafka and writes it to HDFS. The data from each Kafka topic is partitioned by the configured partitioner and divided into chunks.

Each chunk of data is stored as an HDFS file whose name contains the topic, the Kafka partition, and the start and end offsets of the chunk. If no partitioner is specified in the configuration, the default partitioner, which preserves the Kafka partitioning, is used. The size of each data chunk is determined by the number of records written to HDFS, the time written to HDFS, and schema compatibility.
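The naming and chunking rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the connector's implementation. The `+` separator, the zero-padded offsets, the `/topics` base directory, and the `partition=<n>` directory are assumptions about the default layout. The function and parameter names are hypothetical, and the schema check is reduced to a single boolean.

```python
def chunk_path(topic, kafka_partition, start_offset, end_offset,
               topics_dir="/topics", extension=".avro", pad_width=10):
    """Build an HDFS path for one data chunk (assumed default layout)."""
    # Offsets are zero-padded so that file names sort in offset order.
    start = str(start_offset).zfill(pad_width)
    end = str(end_offset).zfill(pad_width)
    filename = f"{topic}+{kafka_partition}+{start}+{end}{extension}"
    # Assumed default partitioner: one directory per Kafka partition.
    return f"{topics_dir}/{topic}/partition={kafka_partition}/{filename}"


def should_rotate(records_in_chunk, max_records, elapsed_ms,
                  rotate_interval_ms, schema_changed):
    """Decide whether the current chunk should be closed (simplified)."""
    # An incompatible schema change ends the chunk immediately.
    if schema_changed:
        return True
    # Record-count limit reached.
    if records_in_chunk >= max_records:
        return True
    # Time-based limit; a non-positive interval disables it in this sketch.
    return rotate_interval_ms > 0 and elapsed_ms >= rotate_interval_ms
```

For example, `chunk_path("test_hdfs", 0, 0, 2)` yields `/topics/test_hdfs/partition=0/test_hdfs+0+0000000000+0000000002.avro` under these assumptions.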

The HDFS connector integrates with Hive. When this integration is enabled, the connector automatically creates an external, partitioned Hive table for each Kafka topic and updates the table according to the data available in HDFS.
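As a rough sketch of what enabling Hive integration might look like, the helper below adds Hive-related settings to an existing connector configuration. The property names (`hive.integration`, `hive.metastore.uris`, `hive.database`, `schema.compatibility`) are taken from the connector's documented settings as best understood here and should be checked against the documentation for your version. The helper name and the metastore URI are hypothetical.

```python
# Compatibility modes assumed to be accepted when Hive integration is on.
HIVE_COMPATIBLE_MODES = {"BACKWARD", "FORWARD", "FULL"}


def with_hive_integration(config, metastore_uri, database="default"):
    """Return a copy of `config` with Hive integration settings added."""
    updated = dict(config)  # leave the caller's dictionary untouched
    updated["hive.integration"] = "true"
    updated["hive.metastore.uris"] = metastore_uri
    updated["hive.database"] = database
    # Assumption: Hive integration needs an explicit compatibility mode.
    updated.setdefault("schema.compatibility", "BACKWARD")
    if updated["schema.compatibility"] not in HIVE_COMPATIBLE_MODES:
        raise ValueError(
            "schema.compatibility must be one of "
            + ", ".join(sorted(HIVE_COMPATIBLE_MODES))
        )
    return updated


# Hypothetical usage: the metastore address is a placeholder.
hive_config = with_hive_integration(
    {"topics": "test_hdfs"}, "thrift://localhost:9083"
)
```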

Install your connector

Use the Confluent Hub client to install this connector with:

confluent-hub install confluentinc/kafka-connect-hdfs:5.0.0

Alternatively, download the ZIP file and extract it into one of the directories listed in the Connect worker's plugin.path configuration property. This must be done on every installation where Connect will run. See the installation documentation for more detailed instructions.
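The manual installation step can be sketched as follows, assuming the worker's properties file defines plugin.path as a comma-separated list of directories. The function names and any file paths are hypothetical; this only reads the property and unpacks an archive, and it would need to be repeated on every Connect worker.

```python
import zipfile
from pathlib import Path


def parse_plugin_path(worker_properties_text):
    """Return the directories listed under plugin.path (empty if absent)."""
    for line in worker_properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines without a key/value pair
        key, value = line.split("=", 1)
        if key.strip() == "plugin.path":
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


def install_from_zip(zip_path, plugin_dir):
    """Extract a connector ZIP into one plugin directory."""
    target = Path(plugin_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Return the top-level entries now present in the plugin directory.
    return sorted(entry.name for entry in target.iterdir())
```

The worker typically needs a restart before it picks up newly extracted plugins.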

Configure an instance of your connector

Once the connector is installed, create a configuration file with the connector's settings and deploy it to a Connect worker. See the connector documentation for more detailed instructions.
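A connector configuration file is a plain key=value properties file. The sketch below renders one from a dictionary. The connector name, topic, HDFS URL, and flush size are placeholder values for a local test setup, and the property names should be verified against the documentation for your connector version.

```python
def render_properties(settings):
    """Render a dict of settings as properties-file text."""
    lines = [f"{key}={value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# Placeholder settings for a single-task sink reading one topic.
hdfs_sink_settings = {
    "name": "hdfs-sink",  # hypothetical connector name
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "tasks.max": "1",
    "topics": "test_hdfs",  # hypothetical topic
    "hdfs.url": "hdfs://localhost:9000",  # assumes a local HDFS namenode
    "flush.size": "3",  # records per file; small value for testing only
}

properties_text = render_properties(hdfs_sink_settings)
```

Writing `properties_text` to a file gives a configuration that can be passed to a standalone Connect worker; a distributed worker would instead take the same settings as JSON through its REST interface.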

The source code is located in the confluentinc/kafka-connect-hdfs repository on GitHub.

For more information, see the documentation.

Confluent supports the HDFS sink connector alongside community members as part of its Confluent Platform open source offering.