
Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | April 2016


The Apache Kafka community was crazy-busy last month. We released a technical preview of Kafka Streams and then voted on a release plan for Kafka 0.10.0. We accelerated the discussion of a few key proposals in order to make the release, rolled out two release candidates, and then decided to put the release on hold in order to get a few more changes in.

  • Kafka Streams tech preview! If you are interested in a new, lightweight, easy-to-use way to process streams of data, I highly recommend you take a look.
  • If you are interested in the theory of stream processing, check out Making Sense of Stream Processing and download the eBook while it’s still available. The book is written by Martin Kleppmann, and if you’ve been interested in Kafka and stream processing for a while, you know his work is always worth reading.
  • Wondering what will be included in the 0.10.0 release? Worried that there are critical issues left? Take a look at our release plan.
  • Pull request implementing KIP-36 was merged. KIP-36 adds rack-awareness to Kafka. Brokers can now be assigned to specific racks, and when topics and partitions are created, the replicas will be assigned to nodes based on their rack placement.
  • Pull request implementing KIP-51 was merged. KIP-51 is a very small change to the Connect REST API, allowing users to ask for a list of available connectors.
  • Pull request implementing KIP-45 was merged. KIP-45 is a small change to the new consumer API which standardizes the types of containers accepted by the various consumer API calls.
  • KIP-43, which adds support for standard SASL mechanisms in addition to Kerberos, was voted in. We will try to get this merged into Kafka in release 0.10.0.
  • There are quite a few KIPs under very active discussion:

    • KIP-4, adding an API for administrative actions such as creating new topics, requires some modifications to MetadataRequest.
    • KIP-35 adds a new protocol for getting the current version of all requests supported by a Kafka broker. This protocol improvement will make it possible to write Kafka clients that will work with brokers of different versions.
    • KIP-33 adds time-based indexes to Kafka, supporting both time-based log purging and time-based data lookup.
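
To make the rack-awareness idea from KIP-36 a bit more concrete, here is a minimal Python sketch of spreading replicas across racks. It is an illustration only, not Kafka's actual assignment algorithm: the function names, the broker-to-rack map, and the simple interleave-then-round-robin strategy are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_by_rack(broker_racks):
    """Order brokers so neighbours in the list sit on different racks where possible."""
    by_rack = defaultdict(list)
    for broker, rack in sorted(broker_racks.items()):
        by_rack[rack].append(broker)
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    ordered = []
    # Take one broker from each rack in turn; shorter racks simply run out early.
    for row in zip_longest(*columns):
        ordered.extend(b for b in row if b is not None)
    return ordered


def assign_replicas(broker_racks, num_partitions, replication_factor):
    """Hypothetical helper: round-robin partitions over the rack-interleaved broker list."""
    ordered = interleave_by_rack(broker_racks)
    if replication_factor > len(ordered):
        raise ValueError("replication factor exceeds number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        start = partition % len(ordered)
        assignment[partition] = [
            ordered[(start + i) % len(ordered)] for i in range(replication_factor)
        ]
    return assignment


# Made-up cluster: four brokers spread over two racks.
broker_racks = {0: "rack-a", 1: "rack-a", 2: "rack-b", 3: "rack-b"}
assignment = assign_replicas(broker_racks, num_partitions=4, replication_factor=2)
for partition, replicas in assignment.items():
    print(partition, replicas, [broker_racks[b] for b in replicas])
```

With this made-up cluster, every partition ends up with one replica on each rack, so losing a whole rack still leaves a copy of every partition available.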

That’s all for now! Got a newsworthy item? Let us know. If you are interested in contributing to Apache Kafka, check out the contributor guide to help you get started.

  • Gwen Shapira is a Software Engineer at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specialises in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of books including “Kafka, the Definitive Guide”, and a frequent presenter at data-related conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
