Oct 12, 2015 | Read Time: 2 min

Log Compaction | Highlights in the Kafka and Stream Processing Community | October 2015

Written By

Gwen Shapira, Engineering Manager, Confluent

The amount of work that got done by the community in the last month is truly impressive, especially considering how many conferences took place in September. Let’s take a look at the highlights:

The Apache Kafka community decided to rename the next release 0.9.0.0 (previously 0.8.3.0), without a change in scope. There are large features planned for the next release (including authentication, authorization, and the new consumer), so a minor release bump is in order.
The initial patch for Kafka Streams, a lightweight data processing library for Apache Kafka, was committed to trunk. Take a look, try it out, and contribute bug reports, suggestions, documentation, or patches.
The Apache Kafka community released version 0.8.2.2, a bug-fix release for Kafka 0.8.2 with two critical fixes for those using Snappy in the producer (which should be all of you). You know where to get it. In conjunction, Confluent released Confluent Platform 1.0.1, which upgrades the stream data platform to Kafka 0.8.2.2.
Three very important proposals that will modify and evolve message and file formats are currently being discussed in the Kafka community:
- KIP-31: Move to relative offsets in compressed message sets
- KIP-32: Add CreateTime and LogAppendTime to Kafka messages
- KIP-33: Add time-based log index
  Note how together they add the ability to look up specific messages by time, a feature the Kafka community has long requested.
Authorization patches were merged into trunk. Topic-level authorization will be a new feature in Kafka 0.9.0.0, enabling the use of Kafka in secure multi-tenant environments.
Apache Kafka documentation is now stored in trunk (in addition to the dedicated site repository). Contributors are encouraged to submit documentation pull requests and update relevant documentation and code in the same PR.
The Apache Spark community released version 1.5.0. Spark Streaming’s Kafka direct connector (which gives exactly-once processing in Spark Streaming) is no longer considered experimental.
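
To make the time-based lookup idea behind KIP-32 and KIP-33 a little more concrete, here is a toy sketch of a sparse time index. It is only an illustration under stated assumptions, not the design proposed in the KIPs: the `TimeIndex` class, its method names, and the assumption that timestamps never decrease along the log are all hypothetical.

```python
from bisect import bisect_right


class TimeIndex:
    """Toy sparse index of (timestamp_ms, offset) entries.

    Hypothetical illustration only; assumes timestamps are
    non-decreasing along the log.
    """

    def __init__(self):
        self._timestamps = []  # sorted ascending
        self._offsets = []     # offset recorded for each timestamp

    def append(self, timestamp_ms, offset):
        # Skip entries that would break the ascending order,
        # so that binary search over the index stays valid.
        if self._timestamps and timestamp_ms <= self._timestamps[-1]:
            return
        self._timestamps.append(timestamp_ms)
        self._offsets.append(offset)

    def lookup(self, target_ms):
        # Find the last indexed entry at or before target_ms and
        # return its offset as the place to start scanning from.
        i = bisect_right(self._timestamps, target_ms) - 1
        # No entry at or before the target: start from offset 0.
        return self._offsets[i] if i >= 0 else 0


index = TimeIndex()
index.append(1000, 0)
index.append(2000, 50)
index.append(3000, 120)
start_offset = index.lookup(2500)
```

A reader would then scan forward from `start_offset` to reach the first message at or after the requested time. Keeping the index sparse trades a short scan for a much smaller index.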

Conferences:

The first ever Kafka Summit was announced and the call for proposals is open! This is a fantastic opportunity to share your use case, experience, tips, and cool hacks with the Apache Kafka community. So don’t be shy, submit your ideas: https://kafka-summit.org/cfp.html
The Apache Flink community conference, Flink Forward, is happening in Berlin on Oct 12-13, with many interesting presentations.

It’s awesome to see all of the improvements coming in from the Kafka community, and I’m looking forward to seeing even more this month as we get closer to the Apache Kafka 0.9.0.0 release.

Log Compaction is a monthly digest of highlights in the Apache Kafka and stream processing community.

Gwen Shapira is a Software Engineer at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specialises in building real-time, reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of books including “Kafka: The Definitive Guide”, and a frequent presenter at data-related conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.