Log Compaction – Highlights in the Apache Kafka™ and Stream Processing Community – March 2017

Gwen Shapira.

Big news this month! First and foremost, Confluent Platform 3.2.0 with Apache Kafka™ was released! Read about the new features, check out all 200 bug fixes and performance improvements, and then download Confluent Platform 3.2.0 and try it out.

Thanks to Ismael Juma, there is already a plan for the next release of Apache Kafka, so you can check out the features planned for June. The big-ticket items are exactly-once semantics and transactions, dropping support for Java 7, and disabling unclean leader election by default.

Notable KIPs this month include:


  • KIP-107: Add purgeDataBefore() API in AdminClient – This KIP allows developers to request data purging from Kafka. This data cleanup is in addition to the usual cleanup policy which is time-based and size-based. The cleanup API is especially useful for multi-step stream processing jobs that can now remove intermediate data after it was processed by downstream jobs.
  • KIP-119: Drop Support for Scala 2.10 in Kafka 0.11 – Now that Kafka supports Scala 2.12, it is time to remove support for the older Scala version.
  • KIP-121: Add KStream peek method – A new stream DSL command. Similar to map(), but intended to produce side-effects rather than modify the events in the stream. This is useful for debugging and diagnostics: peek() can be used to update a monitoring metric or to print the current record, similar to Java 8’s Stream#peek() method.
  • KIP-129: Streams Exactly-Once Semantics – Now that adding exactly-once semantics and transactions to Kafka is in progress, it is time to add exactly-once processing semantics to Kafka’s Streams API.
  • KIP-122: Add Reset Consumer Group Offsets tooling – Ever had a consumer group fail on a bad record and wished you could just tell the consumer group to skip ahead a bit? So did we. Now we are discussing the best CLI to do it.
  • KIP-124: Request rate quotas – Right now Kafka allows limiting the bandwidth that a client may use to produce and consume, but there is still no control over how much CPU a client uses. This functionality will be very useful for anyone running a multi-tenant cluster, and the discussion on how best to model clients’ CPU consumption and let administrators control it via configuration is fascinating.
  • KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback – We want to deprecate the old 0.8.x consumer in favor of the new consumer, but some teams have trouble migrating because there is no support for a rolling upgrade between the two consumer types. This KIP proposes a solution to this problem, allowing us to remove the old consumer.
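
To make the peek() idea from KIP-121 concrete, here is a minimal sketch using Java 8’s Stream#peek(), the method the KIP compares itself to. It is an analogy only and does not use the Kafka Streams API; the class name PeekSketch, the process() helper, and the counter standing in for a monitoring metric are all hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

// Hypothetical illustration: java.util.stream, not Kafka Streams.
class PeekSketch {

    // Passes each record through peek() unchanged, incrementing the counter
    // as a side effect, and then transforms the records with map().
    static List<String> process(List<String> records, AtomicLong recordsSeen) {
        return records.stream()
                .peek(r -> recordsSeen.incrementAndGet()) // side effect only, record untouched
                .map(String::toUpperCase)                 // the actual transformation
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicLong recordsSeen = new AtomicLong(); // stand-in for a monitoring metric
        List<String> out = process(Arrays.asList("a", "b", "c"), recordsSeen);
        System.out.println(out + " seen=" + recordsSeen.get());
    }
}
```

The point of the sketch is the division of labor: peek() observes each record without changing it, while map() is the step that produces new values.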

Notable Blog posts:

Our Confluent Community Slack Channel is thriving – with 500 members and lively discussions on Apache Kafka and all ecosystem projects. The community is still new, but next month we’ll share highlights from the community discussions. You are invited to join.

And most important, we announced the agenda for Kafka Summit NYC and a Kafka Summit hackathon. We look forward to seeing all of you there! Register now!
