Confluent
Log Compaction – Highlights in the Apache Kafka® and Stream Processing Community – March 2017

Gwen Shapira

Big news this month! First and foremost, Confluent Platform 3.2.0 with Apache Kafka® 0.10.2.0 was released! Read about the new features, check out all 200 bug fixes and performance improvements and then download Confluent Platform 3.2.0 and try it out.

Thanks to Ismael Juma, there is already a plan for the next release of Apache Kafka, 0.11.0.0, so you can check out the features planned for June. The big-ticket items are exactly-once semantics and transactions, dropping support for Java 7, and disabling unclean leader election by default.

Notable KIPs this month include:

Voted:

  • KIP-107: Add purgeDataBefore() API in AdminClient – This KIP allows developers to request data purging from Kafka. This cleanup is in addition to the usual time-based and size-based cleanup policies. The cleanup API is especially useful for multi-step stream processing jobs, which can now remove intermediate data after it has been processed by downstream jobs.
  • KIP-119: Drop Support for Scala 2.10 in Kafka 0.11 – We added support for Scala 2.12 in Kafka 0.10.2.0; now it is time to remove the older version of Scala.
  • KIP-121: Add KStream peek method – This KIP adds a new Streams DSL method. It is similar to map(), but intended to produce side effects rather than modify the events in the stream. This is useful for debugging and diagnostics: peek() can be used to update a monitoring metric or to print the current record, similar to Java 8’s Stream#peek() method.
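
To make the side-effect idea behind KIP-121 concrete, here is a minimal sketch using Java 8's Stream#peek(), the method the KIP compares itself to. It is not the Kafka Streams API itself. The class name PeekExample, the process() helper, and the sample records are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PeekExample {

    // Hypothetical helper: uppercases each record and, as a side effect,
    // appends every record it sees to `seen` (standing in for a metric or a log line).
    static List<String> process(List<String> records, List<String> seen) {
        return records.stream()
                .peek(seen::add)            // side effect only; the element passes through unchanged
                .map(String::toUpperCase)   // the actual transformation happens here
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        List<String> out = process(Arrays.asList("a", "b"), seen);
        System.out.println(seen); // prints [a, b]
        System.out.println(out);  // prints [A, B]
    }
}
```

The point of the sketch is that peek() observes each element without changing what flows downstream, which is why it suits debugging and metrics rather than transformation.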

Discussed:

  • KIP-129: Streams Exactly-Once Semantics – Now that adding exactly-once semantics and transactions to Kafka is in progress, it is time to add exactly-once processing semantics to Kafka’s Streams API.
  • KIP-122: Add Reset Consumer Group Offsets tooling – Ever had a consumer group fail on a bad record and wished you could just tell the consumer group to skip ahead a bit? So did we. Now we are discussing the best CLI to do it.
  • KIP-124: Request rate quotas – Right now Kafka allows limiting the bandwidth that a client is allowed to produce and consume, but there is still no control over how much CPU a client uses. This functionality will be very useful for anyone running a multi-tenant cluster. The discussion on how best to model the CPU consumption of clients, and how to let administrators control it through configuration, is fascinating.
  • KIP-125: ZookeeperConsumerConnector to KafkaConsumer Migration and Rollback – We want to deprecate the old 0.8.x consumer in favor of the new consumer, but some teams have trouble migrating because there is no support for a rolling upgrade between the two consumer types. This KIP proposes a solution to this problem, allowing us to remove the old consumer.

Notable Blog posts:

Our Confluent Community Slack Channel is thriving – with 500 members and lively discussions on Apache Kafka and all ecosystem projects. The community is still new, but next month we’ll share highlights from the community discussions. You are invited to join.

Most importantly, we announced the agenda for Kafka Summit NYC and a Kafka Summit hackathon. We look forward to seeing all of you there! Register now!


Comments

  1. I am looking at funneling the log files from Confluent Platform to Splunk. What is the better way: Confluent Kafka Connect, Splunk VM forwards, Apache NiFi, or some other mechanism?
