Oct 11, 2016 | Read Time: 4 min

Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | October 2016

Written By

Gwen Shapira, Engineering Manager, Confluent

This month the community has been focused on the upcoming release of Apache Kafka 0.10.1.0. Led by the fearless release manager, Jason Gustafson, we voted on a release plan, cut branches and started voting on the first release candidate. Please contribute to the community by downloading the release candidate, testing it out and letting everyone know how it went. If no serious bugs are found, we are hoping to finalize the release by mid-October.

In addition to the vote, we gave our website a quick facelift, contributed by Derrick Or. We appreciated the feedback from the community, and the issues raised were quickly addressed.

And as usual, there are several very lively discussions in the community:

KIP-74: Proposal to limit not just the amount of data returned by a consumer fetch per partition, but also the amount of data returned for each fetch request overall. This will give users better control over the memory usage of consumers, but even better – this allows consumers to make progress even if a partition contains messages larger than the maximum fetch size. This proposal has been merged and will be part of the 0.10.1.0 release.
KIP-79: Proposal to add methods for searching by timestamp to the new consumer was accepted and merged. It will be included in the next release to everyone’s great joy.
KIP-82: Proposal for adding headers to Kafka messages. This proposal is very popular because so many organizations are using headers internally. It is also controversial: the Kafka project has a long tradition of keeping the message completely unstructured and letting users and clients put whatever structure they need inside the message. Whatever the decision is, it will have a serious impact on the Apache Kafka ecosystem.
KIP-83: A much-welcomed proposal that allows clients with different security configurations to be instantiated in the same JVM. Patches from Rajini Sivaram and Edoardo Comar are already available, and once they are integrated we will be able to update MirrorMaker to support different security configurations on the source and target clusters.
KIP-85: Allowing clients to take JAAS configurations dynamically rather than via a file. This will be huge for those of us implementing microservices in containers – adding files to containers has been very inconvenient.
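
To make the KIP-74 idea above more concrete, here is a minimal sketch of how a per-partition limit and an overall per-request limit might interact. It is a simplified model written for illustration, not the broker's actual implementation: the function name `plan_fetch`, its parameters, and the representation of messages as plain byte sizes are all assumptions. The one rule taken from the proposal is that the very first message is returned even if it exceeds the limits, so the consumer can still make progress.

```python
from typing import Dict, List


def plan_fetch(partitions: Dict[str, List[int]],
               max_partition_bytes: int,
               max_fetch_bytes: int) -> Dict[str, List[int]]:
    """Pick whole messages (given as byte sizes) under both limits.

    Hypothetical helper: a toy model of the KIP-74 idea, not Kafka code.
    """
    result: Dict[str, List[int]] = {}
    total = 0  # bytes selected so far across the whole request
    for name, sizes in partitions.items():
        taken: List[int] = []
        partition_total = 0  # bytes selected so far for this partition
        for size in sizes:
            # The first message of the request is always returned, even if
            # it is larger than the limits, so the consumer never gets stuck.
            first_in_request = total == 0
            over_partition = partition_total + size > max_partition_bytes
            over_request = total + size > max_fetch_bytes
            if (over_partition or over_request) and not first_in_request:
                break
            taken.append(size)
            partition_total += size
            total += size
        result[name] = taken
    return result


# Normal case: both limits are applied.
normal = plan_fetch({"p0": [400, 400, 400], "p1": [400, 400]},
                    max_partition_bytes=1000, max_fetch_bytes=1500)

# Oversized first message: it is still returned, and nothing else is.
oversized = plan_fetch({"p0": [5_000_000, 100], "p1": [200, 300]},
                       max_partition_bytes=1_000_000,
                       max_fetch_bytes=2_000_000)
```

In the first call, `p0` stops at two messages because of the per-partition limit and `p1` stops at one because of the overall limit. In the second call, the oversized message in `p0` is the only thing returned, which is the "make progress" behavior the proposal describes.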

In addition to ongoing Kafka improvements, there is other interesting news, along with several blog posts:

Google is talking about the use of Kafka in GCP and its new Kafka connectors.
Good summary of the big announcements for the Streams community from Strata.
Dean Wampler talks to O’Reilly about streams architecture.
Tutorial at Strata showing how to build a customer 360 architecture using Apache Kafka, Spark Streaming and Kudu. One of the main takeaways is that modern data architectures no longer assume that all the data you need is found in one database. Instead, they solve the data integration problem.
How to test Kafka Streams topologies – because testing is the most important part of development.
From CapitalOne, a great StrangeLoop talk: Commander: Better Distributed Applications through CQRS, Event Sourcing, and Immutable Logs.
We made recommendations on how to move to the cloud with Kafka and added enterprise features.
Using MirrorMaker? Want to use the new Consumer? Here are some gotchas you want to be aware of.
And for the theory-inclined: Fascinating paper on graph processing on streams.

If you are interested in learning all about streaming data platforms, Confluent has released a 6-part online talk series focusing on Apache Kafka. You can view the recordings for the first two talks in the series by Jay Kreps and Jun Rao, and register for the upcoming sessions at /apache-kafka-talk-series.

Gwen Shapira is a Software Engineer at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specialises in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of books including “Kafka: The Definitive Guide”, and a frequent presenter at data-related conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
