Feb 8, 2017 | Read Time: 3 min

Log Compaction – Highlights in the Apache Kafka and Stream Processing Community – February 2017

Written by Gwen Shapira, Engineering Manager, Confluent

As always, we bring you news, updates and recommended content from the hectic world of Apache Kafka® and stream processing.

Sometimes it seems that every improvement in Apache Kafka is preceded by an involved KIP process. This month we merged a great patch that improves Kafka's 99th-percentile latency without requiring any user-visible changes: https://issues.apache.org/jira/browse/KAFKA-4614. Not only does it make a fast system even faster, the JIRA itself is worthy of study. I wish all JIRAs included this level of research.

Some important improvements do require KIPs. Here is what we’ve seen in active discussions this month:

KIP-112: Handle disk failure for JBOD and its close relative KIP-113: Support replicas movement between log directories. Both these KIPs improve Kafka’s behavior in the common case where the broker’s data is written to a number of directly mounted disks on the broker server (rather than using RAID). With these improvements, Kafka will be able to survive failure of a single disk without taking down an entire broker, and it will allow admins to control the placement of replicas on disk – useful in cases where disks or replicas have uneven sizes.
KIP-117: Add a public AdminClient API for Kafka admin operations. This lets developers create, modify and delete topics and ACLs without using internal APIs, which are subject to incompatible changes, and without requiring a ZooKeeper connection from their applications.
KIP-98: The famous KIP that adds transactional semantics and exactly-once to Kafka is now under voting. This means that the Wiki now contains all the public changes. If you haven’t read it yet, now is a good time.
KIP-118 suggests we remove support for Java 7 in the next major release (0.11). We don’t know yet when 0.11 will get released, but we know it will be later than June.
KIP-110 suggests adding support for a new compression codec: ZStandard. The codec, developed at Facebook, looks very promising.
KIP-109 suggests marking the old consumers as deprecated, as a hint for developers that they should start migrating to the new clients. As the KIP states, the old consumers are missing important features like security that were only added in the new clients.

Notable Blogs and Presentations:

One of the basic design patterns of Microservices is creating a local cache or materialized view. Keeping the cache updated can be a challenge. Zach Cox explains the challenges in maintaining a local cache for a service and provides several solutions using different Kafka APIs.
Plumbr used Kafka to transition from a monolith to microservices as they scaled their architecture.
Sky Betting & Gaming published their Kafka-centric streaming architecture.
And since everyone loves benchmarks: Comparing the different compression codecs in Apache Kafka.
Trulia talks about how they use Kafka to drive a machine learning system, which they use to offer personalized experiences in mobile and desktop.
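
The local cache (materialized view) pattern from the first item above can be sketched in a few lines. This is only an illustration of the idea, not the approach from Zach Cox's post: it uses no Kafka client at all, and the `apply_changelog` helper and the sample records are hypothetical. In a real service the records would arrive from a Kafka consumer or a Kafka Streams table rather than from an in-memory list. The sketch assumes the usual changelog convention that the latest record for a key wins and a record with no value marks a delete.

```python
from typing import Dict, Iterable, Optional, Tuple

# A changelog record: (key, value). A value of None is a tombstone (delete).
Record = Tuple[str, Optional[str]]


def apply_changelog(cache: Dict[str, str], records: Iterable[Record]) -> Dict[str, str]:
    """Fold changelog records into the cache, keeping the latest value per key."""
    for key, value in records:
        if value is None:
            cache.pop(key, None)  # tombstone: drop the key if it is present
        else:
            cache[key] = value    # upsert: a newer record overwrites an older one
    return cache


# Hypothetical records, in the order they would be read from one partition.
changelog = [
    ("user-1", "alice@example.com"),
    ("user-2", "bob@example.com"),
    ("user-1", "alice@new.example.com"),  # update for user-1
    ("user-2", None),                     # delete for user-2
]

local_cache = apply_changelog({}, changelog)
print(local_cache)  # {'user-1': 'alice@new.example.com'}
```

Because the cache is rebuilt purely by replaying records in order, a service can restore it after a restart by reading the topic again from the beginning.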

Gwen Shapira is a Software Engineer at Confluent. She has 15 years of experience working with code and customers to build scalable data architectures, integrating relational and big data technologies. She currently specialises in building real-time reliable data processing pipelines using Apache Kafka. Gwen is an Oracle Ace Director, an author of books including “Kafka: The Definitive Guide”, and a frequent presenter at data-related conferences. Gwen is also a committer on the Apache Kafka and Apache Sqoop projects.
