Dec 6, 2016 | Read Time: 2 min

Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | December 2016

This month saw the proposal of a few KIPs that will have a big impact on Apache Kafka’s semantics and operability.

KIP-95: Incremental Batch Processing for Kafka Streams brings the first hint of batch processing to Kafka Streams and takes us a step closer to unifying batch and stream processing around the log. It enables Streams tasks to ‘auto-stop’ when they reach the end of the log, so that periodic invocations process batches of messages.
KIP-97: Improved Kafka Client RPC Compatibility Policy will enable newer versions of the Java Kafka clients to talk to older broker versions. Previously, old clients could talk to newer brokers, but not vice versa. Among other things, this KIP will make upgrading Kafka clusters easier, since client and server upgrades can be decoupled.
KIP-98: Exactly Once Delivery and Transactional Messaging has been proposed and is in discussion. This KIP is a major addition and will bring idempotent message production as well as transactional semantics to Kafka, features which have been long requested and heavily discussed over the years. The KIP has an associated detailed design document, so make sure you have plenty of coffee on tap!
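
To make the idea of idempotent production a little more concrete, here is a toy sketch in Python. It assumes the broker remembers the last sequence number it has appended for each producer id and drops any retry it has already seen. The `PartitionLog` class and its `append` method are hypothetical names invented for this illustration; this is not Kafka code and not necessarily the protocol the KIP will settle on.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy stand-in for one partition on a broker (hypothetical, not Kafka code)."""
    records: list = field(default_factory=list)
    # Highest sequence number appended so far, keyed by producer id.
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id: int, seq: int, value: str) -> bool:
        """Append value unless this producer already wrote this sequence number.

        Returns True if the record was appended, False if it was a duplicate.
        """
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return False  # a retry of a record we already hold: drop it
        if seq != last + 1:
            raise ValueError(f"out-of-order sequence {seq}, expected {last + 1}")
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


log = PartitionLog()
results = [
    log.append(producer_id=7, seq=0, value="a"),
    log.append(producer_id=7, seq=1, value="b"),
    log.append(producer_id=7, seq=1, value="b"),  # retry after a lost ack
    log.append(producer_id=7, seq=2, value="c"),
]
print(results)
print(log.records)
```

In this sketch the retried write is acknowledged as a duplicate and the log ends up holding each value once, which is the property that lets a producer retry safely after a lost acknowledgement.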

Lots of interesting happenings occurred in the wider streaming community as well, notably:

Confluent announced two Kafka Summits for 2017! Connect with the Apache Kafka community at Kafka Summit NY on May 8 or Kafka Summit SF on August 27. Early Bird Registration and Call for Papers are now open for both events. For more details, visit https://kafka-summit.org/
Drizzle: a streaming system that aims to provide reliably low latency even in exceptional circumstances such as failures, task recovery, and rebalancing.
Carloe Gunst shared insights on using Kafka to move data from a mainframe to a data lake.
Dave Tucker gave a talk on streaming operational data using Kafka at Couchbase Connect 2016. He also published a guide with best practices for serious connector developers.
PayPal published a great story on how data pipelines evolve and how they moved from Big Data to Fast Data.
If you are interested in Spring, here’s a short tutorial on using Spring with Kafka: http://www.java-allandsundry.com/2016/11/spring-kafka-producerconsumer-sample.html