Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | July 2016

Here comes the July 2016 edition of Log Compaction, a monthly digest of highlights in the Apache Kafka and stream processing community. Want to share some exciting news on this blog? Let us know.

  • A lot of improvements have been proposed since the latest 0.10.0.0 release:
    • KIP-33 – proposed by Jiangjie Qin, will add a time-based log index to improve the accuracy of features such as searching offsets by timestamp and time-based log rolling and retention. It has been adopted and targets the 0.10.1.0 release.
    • KIP-62 – proposed by Jason Gustafson, will separate the session timeout used to detect hard consumer failures from the processing timeout, giving users more flexibility to specify liveness criteria for different scenarios. It has been adopted and targets the 0.10.1.0 release.
    • KIP-4 – proposed by Joe Stein and led by Grant Henke, will introduce request protocols for administrative operations on topics, configs, ACLs, etc. The topic admin request protocols have been under active discussion and development.
    • A number of other KIPs are under discussion and voting as well, such as KIP-63 and KIP-67 for improving the Streams API in Kafka, and KIP-55 and KIP-48 for adding more features to Kafka security. We encourage anyone in the community who is interested in these topics to get involved!
  • Want to learn about the Streams API in Kafka? Read this blog post by Michael Noll on building your first real-time stream aggregation application, and watch the presentation by Guozhang Wang at Hadoop Summit San Jose!
  • LinkedIn hosted its first-ever Stream Processing Meetup. Shuyi Chen, Cameron Lee and Shubhanshu Nagar talked about how they use Kafka and Samza as the backbone of their streaming applications at Uber and LinkedIn.
  • Considering using Kafka to simplify your microservices? Check out Jim Riecken’s talk at Scala Days New York this month.
  • Twitter has open sourced Heron, a distributed stream computation system built as the successor to Apache Storm.
  • Kafka was BIG at Berlin Buzzwords! Check out Neha Narkhede’s keynote on using it for application development in the new paradigm of stream processing.

Guozhang Wang is a PMC member of Apache Kafka and a tech lead at Confluent, where he leads the Kafka Streams team. He received his Ph.D. from Cornell University, where he worked on scaling data-driven applications. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza.
