KIP-33 – proposed by Jiangjie Qin, will add a time-based log index to improve the accuracy of features such as searching offsets by timestamp and time-based log rolling and retention. It has been adopted and targets release 0.10.1.0.
KIP-62 – proposed by Jason Gustafson, will separate the session timeout configuration used for consumer hard-failure detection from the processing timeout configuration, so that users have more flexibility in specifying liveness criteria for different scenarios. It has been adopted and targets release 0.10.1.0.
KIP-4 – proposed by Joe Stein and led by Grant Henke, will introduce request protocols for administrative operations on topics, configs, ACLs, etc. The topic admin request protocols are under active discussion and development.
We have a number of other KIPs under discussion and voting as well, such as KIP-63 and KIP-67 for improving the Streams API in Kafka, and KIP-55 and KIP-48 for adding more features to Kafka Security. We encourage anyone in the community who is interested in these topics to get involved!
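To make the idea behind KIP-33 a little more concrete, here is a toy Python sketch of how a sparse time index can speed up searching for an offset by timestamp. It is only an illustration under simplifying assumptions, not Kafka's actual implementation: the log contents, the sampling interval, and the function names `build_time_index` and `offset_for_timestamp` are all made up for this example, and timestamps are assumed to be non-decreasing.

```python
from bisect import bisect_left

# Toy log: (offset, timestamp_ms) pairs in offset order. Values are made up.
# Offsets start at 0 and are contiguous, so an offset doubles as a list position.
LOG = [(0, 1000), (1, 1005), (2, 1010), (3, 1020), (4, 1030), (5, 1045), (6, 1050)]


def build_time_index(log, every=3):
    """Sample every `every`-th record into a sparse (timestamp, offset) index."""
    return [(ts, off) for i, (off, ts) in enumerate(log) if i % every == 0]


def offset_for_timestamp(log, index, target_ts):
    """Return the first offset whose timestamp is >= target_ts, or None."""
    # Binary search for the last index entry with a timestamp strictly
    # below the target; the answer cannot lie before that entry's offset.
    pos = bisect_left([ts for ts, _ in index], target_ts) - 1
    start = index[pos][1] if pos >= 0 else 0
    # Scan forward from that offset instead of from the start of the log.
    for off, ts in log[start:]:
        if ts >= target_ts:
            return off
    return None


index = build_time_index(LOG)
print(offset_for_timestamp(LOG, index, 1025))
```

The index keeps only a fraction of the records, so the lookup is a binary search over a small list followed by a short forward scan, rather than a scan of the whole log.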
Want to learn about the Streams API in Kafka? Read this blog post by Michael Noll on building your first real-time stream aggregation application, and watch the presentation by Guozhang Wang at Hadoop Summit San Jose!
LinkedIn hosted its first-ever Stream Processing Meetup. Shuyi Chen, Cameron Lee and Shubhanhu Nagar talked about how they use Kafka and Samza as the backbone of their streaming applications at Uber and LinkedIn.
Considering using Kafka to simplify your microservices? Check out Jim Riecken’s talk at Scala Days New York this month.
Twitter has open-sourced Heron, its new distributed stream computation system and the successor to Apache Storm.
Kafka was BIG at Berlin Buzzwords! Check out Neha Narkhede’s keynote on using it for application development in the new paradigm of stream processing.