Confluent
Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | August 2016

Gwen Shapira

It is August already, and this marks exactly one year of monthly “Log Compaction” blog posts summarizing news from the very active Apache Kafka and stream processing community. We hope you enjoy them and, as usual, let us know if you have news to share.

  • The Apache Kafka community is preparing to release a bugfix for version 0.10.0. The new release will be 0.10.0.1 and we are currently voting on a release candidate – hopefully we won’t find critical issues and the release will be available soon.
  • The ongoing work on KIP-4 has seen significant progress. This work will allow all client libraries to manage topics without depending on core Kafka or ZooKeeper:
      • An API to create new topics through the wire protocol was voted in and committed
      • An API to delete topics was voted in and a patch is currently under review
      • An API to manage ACLs is currently under discussion
  • KIP-67, which adds queryable state to Kafka Streams, was voted in and committed. This new feature will allow other applications to directly query the latest processing results of your Kafka Streams application (i.e. its current state). This means that, for many use cases, you no longer need to operate and interface with external systems or databases to share data between applications. The result is a simplified, more app-centric architecture.
  • Michael Noll published two more blog posts on Kafka Streams: Secure Stream Processing and Elastic Scaling in Kafka Streams.
  • Alex Loddengaard published his best practices for running Apache Kafka in AWS. There have been tons of questions in the community about this topic as cloud deployments are becoming more and more popular – so we shared our answers in this blog post.
  • Spark 2.0 was released last week with many improvements to Spark Streaming. This blog post gives an overview of what’s new in Spark Streaming.
  • Back in April, when we ran the Kafka Connect and Streams hackathon, one of my favorite projects was by SVDS. They streamed data from a Bluetooth brain monitoring device to Kafka, used Kafka Connect to stream data out to OpenTSDB, and then used Grafana to visualize the brain activity! How cool is that? SVDS blogged all the fun details of their brain monitoring project for your inspiration.
  • Recommendation powerhouse Yelp blogged about their real-time data pipeline architecture – and it is gorgeous. We recommend checking it out as a reference for anyone tasked with building similar infrastructure.
  • Apache Kafka training is offered by Confluent and our partners. New classes have just been published, including online training: www.confluent.io/training.
