Confluent
Log Compaction | Kafka Summit Edition | May 2016

Gwen Shapira.

Last week, Confluent hosted Kafka Summit, the first ever conference to focus on Apache Kafka and stream processing. It was exciting to see the stream processing community coming together in one event to share their work and discuss possible improvements. The conference sold out several weeks in advance and over 550 Kafka enthusiasts attended.

The sessions overall were well received, thanks to all of the speakers who put in the time and effort that made the conference so high in quality. A special thanks to the speakers! I’d like to highlight a few of the sessions and discussions that attendees were especially excited about.

Hacking on Kafka Connect and Kafka Streams

On the Monday evening before the conference we held a Stream Data Hackathon. The room was packed with over 100 participants hacking away on experimental stream processing projects. There were many awesome projects and we will publish a separate blog post to share all of them. The winning projects combined creativity and usefulness:

  • Real-time sentiment analysis of tweets, used to evaluate and visualize how Twitter collectively feels about the presidential candidates in the US. Both Kafka Connect and Kafka Streams were used to implement this project. The project is by Ashish Singh from Cloudera.
  • Measuring electrical activity from the brain with a Bluetooth device, using Kafka to stream the data to OpenTSDB, and visualizing it with Grafana. The project is by a team from Silicon Valley Data Science.
  • A Kafka connector that streams events from Jenkins to Kafka, collecting all events about Jenkins jobs across an organization in one central location. The project is by Aravind Yarram from Equifax.

 

Keynote Sessions

The next day opened with a gourmet breakfast, immediately followed by three keynote talks. Neha Narkhede gave a wonderful overview of the growth of the Apache Kafka project and community since she and the other Kafka co-creators (Jay Kreps, Jun Rao, and others) started the project at LinkedIn. Then Jay Kreps shared his thoughts on the future of stream processing and how this new paradigm will change the way companies use data. Last (but not least), Aaron Schildkrout, Uber’s head of data and marketing (I love this title), discussed the ways his company uses Kafka and how their use cases are evolving. It’s pretty inspiring to think of drivers getting real-time feedback on their driving from their phones.

Breakout Sessions

After the keynote session, we headed to the 28 breakout sessions across three tracks:

  • Systems Track – focused on stream processing
  • Operations Track – how to run Kafka in production
  • Users Track – use cases and architectures

After the conference I asked some of the attendees what their favorite sessions were.

In the Systems track, the attendees loved “Fundamentals of Stream Processing with Apache Beam” by Frances Perry and Tyler Akidau from Google. I’ve heard many attendees discuss how this presentation changed the way they think about stream processing applications. “Introducing Kafka Streams: Large-scale Stream Processing with Kafka” by Neha Narkhede was also incredibly popular, and many attendees are looking forward to the imminent release of Apache Kafka 0.10.0, which will include Kafka Streams.

In the Operations track, attendees enjoyed “101 Ways to Configure Kafka – Badly”, by Henning Spjelkavik & Audun Strand from Finn.no, who shared all the mistakes they made as new Kafka users and how they corrected them. This presentation was a great mix of entertainment and education, and I’m sure no one who attended the session will end up with an 8-node ZooKeeper cluster.

In the Users track, attendees loved “Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData” by Anton Gorshkov from Goldman Sachs, who developed a stream processing application, live, including processing SMS messages sent by the audience in real time.

Video Recordings and Photos

Yes, we did record the sessions and they will be available in a week or so. I highly recommend checking them out. Links to the video recordings will be added to each of the session pages on www.kafka-summit.org. Follow @ConfluentInc on Twitter and we’ll let you know as soon as they are ready. 

We’ll also post some photos from the conference soon on the Confluent Facebook page.

Networking

As often happens at conferences, the sessions don’t tell the whole story. One of the highlights of the conference for me was the chance to interact and exchange ideas with the leaders of many different stream processing technologies. How often does it happen that leaders of Apache Storm, Apache Spark, Apache Flink, Apache Beam, and Apache Kafka get together to discuss abstractions, concepts, how to benchmark streams, and the best ways to educate an audience? Kafka Summit is, to the best of my knowledge, the only conference where this community gets together and shares its vision.

The Confluent team is looking forward to hosting Kafka Summit again next year. If you weren’t able to make it last week, fill out the Stay-In-Touch form on the home page of www.kafka-summit.org and you’ll get updates about next year’s conference.

Thanks again to everyone who made it to Kafka Summit 2016 in San Francisco last week! The Confluent team enjoyed meeting everyone and we had a fantastic time!

Quick note on the next Apache Kafka release

A new release candidate for version 0.10.0 has been posted to the Apache Kafka mailing lists and a new vote was started. This release candidate contains two new features: support for additional SASL authentication mechanisms (KIP-43) and a new API for clients to determine features supported by the brokers (KIP-35).
