Key Moments From Kafka Summit Bangalore: AI Model Inference, Freight Clusters, and Many Waves of Enthusiasm

Apply to join the AI Model Inference Early Access Program

Try AI Model Inference for Confluent Cloud for Apache Flink® to seamlessly integrate AI in your real-time applications and data pipelines

Current 2024

September 17-18, 2024 | Austin Convention Center

The Apache Kafka® community gathered once more, this time in Bangalore, India.

Our first-ever Kafka Summit in India drew in over 2,500 participants, both in person and online. The ambiance was nothing short of electrifying, igniting enthusiasm and momentum throughout the event.

As Confluent’s Chief Product Officer, I was thrilled to open the Summit and welcome Kafka enthusiasts who came out in full force to learn about how data streaming forms the foundation for something that’s truly transformative for businesses everywhere—and how they can do more with their data, more easily than they ever have. 

As Confluent CEO and Co-founder Jay Kreps said during his presentation, “Kafka has become far and away the standard for streams. Flink is an emerging standard for real-time stream processing. Apache Iceberg® is the emerging open table standard allowing tables of data to be shared in the analytical estate. Taking these and putting them together into a coherent platform is what we think [will] make the world of streaming accessible to all the applications in companies. I couldn’t be more excited about what that future holds. I couldn’t be more excited to explore that with all of you.”

But before we dive into what went down at Kafka Summit Bangalore, it’s important to acknowledge the impressive community that has made Apache Kafka the de facto standard for data streaming. 

As my colleague Addison Huddy, VP of Product Management, said during the keynote, “All the major cloud providers, multi-billion dollar startups, and thousands of developers are working each and every day to make this project amazing. There are over 150,000 organizations using Kafka in production. Kafka has over 27,000 GitHub stars and hosted over 1,500 meetups all over the world—pretty impressive! In fact, the Kafka community just celebrated its 1000th Kafka Improvement Proposal (KIP)—that’s one thousand times the community has come together to do something amazing!”

As always, our summits are focused on delivering meaningful value to attendees, ensuring they walk away with practical insights and benefits. Here’s what Sharmistha Chatterjee, Senior VP of Advanced Analytics Chapter Lead at Commonwealth Bank of Australia and a presenter at the event, had to say:

“I got to learn about all things Kafka and how it’s being leveraged by businesses worldwide to build resilient applications and drive interesting use cases. The sessions I attended were very informative. Most importantly, I learned about how Confluent can help businesses run Kafka more efficiently in the cloud.”

Key highlights from the action-packed keynote

  • AI Model Inference in Confluent Cloud for Apache Flink®: A new capability in our serverless Flink service that enables you to seamlessly integrate AI into your stream processing workloads. AI models are elevated to first-class resources within Flink and are accessible through SQL queries, making it easy to invoke remote model endpoints like OpenAI and AWS SageMaker. AI Model Inference is currently available as Early Access. Learn more below!

  • New auto-scaling Freight clusters: A brand new serverless cluster type with up to 90% lower cost for high-throughput use cases with relaxed latency requirements. Freight clusters are available in Early Access in select AWS regions. Interested customers can sign up for early access and check out the latest blog to learn more about this offering.

  • Tableflow: We announced our vision for Tableflow, a new feature on Confluent Cloud that allows users to convert Apache Kafka topics and associated schemas to Apache Iceberg® tables with a single click to better supply data lakes and data warehouses. Tableflow is currently available in Early Access. Interested customers can register interest for early access and check out our blog to learn more. 

  • Confluent Platform for Apache Flink®: A Flink distribution that will allow you to effortlessly harness stream processing for on-prem or private cloud workloads with long-term expert support, streamlined troubleshooting, secure updates, and easy adoption with minimal changes to your existing architecture. Confluent Platform for Apache Flink® will be available to Confluent customers later this year. 

Read on for a full overview and insights to help you make the most of these exciting new capabilities.

A Jam-Packed Day of Learning: Session Highlights

There was something for everyone at Kafka Summit Bangalore. The program ranged from short lightning talks, designed to whet your appetite for topics like real-time GenAI with Apache Kafka, to longer deep-dive sessions on how businesses are using Kafka to solve some of their most pressing data challenges and drive new use cases.

Real-Time GenAI with Apache Kafka: Chatbot for Ambulance Booking

GenAI has opened up a plethora of opportunities. Amruta Agnihotri, Software Architect at Spring Computing Technologies, took the stage to talk about how they built a GenAI-powered chatbot for a healthcare customer that was looking for a smart solution to help patients book ambulances without hassle. “Apache Kafka played a critical role in creating that hospital 360-degree view that was needed for the GenAI component,” Amruta said. “Creating that unified view gave us the real-time context or the business data that was required to bridge the gap between the pre-trained large language models and hospital data. This consolidated data forms the backbone for the conversational chatbot, enabling it to deliver accurate and efficient results.”

Brick-by-Brick: Exploring the Elements of Apache Kafka

Our very own Danica Fine, Staff Developer Advocate at Confluent, hosted a LEGO-themed Kafka 101 session that was the epitome of fun learning. Attendees learned the ins and outs of the components that form the basis of Kafka, how those click together, and what users can build with them—and also got a glimpse of what’s beyond Kafka. “There’s a ton of great reasons to be using Kafka,” Danica said. “We are undergoing a paradigm shift in how we deal with our data, move our data, store our data, and how we process our data. And Kafka is at the core of that: being able to move our data quickly and efficiently so that we can do whatever we need to do with it in the easiest way possible.”

Lace Up: Take Your First Steps with Flink SQL for Kafka Developers

In this session hosted by Shindy Lall, Senior Product Manager at Confluent, attendees got to learn about the potential of Flink SQL and what makes it well-suited for most stream processing use cases, particularly building real-time data products and pipelines. 

“Developers love Flink,” Shindy said. “Flink is built to scale. The infrastructure is designed to handle tremendous workloads. It’s fault tolerant. It can handle failures effectively and provide high availability. It’s very flexible. You can use Flink with Java, Python, SQL. So depending on what type of developer or analyst you are, you can use it with your language of preference.”

Optimizing Millions of Heartbeats on Zee OTT Platform

In this session, presented by Zee Entertainment Enterprises’ Principal Architect Srinivas Shanmugam and Tech Lead Jivesh Threja, attendees learned how Zee migrated the watch-history architecture of its ZEE5 OTT platform from a cloud-based to a cloud-agnostic design with zero downtime.

The duo shared insights into how the media giant uses Confluent Cloud to handle heartbeat information coming from millions of devices to power personalized home pages for each customer, dove into the details of their heartbeat architecture, and highlighted challenges and learnings along the way. 

They also shed light on what not having to self-manage Kafka has enabled: “It ensures 100% of our focus goes on app development.”

Women in Technology Lunch: Becoming Effective Leaders

Hosted by Confluent's Women's Inclusion Network, the lunch provided a great opportunity to network and hear from a panel of women leaders who shared their career journeys and offered advice on becoming effective leaders. Advice ranged from finding your own personal champions and mentors to being fearless and speaking up.

Sharmistha Chatterjee, Commonwealth Bank of Australia: “It’s very important for leaders to be empathetic and to learn to listen to their teams. To be successful as a team, leaders also need to identify where their team has more capabilities and where they can nurture them.”

Mamta Singh, Microsoft: “First, build trust within your team. Second, find different learning opportunities… and create a path for innovation and creativity. Third, build inclusive and diverse teams.”

Smita Ojha, Mindtickle: “Be conscious of what brand you build for yourself. Building that brand starts with identifying your area of interest and motivation. Then build expertise within that area. You have to then grab the opportunities that come your way to voice your opinion. But establishing that core gives you confidence and helps create a brand for yourself.”

Sumita Bhattacharjee, Confluent: “A leader is someone who people want to follow. How do you create a following? Start by doing work that is truly inspirational.”

AI Model Inference in Confluent Cloud for Apache Flink®

Traditional AI development involves separate tools and languages for working with AI/ML models and preparing data for those models, causing complexity and inefficiencies.

By adding support for AI Model Inference, Confluent Cloud for Apache Flink® allows you to simplify the development and deployment of AI/ML applications by providing a unified platform for both data processing and AI/ML tasks.

Our Flink service allows you to:

  • Simplify development by using familiar SQL syntax to work directly with AI/ML models, reducing the need for specialized ML tools and languages

  • Enable seamless coordination between data processing and ML workflows to improve efficiency and reduce operational complexity

  • Support accurate, real-time AI-driven decision-making by leveraging fresh, contextual streaming data for scenarios like Retrieval Augmented Generation (RAG)

By working with AI models directly as first-class resources within Flink, you can now utilize them within your SQL queries using a familiar syntax for data processing. This approach enables you to create and manage remotely hosted AI models using SQL data definition language (DDL) statements, eliminating the need to interact with the underlying infrastructure.

-- Register a remotely hosted model as a Flink resource (Early Access syntax)
CREATE MODEL `my_remote_model`
INPUT (f1 INT, f2 STRING)
OUTPUT (label STRING, probs ARRAY<FLOAT>)
WITH (
  'type' = 'remote',
  'task' = 'classification',
  'provider' = 'OPENAI',
  'endpoint' = 'https://api.openai.com/v1/llm/v1/chat',
  'secret_id' = 'my_secret'
);

You can then call remote AI model endpoints, such as OpenAI, GCP Vertex AI, AWS SageMaker, and Azure, and receive inference results directly in your Flink jobs. 

SELECT f1, f2, label
FROM ML_PREDICT('my_data', 'my_remote_model', DESCRIPTOR(f1, f2))
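
As a rough sketch of how these inference results might feed a downstream pipeline, the query above could be wrapped in a standard Flink SQL INSERT INTO statement. The table name my_predictions and its columns are hypothetical. The ML_PREDICT call simply reuses the Early Access syntax shown above, which may change. The CREATE TABLE statement without connector options assumes Confluent Cloud, where a table is backed by a Kafka topic; open-source Flink would need an explicit connector.

```sql
-- Hypothetical sink table; the name and columns are illustrative only.
CREATE TABLE my_predictions (
  f1 INT,
  f2 STRING,
  label STRING
);

-- Continuously write each input row together with its predicted label.
-- The ML_PREDICT call is copied from the query above (Early Access syntax).
INSERT INTO my_predictions
SELECT f1, f2, label
FROM ML_PREDICT('my_data', 'my_remote_model', DESCRIPTOR(f1, f2));
```

Writing the predictions to their own table keeps the enriched stream available to other consumers without each of them calling the model endpoint again.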

Note: Support for AI model inference is currently in Early Access, meaning it is meant for testing/experimentation purposes and only available to a limited number of candidates. Please apply to the Early Access program if interested.

Confluent Platform for Apache Flink®

Self-managing Flink, like Kafka, can be immensely challenging. Relying on community support lacks the immediate responsiveness needed for mission-critical applications, while support from separate vendors for each streaming technology leads to delays and confusion in coordinating issue resolution. Moreover, the Flink community only actively maintains its two most recent releases without providing long-term support for specific versions.

With Confluent Platform for Apache Flink®, a Flink distribution fully supported by Confluent, customers will be able to easily leverage stream processing for on-prem or private cloud workloads with long-term expert support beyond what’s provided by the open-source project.

Our enterprise-grade Flink distribution will enable you to:

  • Minimize risk with consolidated Flink and Kafka support and guidance from the foremost experts in the data streaming industry

  • Receive timely assistance in troubleshooting and resolving issues, reducing the impact of any operational disruptions

  • Ensure that your stream processing applications are secure and up-to-date with off-cycle bug and vulnerability fixes

Rather than only maintaining the two most recent releases, Confluent Platform for Apache Flink will provide three years of support for each release from its launch. Our comprehensive support SLA will ensure swift resolution of everything from critical Sev1 issues to minor Sev3 concerns. By consolidating support for Flink and Kafka with a single vendor, you will be able to streamline the support process, ensure better integration and compatibility between the two technologies, and receive more comprehensive support for your entire streaming project.

You’ll also be able to easily attach Apache Flink to Confluent Platform with minimal changes to existing Flink jobs and architecture, simplifying integration and paving the way for seamless cloud migration. Future enhancements will deepen integration between Confluent Platform and Flink and focus on improving job lifecycle and security management.

Note: This Confluent-supported Flink distribution will be offered in Limited Availability as part of an upcoming release of Confluent Platform. Limited Availability means it will be fully supported for production workloads but only available to select customers. Feel free to contact us if you have any questions.

Until Next Time

Although this was our inaugural conference in India, it’s not going to be the last one.

And as Jay said, there’s a lot happening in the data streaming space and we are going to be doing it again. “We want to broaden the content beyond Kafka—and that’s what we’ve done in the U.S. in the last few years—with [our data streaming summit] Current. All the Kafka content is still there but we brought in stream processing, talked about the governance of data, talked about use of AI in real time. And next year, Current will be coming to Bangalore and we are all excited to host you soon.”

Don’t want to wait until next year to experience Current? Here’s the good news. Come join us in Austin, Texas for Current 2024 this September—where you can look forward to more action-packed days of sessions presented by data streaming experts on every topic imaginable.

  • Shaun Clowes is Chief Product Officer at Confluent.

  • Mekhala Roy is a senior writer on the Brand Marketing team at Confluent. Prior to Confluent, Mekhala worked in the cybersecurity industry and also spent several years working as a tech journalist.
