
Presentation

Build Copilots on Streaming Data with Generative AI, Kafka Streams and Flink SQL

Kafka Summit London 2024

To unleash the power of Large Language Models (LLMs), organizations need to integrate them with their own data. Intelligent business-specific Copilots can serve as a vital link between LLMs and data streaming, enhancing developer productivity and making stream processing more accessible.

For structured data in Kafka topics, we'll demonstrate how to evolve schemas with LLMs, generating tags and descriptions. Starting from an open-source Copilot UI, we'll enable users to ask questions about streaming data in natural language and show, step by step, how context-aware LLM prompts translate these questions into Flink SQL. We'll then demonstrate how to run these statements as stream processing tasks, producing to and consuming from Kafka topics and sending results back to the Copilot UI over a WebSocket connection.
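
To make the translation step concrete, here is a minimal sketch. The Flink Table API calls (EnvironmentSettings, TableEnvironment.executeSql) are real, but the LlmClient interface, the prompt wording, and the schemaContext parameter are assumptions invented for illustration, not the exact implementation shown in the talk.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class CopilotSqlTranslator {

    // Hypothetical LLM wrapper; any chat-completion client could sit behind it.
    interface LlmClient {
        String complete(String prompt);
    }

    private final LlmClient llm;
    private final TableEnvironment tableEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

    CopilotSqlTranslator(LlmClient llm) {
        this.llm = llm;
    }

    // Translate a natural-language question into Flink SQL and submit it.
    TableResult ask(String question, String schemaContext) {
        // Context-aware prompt: ground the model in the actual Kafka-backed
        // table schemas so it produces valid Flink SQL for this environment.
        String prompt = String.join("\n",
                "You translate questions about Kafka topics into Flink SQL.",
                "Available tables (backed by Kafka topics):",
                schemaContext,
                "Return exactly one Flink SQL statement, with no explanation.",
                "Question: " + question);

        String flinkSql = llm.complete(prompt);

        // Run the generated statement as a streaming job; its results can be
        // written to a Kafka sink table and relayed to the Copilot UI.
        return tableEnv.executeSql(flinkSql);
    }
}
```

In practice the generated statement would typically be validated (for example, by parsing it first) before being submitted as a long-running job.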

For semi-structured and unstructured data (text, JSON, PDFs, spreadsheets, and other binary files), we will explore strategies for building data pipelines that continuously generate embeddings and store them in vector databases for retrieval-augmented generation (RAG). We will demonstrate dynamically creating consumers with Kafka Streams, managing LLM integrations, organizing metadata, and running post-processing tasks for quality assurance and for triggering actions in other systems.
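
A rough sketch of such a pipeline follows. The Kafka Streams API usage is real, but the documents topic name, the embed stub, and the VectorStore interface are placeholders invented for illustration; the dynamically created consumers described in the talk are simplified here to a single static topology.

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;

public class EmbeddingPipeline {

    // Placeholder for a vector database client (e.g., Pinecone, Weaviate).
    interface VectorStore {
        void upsert(String id, List<Float> vector, String payload);
    }

    // Placeholder for an embedding call to an LLM provider.
    static List<Float> embed(String text) {
        return List.of(); // replace with a real embedding API call
    }

    public static void main(String[] args) {
        VectorStore vectorStore = (id, vector, payload) -> {
            /* write to the vector database here */
        };

        StreamsBuilder builder = new StreamsBuilder();
        builder
            // Text chunks extracted upstream from PDFs, spreadsheets, etc.
            .stream("documents", Consumed.with(Serdes.String(), Serdes.String()))
            // Embed each chunk as it arrives and upsert it for RAG retrieval.
            .foreach((docId, chunk) -> vectorStore.upsert(docId, embed(chunk), chunk));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "embedding-pipeline");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```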

In addition to developing custom Copilot UIs integrated with streaming data, we will cover the deployment and monitoring of Copilots across internal tools such as Slack, Microsoft 365 Copilot, and OpenAI ChatGPT.
