
Presentation

kash.py - How to Make Your Data Scientists Love Real-time

Current 2023

Implementing real-time data pipelines is still a challenge. It is even more of one for data scientists, who were often brought up on batch processing and files, and who have typically only heard of Kafka without ever really using it. So: do you need to hire a team of Kafka experts, or is there another way?

We came up with a novel way to bridge the gap between the batch- and file-based world of most data scientists and the world of real-time streaming, using a new open-source data-processing tool called kash.py ("Kafka Shell for Python").

kash.py allows any Python programmer and data scientist to access the Kafka API more easily than ever before. It offers a large number of easy-to-use abstractions on top of the Kafka API, including bash-inspired commands like "ls" or "l" for listing Kafka topics and "cat", "head" or "tail" for displaying topic content. kash.py bridges the gap between the file and streaming worlds with the "cp" command, which uploads a file to a topic or downloads a topic to a file. It also offers commands inspired by functional programming ("map", "flatMap", "filter" etc.) to do Kafka-to-Kafka, file-to-Kafka or Kafka-to-file stream processing in one-liners - of course with full support for JSON Schema, Avro and Protobuf. Think kcat, but with a lot more power and programmability.
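To make the idea concrete, here is a minimal, self-contained sketch of what such bash-inspired abstractions could look like in Python. This is an illustration only: an in-memory dict stands in for a real Kafka cluster, and the class and method names are modeled on the commands listed above ("ls", "head", "cp", "map"), not on the actual kash.py API, which wraps a real Kafka client and may differ.

```python
class MiniShell:
    """Illustrative stand-in for a shell-like Kafka wrapper.

    An in-memory dict plays the role of the cluster; a real tool
    would back these methods with an actual Kafka client.
    """

    def __init__(self):
        self.topics = {}  # topic name -> list of string messages

    def ls(self):
        """List topics, like `ls` in a shell."""
        return sorted(self.topics)

    def head(self, topic, n=5):
        """Show the first n messages of a topic."""
        return self.topics.get(topic, [])[:n]

    def cp(self, source, target):
        """Bridge files and topics: topic -> file, or file -> topic."""
        if source in self.topics:
            with open(target, "w") as f:          # download topic to file
                f.write("\n".join(self.topics[source]))
        else:
            with open(source) as f:               # upload file to topic
                self.topics[target] = f.read().splitlines()

    def map(self, source, target, fn):
        """Functional one-liner: transform one topic into another."""
        self.topics[target] = [fn(m) for m in self.topics.get(source, [])]


sh = MiniShell()
sh.topics["clicks"] = ["a", "b", "c"]
print(sh.ls())                        # ['clicks']
sh.map("clicks", "clicks_upper", str.upper)
print(sh.head("clicks_upper"))        # ['A', 'B', 'C']
```

The point of the sketch is the shape of the API: one short, familiar command per operation, so that a data scientist can explore topics the way they would explore files, without writing consumer/producer boilerplate.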

In this session, we show how kash.py can bring the two disparate worlds of files and streaming together - and thus not only save the time and money of hiring real-time and streaming experts, but also make your data scientists, like ours, start loving real-time.
