New in Confluent Cloud: Making Data & Pipelines Accessible for AI-Ready Streaming | Learn More

Presentation

Bayer Document Stream Processing

« Kafka Summit Europe 2021

Bayer selected Apache Kafka as the primary layer for a variety of document streams flowing through several text processing and enrichment steps. Every day, Bayer analyzes numerous documents including clinical trials, patents, reports, news, literature, etc. We will give an idea about the strategic importance, peek into future challenges and we will provide an end-to-end technical overview.

Throughout the discussion, we will look at challenges we handle in the platform and discuss respective solutions. Among others, we discuss our approach to continuously pull in data from a variety of external sources and how we harmonize different formats and schemas. We discuss large document processing and error handling, which allows efficient debugging while not blocking the pipeline.

Then, we take on the user’s perspective and demo the platform. One will learn how users create new document processing pipelines and how Bayer keeps track of the many running Kafka pipelines.

Bayer Document Stream Processing

Presenter

Astrid Rheinländer

Presenter

Christoph Böhm

Related Links

How Confluent Completes Apache Kafka eBook

Leverage a cloud-native service 10x better than Apache Kafka

Confluent Developer Center

Spend less on Kafka with Confluent, come see how