
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning without a Data Lake

Machine Learning (ML) workloads are separated into model training and model inference. ML frameworks typically use a data lake such as HDFS or S3 to process historical data and train analytic models. With a modern streaming architecture, however, such a data store can be avoided entirely.

This talk compares a modern streaming architecture to traditional batch and big data alternatives and explains its benefits: a simplified architecture, the ability to reprocess events in the same order when training different models, and the option to build a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems. A sketch of such an event replay follows below.
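As a minimal sketch of the replay idea (not code from the talk), the snippet below re-reads a Kafka topic from the earliest retained offset with a fresh consumer group, so a second model can be trained on exactly the same events in the same order. Broker address, topic name, and group id are assumptions for illustration.

```python
# Sketch: replay a Kafka topic from the earliest offset to retrain a model
# on the same ordered event history. All names below are assumptions.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumed broker address
    "group.id": "model-training-run-2",      # fresh group id => start from the beginning
    "auto.offset.reset": "earliest",         # replay the full retained history
    "enable.auto.commit": False,             # training runs should not move offsets
})
consumer.subscribe(["sensor-events"])        # assumed topic name

events = []
while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None:
        break                                # no more retained events for this demo
    if msg.error():
        raise RuntimeError(msg.error())
    events.append(msg.value())               # raw bytes; parsing depends on your schema

consumer.close()
print(f"Replayed {len(events)} events in original order")
```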

The talk explains how this can be achieved by leveraging Apache Kafka, Tiered Storage, and TensorFlow.
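The following sketch illustrates the general pattern under stated assumptions; it is not the talk's implementation. It feeds JSON-encoded events from a Kafka topic straight into TensorFlow training via a tf.data generator, with no intermediate data lake. The topic name, broker address, event schema (a "features" array and a "label" field), and model are all hypothetical.

```python
# Sketch: train a TensorFlow model directly from a Kafka topic instead of a
# data lake. Topic, broker, schema, and model architecture are assumptions.
import json
import numpy as np
import tensorflow as tf
from confluent_kafka import Consumer

def kafka_events(topic="sensor-events", servers="localhost:9092"):
    """Yield (features, label) pairs from the retained Kafka history."""
    consumer = Consumer({
        "bootstrap.servers": servers,
        "group.id": "tf-training",
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,
    })
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None:
                return                      # end of retained history for this sketch
            if msg.error():
                continue
            event = json.loads(msg.value())
            yield (np.array(event["features"], dtype=np.float32),
                   np.float32(event["label"]))
    finally:
        consumer.close()

# Wrap the consumer in a tf.data pipeline; batch and shuffle as usual.
dataset = tf.data.Dataset.from_generator(
    kafka_events,
    output_signature=(
        tf.TensorSpec(shape=(3,), dtype=tf.float32),   # assumed 3 features per event
        tf.TensorSpec(shape=(), dtype=tf.float32),
    ),
).batch(32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(3,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(dataset, epochs=1)
```

Because the consumer reads from the earliest offset, the same pipeline can be rerun with a different model or hyperparameters against the identical event history, which is the reprocessing benefit described above; Tiered Storage extends how much history the topic can retain cost-effectively.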

Moderator

Kai Waehner

Kai Waehner works as Technology Evangelist at Confluent. Kai’s main area of expertise lies within the fields of Big Data Analytics, Machine Learning / Deep Learning, Cloud / Hybrid Architectures, Messaging, Integration, Microservices, Stream Processing, Internet of Things and Blockchain. He is a regular speaker at international conferences such as Kafka Summit, O’Reilly Software Architecture or ApacheCon, writes articles for professional journals, and shares his experiences with new technologies on his blog (www.kai-waehner.de/blog). Contact and references: contact@kai-waehner.de / @KaiWaehner / www.kai-waehner.de / LinkedIn (https://www.linkedin.com/in/kaiwaehner).