Online Talk

On Track with Apache Kafka: Building a Streaming ETL Solution with Rail Data

Available On-Demand

As data engineers, we frequently need to build scalable systems that work with data from a variety of sources, arriving at varying rates, sizes, and formats. This talk takes an in-depth look at how Apache Kafka can provide a common platform on which to build data infrastructure driving both real-time analytics and event-driven applications.

Using a public feed of railway data, the talk shows how to ingest data from message queues such as ActiveMQ with Kafka Connect, as well as from static sources such as S3 and REST endpoints. We'll then see how to use stream processing to transform the data into a form suitable for streaming to analytics tools such as Elasticsearch and Neo4j. The same data will be used to drive a real-time notification service through Telegram.
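As a rough illustration of the ingest step described above, a Kafka Connect source connector for ActiveMQ can be declared with a small JSON configuration. This is a minimal sketch, not taken from the talk itself: the connector class is Confluent's ActiveMQ source connector, while the broker URL, queue name, and topic name are placeholder values you would replace with those of your own feed.

```json
{
  "name": "rail-activemq-source",
  "config": {
    "connector.class": "io.confluent.connect.activemq.ActiveMQSourceConnector",
    "tasks.max": "1",
    "activemq.url": "tcp://localhost:61616",
    "jms.destination.name": "TRAIN_MOVEMENTS",
    "jms.destination.type": "queue",
    "kafka.topic": "rail.train_movements",
    "confluent.topic.bootstrap.servers": "localhost:9092"
  }
}
```

Posting this JSON to the Kafka Connect REST API (`POST /connectors`) starts a task that reads messages from the JMS queue and writes them to the named Kafka topic, from where stream processing and sink connectors can pick them up.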

If you're wondering how to build your next scalable data platform, how to reconcile the impedance mismatch between stream and batch, and how to wrangle streams of data—this talk is for you!

Robin works on the DevRel team at Confluent. His data engineering career began with building data warehouses on mainframes in COBOL, moved on to developing analytics solutions on Oracle, and in recent years has focused on the Kafka ecosystem and modern data streaming. Outside of work, he enjoys running, good beer, and fried breakfasts, though rarely all at the same time.