Build Predictive Machine Learning with Flink | Workshop on Dec 18 | Register Now
For quite some time, I had a fuzzy feeling that I didn’t really understand event streaming architectures and how they fit more broadly into the modern software architecture puzzle. Then I saw a concrete, real-life example from an airplane maintenance use case, where billions of sensor data points come in via Kafka and must be transformed into insights that occasionally lead to important actions a mechanic needs to take.
This story led to a personal revelation: Data-streams are passive in nature. On their own, they do not lead to any action. But at some point in time, actions must be taken. The action might be carried out by a human looking at data and reacting to it, or an external service that’s called, or a "traditional" database that’s updated, or a workflow that’s started. If there’s never any action, your stream is kind of useless.
Now, the transition from a passive stream to an active component reacting to an event in the stream is very interesting. It raises a lot of questions about idempotency, scalability, and the capability to replay streams with changed logic. For example, in the project mentioned above, we developed our own stateful connector that starts a workflow for a mechanic only once for every new insight, but can also inform that workflow if the problem no longer exists. Replaying streams with historic data did not lead to any new workflows created.
In this talk, I’ll walk you through the aircraft maintenance case study in as much detail as I can share, along with my personal discovery process, which I hope might guide you on your own streaming adventures.