A complex data flow is a set of operations that extracts information from multiple sources and copies it into multiple data targets, using transformations, joins, filters, and sorts to refine the results along the way.
These are precisely the capabilities that the open modern data stack provides. Spark and similar tools let us develop complex data flows on large-scale data.
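To make the definition concrete, here is a minimal sketch of such a flow in plain Python rather than Spark, so it runs without a cluster. The sources, field names, the `run_flow` function, and the `min_amount` threshold are all hypothetical and chosen only for illustration; a real flow would read from files, databases, or APIs and run on an engine such as Spark.

```python
from operator import itemgetter

# Hypothetical in-memory "sources" standing in for real files, tables, or APIs.
ORDERS = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
    {"order_id": 3, "customer_id": 10, "amount": 90.0},
    {"order_id": 4, "customer_id": 12, "amount": 130.0},
]
CUSTOMERS = [
    {"customer_id": 10, "country": "DE"},
    {"customer_id": 11, "country": "US"},
    {"customer_id": 12, "country": "US"},
]


def run_flow(orders, customers, min_amount=50.0):
    """Join, filter, sort, and aggregate two sources into two targets."""
    # Join: attach the customer's country to each order (inner join on customer_id).
    by_id = {c["customer_id"]: c for c in customers}
    joined = [
        {**o, "country": by_id[o["customer_id"]]["country"]}
        for o in orders
        if o["customer_id"] in by_id
    ]
    # Filter: keep only orders at or above the threshold.
    kept = [row for row in joined if row["amount"] >= min_amount]
    # Sort: largest orders first; this list is the first target.
    detail_target = sorted(kept, key=itemgetter("amount"), reverse=True)
    # Transform: total amount per country; this dict is the second target.
    summary_target = {}
    for row in kept:
        country = row["country"]
        summary_target[country] = summary_target.get(country, 0.0) + row["amount"]
    return detail_target, summary_target


detail, summary = run_flow(ORDERS, CUSTOMERS)
```

Each step here maps onto a DataFrame operation in an engine like Spark; the shape of the flow (many sources in, refinement in the middle, many targets out) is what matters.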
Chaos Engineering is the discipline of experimenting on a distributed system to build confidence in its capability to withstand turbulent conditions in production. Put simply: how stable is your distributed system?
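The idea of a chaos experiment can be sketched in a few lines: state a steady-state hypothesis, inject a fault, and check whether the hypothesis still holds. The example below is a single-process toy, not a real distributed system, and every name in it (`FlakySource`, `extract_with_retry`, `run_experiment`) is hypothetical. It assumes the system under test is an extract step that retries transient connection failures.

```python
class FlakySource:
    """Hypothetical source wrapper that fails a fixed number of times, then succeeds."""

    def __init__(self, rows, failures):
        self.rows = rows
        self.failures_left = failures

    def read(self):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("injected fault: source unavailable")
        return list(self.rows)


def extract_with_retry(source, max_attempts=3):
    """System under test: an extract step that retries transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return source.read()
        except ConnectionError:
            # Give up and re-raise once the retry budget is spent.
            if attempt == max_attempts:
                raise


def run_experiment(rows, injected_failures, max_attempts=3):
    """Return True if the steady state (all rows delivered) holds under the fault."""
    source = FlakySource(rows, injected_failures)
    try:
        result = extract_with_retry(source, max_attempts)
    except ConnectionError:
        return False
    return result == rows


ROWS = [{"id": 1}, {"id": 2}, {"id": 3}]
# Two injected failures fit inside the three-attempt retry budget.
survives_two_failures = run_experiment(ROWS, injected_failures=2)
# Five injected failures exceed the budget, so the hypothesis is refuted.
survives_five_failures = run_experiment(ROWS, injected_failures=5)
```

The second experiment is the interesting one: it shows exactly where the step stops being resilient, which is the kind of knowledge a chaos experiment is meant to surface before production does.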
Let's see how we can combine these two worlds to build more stability and reliability into our DataOps.