The challenge with today’s “data explosion” is finding the most appropriate answer to the question, “So where do I put my data?” while avoiding the longer-term problem: data warehouses, data lakes, cloud storage, NoSQL databases, … are often the places where “big” data goes to die.
Enter Physics 101, and my corollary to Newton’s First Law of Motion:
• Data in motion tends to stay in motion until it comes to rest on disk. Similarly, if data is at rest, it will remain at rest until an external “force” puts it in motion again.
• Data inevitably comes to rest at some point. Without “external forces”, data often gets lost or becomes stale where it lands.
“Modern” architectures tend to involve data pipelines where downstream consumers make use of data generated upstream, often with built-for-purpose repositories at each stage. This session will explore how data that has come to rest can be put in motion again, how Kafka can keep it in motion longer, and how pipelined architectures might be created to make use of that data.
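As a rough illustration of the idea, the sketch below replays rows "at rest" into an append-only log that two downstream consumers read independently. It is a toy under stated assumptions, not Kafka's API: `Topic`, `Consumer`, `replay_into_topic`, and the sample order data are all hypothetical names invented here, and the in-memory list only stands in for a retained Kafka topic.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical in-memory stand-in for a Kafka topic: an append-only log."""
    records: list = field(default_factory=list)

    def publish(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just appended

    def read_from(self, offset):
        # Records are retained after being read, so any consumer can replay them.
        return self.records[offset:]


@dataclass
class Consumer:
    """Downstream stage that remembers how far into the log it has read."""
    name: str
    offset: int = 0

    def poll(self, topic):
        batch = topic.read_from(self.offset)
        self.offset += len(batch)
        return batch


def replay_into_topic(csv_text, topic):
    """The external "force": read rows at rest and publish each one."""
    count = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic.publish(row)
        count += 1
    return count


# Assumed sample data standing in for a file that has come to rest on disk.
AT_REST = "order_id,amount\n1,19.99\n2,5.00\n3,42.50\n"

orders = Topic()
published = replay_into_topic(AT_REST, orders)

# Two independent downstream consumers each see the full stream.
billing = Consumer("billing")
analytics = Consumer("analytics")
billing_batch = billing.poll(orders)
analytics_batch = analytics.poll(orders)
total = sum(float(r["amount"]) for r in analytics_batch)
```

Because the log keeps its records after they are read, each consumer tracks its own offset and a new downstream stage can be added later without asking the upstream source to resend anything. That retention is the property the sketch is meant to show.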