How To: Building Data Streaming Pipelines for Real-Time Data Warehousing
Capturing tech trends has become a bit tricky these days: whatever industry you’re in, uncertainty abounds. That’s made planning more difficult, but businesses are finding new ways to innovate with emerging technology and respond quickly to fast-changing market conditions. As we look to a new year, some common areas of challenges and growth are rising to the surface. For starters, the technology that’s succeeding these days is expected to tackle reality in, well, real time. That’s because businesses are moving fast to give users what they demand, and the old ways of capturing and storing data aren’t cutting it anymore.
Here’s what to keep an eye on for 2023 and beyond. What’s on your radar? Let us know.
We’ve moved from “digital transformation” as the buzzword du jour into a more nuanced understanding of what transformation looks like at an enterprise level. It’s about data now, with some key capabilities emerging: data should be shareable across an organization, not just for developers. More and more, it needs to be processed by streaming, not batch, to keep ahead of what users need. It also needs to be governed and allow for self-service access, so emerging data technology has to be easy to use.
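The batch-versus-streaming distinction can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's API: the batch version waits for the whole dataset before producing a result, while the streaming version emits an updated result as each event arrives.

```python
from typing import Iterable, Iterator

def batch_total(events: Iterable[int]) -> int:
    """Batch style: collect everything first, then process once."""
    collected = list(events)   # wait for the full dataset to land
    return sum(collected)      # a single, after-the-fact answer

def streaming_totals(events: Iterable[int]) -> Iterator[int]:
    """Streaming style: emit an updated result per event."""
    running = 0
    for value in events:
        running += value
        yield running          # downstream consumers see it immediately

orders = [120, 45, 300]
print(batch_total(orders))             # one answer at the end: 465
print(list(streaming_totals(orders)))  # a live view: [120, 165, 465]
```

The difference matters for users: with the streaming shape, a consumer never has to wait for the batch window to close before acting on the data.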
“In the past, data was a fixed point in time to be stored. That got us here,” Confluent co-founder and CEO Jay Kreps said during his Current 2022 keynote. “But reality isn’t some static fixed thing. Today’s use cases demand streaming data.”
Missed Current 2022? We've got you covered. Get caught up with all the recorded sessions.
Data streaming adoption keeps getting simpler, making it easier to go cloud-first and get out of the business of data streaming platform operations. That frees teams to focus on what really matters: business innovation.
Data transformation is what will allow teams to focus on outputs and business value, not infrastructure or service management. The ultimate goal? Data as a product, so that teams can access the data they need, when they need it, securely.
If data is going to truly power your business, it needs to be treated like the valuable asset it is. More than a technology shift, this is a mindset shift: treat your data as if it were a high-quality, ready-to-use product that’s instantly accessible across the organization. Then it’s consistent everywhere, which means everyone is working from the same, most current data.
When data is a first-class citizen, operational systems can serve customers better, analytical systems meet the demands of your stakeholders, and SaaS applications are always up to date. It’s governed, which means you can track where the data is coming from, where it’s going, and who has access to it. Keeping track of initial data quality is essential to data taking a leading role, as is tracking its lineage. Eventually, data assets are discoverable through contracts, so whoever needs access to the data in whatever format can easily subscribe and use it on-demand.
The result of applying this kind of product thinking to your data? It accelerates use case delivery and innovation.
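The idea of data being "discoverable through contracts" can be made concrete with a small sketch. Everything below is hypothetical (the contract name, owner, and fields are made up for illustration); real platforms enforce contracts centrally, for example via schema registries, but the principle is the same: a consumer can check a record against a published contract before trusting it.

```python
# Illustrative data contract: a published, versioned description of a data
# product that consumers can discover and validate against.
CONTRACT = {
    "name": "orders.v1",
    "owner": "checkout-team",
    "fields": {"order_id": str, "amount_cents": int, "currency": str},
}

def conforms(record: dict, contract: dict) -> bool:
    """True if the record has exactly the contracted fields, correctly typed."""
    fields = contract["fields"]
    return set(record) == set(fields) and all(
        isinstance(record[name], ftype) for name, ftype in fields.items()
    )

good = {"order_id": "A-17", "amount_cents": 4500, "currency": "EUR"}
bad = {"order_id": "A-18", "amount_cents": "45.00"}  # wrong type, missing field

print(conforms(good, CONTRACT))  # True
print(conforms(bad, CONTRACT))   # False
```

When the contract travels with the data product, every consumer sees the same definition, which is what makes self-service subscription safe.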
Making data self-serve and accessible is foundational to the bigger goal of data as a product. The big question in data governance is: who is consuming what? It’s data, but it’s also all of the copies of that data. In the coming year, streaming data governance will be non-negotiable. “Right now, you can either go fast or be safe,” said Chad Verbowski, senior vice president of engineering at Confluent, during his Current 2022 keynote. “Accomplishing both will be key for business success.”
To connect your data silos with data streaming, you need to govern and secure it to meet enterprise and regulatory compliance requirements. Look for role-based access controls, compliance policy management, and data lineage and auditability.
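Role-based access control and auditability fit together naturally: every authorization decision is also an audit record. The sketch below is a toy under assumed names (the roles, topics, and grants are invented), not how any particular governance product implements this, but it shows the shape of "who is consuming what."

```python
# Illustrative role-based access control with an audit trail.
# Roles, topics, and grants below are made up for the example.
ROLE_GRANTS = {
    "analyst": {("orders", "read")},
    "pipeline": {("orders", "read"), ("orders-enriched", "write")},
}

audit_log = []  # every decision is recorded: who, what, which operation, outcome

def authorize(role: str, topic: str, operation: str) -> bool:
    """Allow an operation only if the role was explicitly granted it."""
    allowed = (topic, operation) in ROLE_GRANTS.get(role, set())
    audit_log.append((role, topic, operation, allowed))
    return allowed

print(authorize("analyst", "orders", "read"))            # True
print(authorize("analyst", "orders-enriched", "write"))  # False
print(audit_log)  # full trail of both decisions
```

Deny-by-default (no grant means no access) plus a complete decision log is the core of what compliance auditors look for.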
According to IDC’s 2022 Data Trust Survey, respondents know how important trusted data is: more than 75% said that high data trust levels have a positive impact on customer satisfaction. Yet they’re still building the infrastructure to make data both trusted and broadly accessible—only 17% of respondents have a complete architecture for managing and controlling data, and streaming data remains one of the least trusted sources.
Moving toward a data-as-a-product mindset will require broad trust, with companies using trusted platforms for broad streaming governance.
Source: IDC, How Much Do We Trust or Not Trust Data?: Key Findings from IDC’s 2022 Data Trust Survey, Doc # US46382820, February 2022
Data sources, formats, and destinations all continue to grow, and those accessing and developing with data want both real-time and historical data to be available. But pipelines built for the batch era usually aren’t reusable or developer-friendly, and these legacy pipelines are straining under today’s real-time demands. Many businesses are grappling with a web of point-to-point systems that are hard to scale and maintain. We’ve reached a tipping point, where available technologies and capabilities (from SQL to Stream Designer) make it much easier to build in streams.
With streaming pipelines, it’s possible to make data accessible enterprise-wide, which requires:
Delivering access to trustworthy data as a product, so more users with the right access controls can use the data
Simplifying building reusable data flows so people with different skills within the organization can collaborate, iterate and design data flows to meet streaming data needs
Ensuring access to data wherever it sits
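The "reusable data flows" idea above can be sketched as composable pipeline stages. This is a minimal illustration in plain Python, not Stream Designer or ksqlDB: each stage is an independent, reusable transform over a stream of events, and different teams can recombine the same stages into new flows.

```python
from typing import Callable, Iterable, Iterator

# A stage is any function that turns one event stream into another.
Transform = Callable[[Iterator[dict]], Iterator[dict]]

def pipeline(source: Iterable[dict], *stages: Transform) -> Iterator[dict]:
    """Chain reusable stages; the same stages can be recomposed per use case."""
    stream: Iterator[dict] = iter(source)
    for stage in stages:
        stream = stage(stream)
    return stream

def only_paid(events):
    """Filter stage: keep only paid orders."""
    return (e for e in events if e["status"] == "paid")

def add_amount_eur(events):
    """Enrichment stage: derive a euro amount from cents."""
    for e in events:
        yield {**e, "amount_eur": e["amount_cents"] / 100}

events = [
    {"status": "paid", "amount_cents": 4500},
    {"status": "refunded", "amount_cents": 4500},
]
print(list(pipeline(events, only_paid, add_amount_eur)))
# [{'status': 'paid', 'amount_cents': 4500, 'amount_eur': 45.0}]
```

Because each stage only depends on the stream interface, a filter written by one team can be reused unchanged in another team’s flow—the same property streaming SQL gives you at platform scale.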
The good news? Building better pipelines is possible, and will open up time for higher-level work for teams across IT.
Time, and how it’s spent, may be a company’s most valuable asset. Business and IT teams need to spend their time driving innovation, not managing open-source tools like Kafka. Focusing on innovation, not infrastructure, is how businesses get value from data and pull ahead of their competitors.
Building data pipelines is time-consuming and onerous for developers, taking time away from the truly interesting work. “My dream is to have enterprise topics and everything cataloged in Kafka,” said Pritha Mehra, CIO for the United States Postal Service, during her keynote at Current 2022. “Developers can just have a fun day making apps and not worrying about data.”
One thing is clear: there’s a lot to learn and explore in the year ahead, with so much potential for new ideas and ways of working in real time. As Gian Morlino of Imply said on the Current stage: “Streaming vs. batch is as big a shift as mobile phones were. When it comes to streaming, think really big.”
Want to get started with data streaming? Sign up for free with Confluent Cloud.