Data Governance

In today’s fast-paced world, every business is becoming heavily reliant on data. But how do you manage data and ensure data access while maintaining data integrity, security, and compliance? With data growing in volume, velocity, and complexity, data governance is now a critical part of every modern organization.

What is data governance?

Data governance consists of the processes and policies an organization implements to ensure safe and proper use of its business data. The integrity of that data, its availability to authorized users, and its protection from misuse are among the primary goals of data governance.

How it works:

Some descriptions of data governance include the operational aspects of managing enterprise data, including activities that are often assigned to a data steward (or a data owner). Such an oversight role is necessary for enforcing consistent policy across the organization and achieving better business outcomes. A data steward typically contributes to policy development and enablement for a specific department or division.

Why Data Governance?

Data governance is essential to coordinating compliance with industry standards, such as protecting personally identifiable information (PII), and governmental regulations such as the EU's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Communicating these requirements and meeting these strategic objectives typically includes internal legal review, third-party auditing, and other functions that fall outside the scope of a data steward.

Good data governance policies must also be flexible enough for the business to negotiate exceptions to compliance needed to update mission-critical business practices without costly disruptions or undue hardship. Without the executive authority to advocate for more time, conflicts could trigger delays, penalties for non-compliance, and even the loss of important accounts.

Benefits of data governance

A data governance strategy requires a complete overview of how an organization will manage data, as well as protect and improve business outcomes. Here's what to prioritize, and what to expect as an outcome:

Industry and regulatory compliance

Protecting the customer experience through data privacy and the business’s reputation through data security are paramount concerns. Without them, the business is vulnerable to financial liability and stakeholder concern for its health and value. Good governance ensures not only that the business is doing the right work and doing it well, but that stakeholders and observers hear about it and maintain their goodwill and confidence.

Reliable, accessible, high-quality data

Data governance strategy relates the need for reliable, accessible, high-quality data to the business’s desired outcomes. It does so by publishing how a company’s key data sources are documented, how that data is sourced or produced, where and how it is stored, when and how it must be deleted, and when and how it may be shared. Good practice in this domain includes eliminating redundant data sources, reducing data bloat, and archiving seldom-used data to reduce the total cost of ownership.
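As a rough illustration of what "publishing" these facts about a data source could look like in practice, here is a minimal Python sketch of a policy record. The class name, field names, and example values are all hypothetical and chosen only for illustration; a real organization would define its own fields and would likely keep such records in a catalog rather than in code.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataSourcePolicy:
    """Hypothetical record of the governance facts published for one data source."""
    name: str                 # how the source is documented
    owner: str                # who sources or produces the data
    storage_location: str     # where it is stored
    retention_days: int       # when it must be deleted
    shareable_with: frozenset  # which business units it may be shared with

    def deletion_due(self, created: date, today: date) -> bool:
        # A record is due for deletion once its retention period has elapsed.
        return today >= created + timedelta(days=self.retention_days)

    def may_share_with(self, unit: str) -> bool:
        # Sharing is allowed only with explicitly listed business units.
        return unit in self.shareable_with


# Example values are assumptions, not taken from any real system.
policy = DataSourcePolicy(
    name="customer_orders",
    owner="sales-ops",
    storage_location="warehouse.sales.orders",
    retention_days=365,
    shareable_with=frozenset({"finance", "marketing"}),
)

print(policy.deletion_due(created=date(2023, 1, 1), today=date(2024, 1, 1)))
print(policy.may_share_with("legal"))
```

Keeping the record immutable (frozen) reflects the idea that a published policy changes through review, not through ad hoc edits by individual teams.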

Reliable, data-driven decision-making

Tracing the lineage of data is a key concern. When a data product is demonstrably flawed, it can be prohibitively expensive to isolate the cause. Good data governance should include such tracing for important data sources to ensure that the products of every mission-critical project are correct, reliable, and reproducible. These requirements for data are justified when the value of ensuring accurate data outweighs the expense of supporting procedures to isolate issues.
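To make the idea of lineage tracing concrete, the sketch below models lineage as a simple mapping from each dataset to the datasets it is derived from, then walks that mapping to list everything upstream of a flawed data product. The dataset names and the `trace_upstream` helper are hypothetical; real lineage tools typically capture this graph automatically rather than by hand.

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets it is derived from.
UPSTREAM = {
    "revenue_report": ["orders_clean", "fx_rates"],
    "orders_clean": ["orders_raw"],
    "fx_rates": [],
    "orders_raw": [],
}


def trace_upstream(dataset, upstream=UPSTREAM):
    """Return every dataset that feeds `dataset`, nearest sources first."""
    seen = []
    queue = deque(upstream.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        queue.extend(upstream.get(current, []))
    return seen


# If the revenue report looks wrong, these are the candidates to inspect.
print(trace_upstream("revenue_report"))
```

A breadth-first walk is used here so that the closest sources are listed first, which is usually the order in which one would investigate a fault.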

Consistent data practices across business units

Strictly speaking, operational excellence is not considered a primary objective of data governance. The cost of attaining a standard for data governance must fall below the value it yields for the business. This tradeoff can vary dramatically from one industry to another, as well as across different business units or silos of the same company.

Data governance policy should focus on maintaining this tradeoff. It should promote practices that can be applied to the context of each business unit.

Reliable, consistent security policy

Enterprise security is inherently multi-faceted and difficult to implement fully. Data governance specifies who manages key business data, who grants access to it, and who makes it available for auditing or another external review. When important content is stored with a cloud provider, governance documents the division of responsibility for that content’s availability and accessibility.

Confluent offers integrated solutions that make it easier to implement your business's policies, including support for single sign-on (SSO), role-based access control (RBAC), encryption at rest and in transit, audit logs, and other solutions.
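The question of who manages data and who grants access to it is often expressed as role-based access control. The following is a minimal, product-independent Python sketch of that idea; it does not represent Confluent's RBAC implementation or API, and the role names, user names, and `is_allowed` helper are hypothetical.

```python
# Hypothetical roles and the actions each one permits.
ROLE_PERMISSIONS = {
    "data_owner": {"read", "write", "grant"},
    "data_steward": {"read", "write"},
    "auditor": {"read"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"data_owner"},
    "bob": {"auditor"},
}


def is_allowed(user, action):
    """Return True if any of the user's roles permits the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("alice", "grant"))  # owner may grant access
print(is_allowed("bob", "write"))    # auditor is read-only
```

Unknown users and unknown roles fall through to an empty permission set, so access is denied by default.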

Common data governance practices

Data governance is strategic and advisory in nature. While there’s no rule against mandating specific tools or technologies, a more flexible approach would allow each business unit to show how it adheres to company standards in the course of producing its outcomes. There are quite a few considerations when implementing data governance.

Things to consider when implementing data governance:

  • Identify organizational data and how it should be used
  • Determine who owns and manages key data sets
  • Create standards and processes for accessing, managing, and storing company data
  • Define policies and processes applicable to all business units
  • Support auditing and maintain security standards
  • Create standard terminology that all business lines should use
  • Define the metrics used to evaluate the impact of data governance

Expert advice and guidance are key to implementing an effective data governance plan. This guidance could include tool and technology recommendations that will keep the business on par with its competitors.

Keys to a successful governance strategy

While every strategy is different, a productive data governance strategy might include some of the following steps:

Steps to an effective data governance strategy

  • Maintain executive oversight, which is critical to ensuring adoption across business lines
  • Adopt asset management practices to inventory key business data
  • Assign data domains to company leaders
  • Identify high-value use cases and the data needed to implement them
  • Define the means for monitoring and reporting on how data is secured, used, shared, stored, deleted, and audited
  • Establish metrics to quantify and assess the effectiveness of data governance

Getting started with proper data governance

Adopting new practices is often a straightforward process. The difficult part is estimating the value of adoption against the costs so that leadership can justify the investment. The steps below outline a strategy aimed at helping decision-makers see this value.

  • Assemble your governance team and socialize their roles.
  • Identify one use case with significant potential value and develop it.
  • Set realistic, measurable goals for demonstrating attained value.
  • Understand your stakeholders’ expectations and speak to them.
  • Choose technology that can handle multiple, differing use cases.

Challenges and Considerations

“Most governance programs today are ineffective. The issue frequently starts at the top, with a C-suite that doesn’t recognize the value-creation potential in data governance.”

McKinsey

Leadership support:

Funding data governance projects alone isn’t enough. Without active leadership involvement, promoting adoption beyond a few use cases will be very difficult.

Cross-functional alignment:

Aligning numerous, distributed teams across your company, from finance and legal to IT and marketing, requires a top-down effort.

Data silos vs data as a product:

If your company prefers data silos that are guarded by bespoke departmental policies, it will be difficult to apply the value of a department’s data to other business units. Encouraging these data owners to act as stewards starts with promoting data as both an asset and a product. To drive this perspective, you need a vision with a clear message that reaches across the company.

Deferring security and compliance:

Most transformative projects stall when implementing security after other core practices and processes have been established. We know from decades of experience that waiting for “the right time” leads to prohibitive costs down the road, costs that can reduce the value proposition of new use cases. It can even make the expense of compliance mandates so high that new and repeated calls for exceptions and exemptions become an everyday occurrence.

Real-time data governance with Confluent

Confluent offers a variety of insights on applying real-time data streaming to various industries.

Confluent’s Stream Governance – Real-Time Data Governance in an Event-driven world

Legacy solutions on the market today are designed with storage-centric, batch-oriented workloads in mind. As a result, they can’t address the needs of data governance for guiding robust streaming data and event-driven architectures. Confluent’s Stream Governance offering is specifically designed for this use case.

With Stream Governance, companies can bring together current and historical business data to create and manage event-driven, real-time solutions. Bringing data together across the organization, keeping it in motion, and unlocking its value requires the right tools for data stewards and governors alike to visualize and communicate these changes.

Confluent’s Stream Governance supports three key capabilities for data streams: lineage for tracing, a catalog service for discovering and reusing streams, and a data quality service for maintaining data integrity. The descriptions further below offer a bit more detail.

Within Stream Governance, this view of a lineage map traces the source of Kafka topic content

How Confluent works

Data as a strategic asset is quickly becoming the standard for data-driven businesses. How will you put your data in motion, and protect it at the same time?

Stream Lineage

Insight into complex data relationships and interactive maps of event streams so you can uncover deeper governance insights

Stream Catalog

Self-service data discovery so you can search, classify, and organize event streams for higher productivity and increased collaboration

Schema Registry

Real-time data integrity even as you deliver event streams to the business and scale your data in motion