
Introducing Data Portal in Stream Governance


Today, we’re excited to announce the general availability of Data Portal on Confluent Cloud. Data Portal is built on top of Stream Governance, the industry’s only fully managed data governance suite for Apache Kafka® and data streaming. The developer-friendly, self-service UI provides an easy and curated way to find, understand, and enrich all of your data streams, enabling users across your organization to build and launch streaming applications faster.

Data Portal is a self-service interface for discovering, exploring, and accessing Kafka topics on Confluent Cloud.

Building streaming applications with open source Kafka can be slow and inefficient when there's a lack of visibility into what data exists, where it comes from, and who can grant access. Data Portal leverages the capabilities of Stream Catalog and Stream Lineage to empower data users to interact with their organization’s Kafka data streams safely, efficiently, and collaboratively.

With Data Portal, you can:

  • Search and discover existing topics across the organization with the help of topic metadata and get a drill-down view of the data they hold. 

  • Seamlessly and securely request access to topics through an approval workflow that connects the data user with the data owner, who can approve the request.

  • Set up clients and query data with Apache Flink® to enrich your topics and build new streaming applications and pipelines.

Ready to get started? If you already use Confluent Cloud, you can access Data Portal simply by logging in to your account. You must have a Stream Governance package enabled for the cloud environments you want displayed in Data Portal. Check out the quick start guide to see how you can get Stream Governance up and running in just a few clicks. 

If you’re not yet using Confluent Cloud, you can try Data Portal for free by creating an account and setting up your first cluster. 

To learn more, join us for the Stream Governance webinar with a technical demo that showcases the full capabilities of Data Portal and Stream Governance on Confluent Cloud.

Let’s dive into how to get up and running with Data Portal.

Search and discover existing topics 

When you log in to Confluent Cloud, the Data Portal tab shows a unified view of all available Kafka topics, grouped by environment. You can search for topics by name or tag, or browse topics by tag, creation date, and last-modified date.

Search for topics by name or tag.

Each topic card summarizes the topic with the name, data location (environment, cluster, cloud provider, and region), description, tags, and when it was created or modified.

See a preview of each topic with tags and metadata.

However, this comprehensive view may be overwhelming if you’re sifting through hundreds of topics. Data Portal’s search feature allows you to home in on the most relevant data for your use case or project. Search for topics by name or tag, or filter by tags, business metadata, cloud provider, region, and other metadata.

Filter topics by tags, cloud provider, region, or other metadata.
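To make the search-and-filter behavior concrete, here is a minimal in-memory sketch. It is not the Data Portal or Stream Catalog API: the `TopicCard` class, the `search_topics` function, and the sample topics are all hypothetical, and the sketch assumes that free text matches names or tags while every selected filter must hold.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TopicCard:
    """Hypothetical stand-in for the metadata shown on a topic card."""
    name: str
    environment: str
    cloud: str
    region: str
    tags: frozenset = field(default_factory=frozenset)


def search_topics(cards, text="", tags=(), cloud=None, region=None):
    """Return cards whose name or tags contain `text` and that pass every filter."""
    needle = text.lower()
    wanted = set(tags)
    results = []
    for card in cards:
        matches_text = (
            not needle
            or needle in card.name.lower()
            or any(needle in tag.lower() for tag in card.tags)
        )
        if not matches_text:
            continue
        if not wanted <= card.tags:  # every requested tag must be present
            continue
        if cloud is not None and card.cloud != cloud:
            continue
        if region is not None and card.region != region:
            continue
        results.append(card)
    return results


# Made-up sample data for illustration only.
catalog = [
    TopicCard("orders", "prod", "AWS", "us-east-1", frozenset({"PII", "sales"})),
    TopicCard("orders_enriched", "prod", "GCP", "us-central1", frozenset({"sales"})),
    TopicCard("clickstream", "dev", "AWS", "us-west-2", frozenset({"web"})),
]

print([c.name for c in search_topics(catalog, text="orders", cloud="AWS")])
print([c.name for c in search_topics(catalog, tags=["sales"])])
```

Combining a text match with exact-match filters mirrors the flow described above: start broad with a name or tag, then narrow by cloud provider or region.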

Once you’ve identified a topic that piques your interest, click on the card to learn more about it. Clicking on the card reveals a side panel with additional metadata. In the top section of this panel, you’ll see the location of the topic, its tags, and a description of the data it stores. Below that, the schema and its fields show you how the data in the topic is structured, without exposing the actual data. You’ll also find a link to the lineage of the topic, information about the owner of the topic, business metadata appended to the topic, and finally, the technical metadata of the topic (created date, retention period, etc.).


When you click on the Stream Lineage section, you get a complete, end-to-end data flow visualization of the upstream and downstream components from the topic.

Expand the Stream Lineage graph for an end-to-end view of the data streams.
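One way to think about the upstream and downstream view is as a walk over a directed graph of data flows. The sketch below is illustrative only: the edge list, the component names, and the `reachable` function are hypothetical and say nothing about how Stream Lineage is implemented.

```python
from collections import deque

# Hypothetical lineage edges: each pair means "data flows from left to right".
EDGES = [
    ("postgres_source_connector", "orders"),
    ("orders", "enrich_orders_query"),
    ("enrich_orders_query", "orders_enriched"),
    ("orders_enriched", "s3_sink_connector"),
    ("orders", "fraud_detector_app"),
]


def reachable(start, edges, direction="downstream"):
    """Breadth-first walk returning every node reachable from `start`."""
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")

    neighbours = {}
    for src, dst in edges:
        if direction == "upstream":
            src, dst = dst, src  # follow the edges backwards
        neighbours.setdefault(src, []).append(dst)

    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable("orders_enriched", EDGES, "upstream")))
print(sorted(reachable("orders", EDGES, "downstream")))
```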

Seamlessly request access to topics 

So you’ve identified the topic(s) you want to access. Now what? Instead of spending valuable cycles pinging various colleagues to find out who can grant access to the data you need, you can use the Data Portal to request access to the topic directly from the Confluent Cloud UI.

Clicking Request access on the topic side panel triggers an approval workflow that connects the user with the data owner via email (if a topic owner email was set on the topic metadata). Select the permissions you require and (optionally) leave a message for the approver describing your request.

Request access to topics with a self-service approval workflow within the Confluent Cloud UI.

Once the request is submitted, the topic owner will receive an email to review the request.

Email notifications let the topic owner know when a request is submitted.

When the topic owner clicks Review request in the email, they are redirected to the Confluent Cloud Access requests UI.  

Data owners and admins can view requests and manage access to topics in a single view.
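The request-and-review flow above can be pictured as a small state machine: a request starts as pending and is moved to approved or denied by someone allowed to review it. This is a rough sketch under assumptions, not Confluent's implementation. The `AccessRequest` class, the permission names, the email addresses, and the rule that each request is reviewed exactly once are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class AccessRequest:
    """Hypothetical model of a topic access request."""
    requester: str
    topic: str
    permissions: tuple      # e.g. ("read",); permission names are made up
    message: str = ""       # optional note for the approver
    status: Status = Status.PENDING

    def review(self, reviewer, allowed_reviewers, approve):
        """Approve or deny the request; only allowed reviewers may decide."""
        if reviewer not in allowed_reviewers:
            raise PermissionError(f"{reviewer} cannot review requests for {self.topic}")
        if self.status is not Status.PENDING:
            raise ValueError("request has already been reviewed")
        self.status = Status.APPROVED if approve else Status.DENIED
        return self.status


# Assumed reviewers: the topic owner plus an admin.
reviewers = {"owner@example.com", "admin@example.com"}

request = AccessRequest(
    requester="dana@example.com",
    topic="orders",
    permissions=("read",),
    message="Needed for the churn dashboard.",
)
print(request.status)
print(request.review("owner@example.com", reviewers, approve=True))
```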

This simple workflow provides important self-service capabilities for both data owners and data seekers. Streamlining access request management eliminates the need for administrators to manage permission assignments manually, while ensuring that security and access controls remain in place, a welcome sight as your users and topics grow into the tens, hundreds, and beyond!

Set up clients and query data with Flink

Let’s assume the data owner has granted you access to the topic. Now, when you click on the topic card, you can view the last message produced and set up a client or query the topic with Flink SQL directly from the UI.

View messages, set up a client, and query data with Flink.

Clicking Set up client will redirect you to the Clients page in the UI, which will walk you through setting up an application in your programming language of choice. 
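For orientation, a Python client configuration for a Confluent Cloud cluster typically looks something like the sketch below. The Clients page generates the real values for you; here the helper name `consumer_config`, the placeholder strings, the group ID, and the choice of `auto.offset.reset` are assumptions for illustration.

```python
def consumer_config(bootstrap_servers, api_key, api_secret, group_id):
    """Build a librdkafka-style consumer configuration for a SASL_SSL cluster."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": api_key,        # cluster API key
        "sasl.password": api_secret,     # cluster API secret
        "group.id": group_id,
        "auto.offset.reset": "earliest", # assumed: start from the oldest message
    }


# Placeholder values; substitute the ones shown for your own cluster.
config = consumer_config(
    bootstrap_servers="<BOOTSTRAP_SERVER>:9092",
    api_key="<API_KEY>",
    api_secret="<API_SECRET>",
    group_id="data-portal-demo",
)

# With the confluent-kafka package installed, the dict can be passed to a
# consumer, for example:
#   from confluent_kafka import Consumer
#   consumer = Consumer(config)
#   consumer.subscribe(["orders"])   # "orders" is a made-up topic name
print(sorted(config))
```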

Clicking Query takes you to the new Flink SQL workspace with a topic query ready to run. 

Process and enrich data streams with Confluent’s new fully managed Flink service.
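The exact query that the workspace pre-populates is not shown here, but a starter query over a topic might resemble the one built below. This is a sketch under assumptions: the helper names, the default row limit, and the example environment and cluster names are hypothetical. It relies on Flink SQL's backtick quoting for identifiers and on Confluent Cloud exposing environments as catalogs, clusters as databases, and topics as tables.

```python
def quote_identifier(name):
    """Backtick-quote a SQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def starter_query(topic, catalog=None, database=None, limit=10):
    """Build a simple SELECT over a topic's table, optionally fully qualified."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    parts = [p for p in (catalog, database, topic) if p is not None]
    table = ".".join(quote_identifier(p) for p in parts)
    return f"SELECT * FROM {table} LIMIT {limit};"


print(starter_query("orders"))
print(starter_query("orders", catalog="prod", database="cluster_0", limit=5))
```

Quoting every identifier keeps the query valid for topic names that contain characters such as dots or hyphens.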

Check out the latest blog post on our serverless Flink service to learn more about how you can effortlessly filter, join, and enrich your Kafka data streams in-flight. 

What’s next?

We often hear how critical Stream Governance is to our customers’ data-in-motion journey, and with Data Portal, we’re excited to bring an enhanced user experience to the product. We look forward to expanding our Stream Governance suite further and adding exciting new features in the quarters to come.

Check out Data Portal in Confluent Cloud today. If you haven’t already, sign up for a free trial of Confluent Cloud and create your first cluster to explore new topics and create streaming pipelines and applications.

Interested in learning more? Be sure to register for the upcoming Stream Governance webinar, where we’ll share a technical demo that showcases the full capabilities of Data Portal and Stream Governance on Confluent Cloud.

  • Olivia is a product marketer at Confluent, focused on driving the adoption of Confluent Cloud and Confluent Platform and empowering businesses to harness the full potential of data streaming. Before Confluent, she led programs and go-to-market initiatives across infrastructure, storage, and edge technologies at Red Hat.

  • David Araujo is the Director of Product Management for Stream Governance at Confluent. As an engineer turned product manager, he has worked across multiple industries and continents over the years, mostly in the data management and strategy space. He holds master’s and bachelor’s degrees in computer science from the University of Evora in Portugal.
