The rise of fully managed cloud services fundamentally changed the technology landscape and introduced benefits like increased flexibility, accelerated deployment, and reduced downtime. Confluent offers a portfolio of 80+ fully managed connectors that enables quick, easy, and reliable integration of Confluent Cloud with popular data sources and sinks, connecting your entire system in real time. However, with the adoption of cloud-based technologies came opportunities for data security breaches, DDoS attacks, and spam if connections are not secure.
With enterprise architectures becoming more complex by the day due to hybrid and/or multi-cloud environments, as well as the use of multiple vendors for data systems, the path to secure networking isn’t always easy. That’s why a common question we receive from users of our fully managed connectors is how to securely connect to their data sources and sinks. The answer depends on several factors: where the data source/sink is located (on-prem vs. cloud), whether the source/sink comes from a cloud provider or a third-party service (AWS/Azure/GCP vs. MongoDB/Snowflake), and whether the user wants to connect over public or private networks.
In this blog post, we will examine the features available for secure networking on Confluent Cloud in both public and private networking scenarios, with a deep dive into two features for private networking setups: DNS Forwarding and PrivateLink Egress Access Points.
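Secure public networking setups: public egress IPs, and AWS Gateway endpoints and Azure VNet Service endpoints.
Secure private networking setups: DNS Forwarding (with VPC/VNet peering or Transit Gateway) and Egress Access Points (with PrivateLink).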
One of the most common scenarios when using fully managed connectors is connecting to a source/sink using its public endpoint. All data in transit flowing to and from Confluent Cloud is encrypted with Transport Layer Security (TLS). Depending on your setup, you can also use public egress IPs and gateway/service endpoints.
To provide an additional layer of security when connecting over a public endpoint, we offer public egress IPs for publicly networked Confluent Cloud clusters across all three major cloud providers. With this feature, you can get a list of IP addresses that the fully managed connector will use to connect to your data source or sink. You can then set up a firewall rule to limit access to your sources or sinks only from these IP addresses, drastically reducing the attack surface for your source or sink systems.
Using IP filtering to secure your source/sink systems is relatively easy to set up as many managed data systems today offer the ability to restrict the IP addresses that can access these systems.
To obtain the egress IP addresses, make sure your Kafka cluster is publicly available and configured with the “Internet” networking type, then navigate to the Cluster Overview > Networking link in the sidebar. On the networking page, you should see a list of IPs under the “Egress IPs” section. Copy the IP addresses listed there and add them to the allowlist (firewall rules) of your source and sink systems so that your fully managed connectors have access to them.
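To make the allowlisting idea concrete, here is a minimal Python sketch of the check a firewall rule effectively performs: accept a connection only if its source address is one of the copied egress IPs. The addresses below are placeholders from the reserved documentation ranges, not real Confluent Cloud egress IPs, and the function names are hypothetical.

```python
import ipaddress

# Placeholder egress IPs (documentation ranges); replace with the values
# copied from the "Egress IPs" section of your cluster's networking page.
EGRESS_IPS = ["192.0.2.10", "192.0.2.11", "198.51.100.7"]


def build_allowlist(ips):
    """Turn each egress IP into a single-host network (a /32 for IPv4)."""
    return [ipaddress.ip_network(ip) for ip in ips]


def is_allowed(source_ip, allowlist):
    """Return True only if source_ip falls inside one of the allowlisted networks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowlist)


allowlist = build_allowlist(EGRESS_IPS)
print(is_allowed("192.0.2.10", allowlist))   # a listed egress IP
print(is_allowed("203.0.113.5", allowlist))  # any other address
```

In practice you would express the same rule in your data system's own firewall or IP access list rather than in application code; the sketch only shows the logic of restricting access to a fixed set of addresses.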
AWS Gateway endpoints and Azure VNet Service endpoints provide direct and secure connectivity to select first-party cloud services on AWS and Azure, respectively. Connectivity is established to the service’s public endpoint, but all traffic remains within the cloud provider’s network backbone and never traverses the public internet.
We support the following endpoints with fully managed connectors:
Gateway and Service endpoints are enabled by default on Confluent’s side so there are no additional steps needed to leverage them. You can use these secure endpoints regardless of your Confluent Cloud cluster’s networking type as long as the fully managed connector is running within the same cloud as the target service and set up to connect to the service’s public endpoint.
For additional information refer to applicable documentation about AWS Gateway Endpoints and Azure VNet Service endpoints.
For users who want to connect to their data systems with private endpoints, there are a couple of different options depending on the source/sink system you are connecting to and the networking type for your Confluent Cloud cluster. While setting up private networking is more involved, this will enable you to use fully managed connectors if your company has policies in place to restrict traffic over the public internet due to data security concerns.
There are two common scenarios we’ve seen our customers face when trying to enable private endpoint connections on clusters that use VPC peering (AWS, GCP), VNet peering (Azure), or Transit Gateway (AWS).
The first scenario is where you connect directly to the data source or sink’s private IP address, or use a fully qualified domain name (FQDN) that is publicly resolvable. In this case, no additional setup is required beyond the initial peering or Transit Gateway setup, and all traffic will traverse the private network.
The second scenario is when you are connecting to a private endpoint using an FQDN that isn’t publicly resolvable. In this scenario, you can use DNS Forwarding to privately resolve the FQDN across the private network connection. DNS Forwarding enables connectors to resolve FQDNs by forwarding the DNS lookup to a customer’s privately hosted DNS zone or server. You’ll obtain your DNS server’s IPs and associated domain names that you’d like to forward, pass that information into your DNS Forwarding setup, and then set up connectors like normal.
DNS Forwarding is available on AWS and Azure with GCP support coming soon.
Let’s walk through an example of what this looks like with the HTTP source connector. In this example, our HTTP server is hosted on an AWS EC2 instance, with Route53 used as the DNS server.
To start, ensure that you already have a peering/Transit Gateway network in Confluent Cloud with at least one active connection, then click on the “DNS Forwarding” tab within the network page. Click “Add configuration” and fill in two fields: the DNS Server IPs (when using Route53, these are the inbound endpoint IPs) and the Domains to forward (all associated subdomains will be forwarded as well). Click “Save”. DNS Forwarding is set up once the status changes from “Provisioning” to “Ready”, which takes a few minutes. From there, navigate to the connectors tab to create the HTTP source connector as usual.
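The rule that a forwarded domain also covers all of its subdomains can be illustrated with a short Python sketch. This is only an illustration of suffix matching on domain labels, not Confluent's implementation; the function name and the example domains are hypothetical.

```python
def should_forward(fqdn, forwarded_domains):
    """Return True if fqdn equals a forwarded domain or is a subdomain of one.

    Matching is case-insensitive and done on whole labels, so
    "notinternal.example.com" does not match "internal.example.com".
    """
    name = fqdn.rstrip(".").lower()
    for domain in forwarded_domains:
        zone = domain.rstrip(".").lower()
        if name == zone or name.endswith("." + zone):
            return True
    return False


# Hypothetical private zone entered under "Domains to forward".
domains = ["internal.example.com"]

print(should_forward("http-server.internal.example.com", domains))  # subdomain of the zone
print(should_forward("notinternal.example.com", domains))           # different label, not a subdomain
```

Lookups that match would be sent to the DNS server IPs you configured, while everything else would continue to resolve as before.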
PrivateLink (AWS PrivateLink, Azure Private Link, GCP Private Service Connect) is a type of secure private networking from each of the cloud providers that allows for unidirectional access to a PrivateLink Service provider (i.e. Confluent Cloud).
To support outbound connectivity from Confluent Cloud, you can leverage Egress Access Points, which allow Confluent’s fully managed connectors to connect to data sources and sinks using private endpoints. Under the hood, Egress Access Points connect directly to a service provider via AWS VPC Interface Endpoints or Azure Private Endpoints. Each Egress Access Point connects to a single service, ensuring that our fully managed connectors only have access to what is required.
Egress Access Points are available for Dedicated clusters on AWS, with Azure support coming soon. Dedicated clusters on GCP and Enterprise clusters on all clouds will be supported later this year.
Let’s walk through what this end-to-end setup looks like with the Amazon S3 Sink Connector. To ensure that S3’s private endpoint will be used, make sure to specify the “Store URL” (“store.url”) configuration when setting up the connector. You can find the service name to use for AWS services in AWS’s documentation, under AWS services PrivateLink support.
Start by validating that you already have a PrivateLink network created in Confluent Cloud, then navigate to the “Egress Access Point” tab on the network page. Name the Access Point and enter the name of your PrivateLink service. If the FQDN is not publicly resolvable, create DNS records as well; this step is not required for S3. When configuring the S3 sink connector, make sure to specify the “store.url” config so that the connector connects to S3’s private endpoint.
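As a rough sketch of the two values this walkthrough asks for, the Python snippet below builds the PrivateLink service name for S3 and a connector configuration that sets “store.url”. The `com.amazonaws.<region>.s3` pattern follows AWS's naming for the S3 service; everything else (the region, the endpoint URL, the connector name, topic, and bucket, and every property key other than “store.url”) is an assumption for illustration and should be checked against the S3 Sink connector documentation.

```python
import json


def s3_service_name(region: str) -> str:
    """PrivateLink service name for Amazon S3 in the given AWS region."""
    return f"com.amazonaws.{region}.s3"


# Placeholder values (assumptions): substitute your own region and the
# DNS name of the private endpoint behind your Egress Access Point.
REGION = "us-east-1"
S3_PRIVATE_ENDPOINT_URL = "https://bucket.vpce-EXAMPLE.s3.us-east-1.vpce.amazonaws.com"

# Illustrative connector settings; only "store.url" comes from the text above.
# The other keys and values are assumed and may differ from the real schema.
connector_config = {
    "name": "s3-sink-private",
    "connector.class": "S3_SINK",
    "topics": "orders",
    "s3.bucket.name": "example-bucket",
    "store.url": S3_PRIVATE_ENDPOINT_URL,
}


def missing_store_url(config: dict) -> bool:
    """True if the config would fall back to S3's default (public) endpoint."""
    return not config.get("store.url")


print(s3_service_name(REGION))
print(missing_store_url(connector_config))
print(json.dumps(connector_config, indent=2))
```

A small check like `missing_store_url` can be useful in deployment scripts to catch configurations that would otherwise bypass the private endpoint.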
In this blog post, we covered various ways to use fully managed connectors on Confluent Cloud to securely connect to your existing data systems on AWS, Azure, GCP, and those hosted on-premises as well. Setting up public egress IPs or leveraging Service/Gateway endpoints makes it easy to securely connect to data systems when using public endpoints. For customers unable to use public endpoints, DNS Forwarding can be leveraged with VPC/VNet Peering or Transit Gateway while Egress Access Points can be leveraged alongside your PrivateLink setup for private networking options. Confluent is continually working with AWS, Azure, and GCP to provide additional secure networking options in the future.
Ready to get started? If you haven’t done so already, sign up for a free trial of Confluent Cloud to explore new features. New sign-ups receive $400 to spend within Confluent Cloud during their first 30 days. Use the code CL60BLOG for an additional $60 of free usage.*