Modern businesses leverage Apache Kafka® to build event-driven architectures that connect data systems in real time and deliver on customer expectations of immediacy. However, developing the underlying streaming data pipelines comes with time and resource constraints, particularly when working with Kafka Connect to build connectors for each source and sink system across your tech stack.
Developers often turn to the ecosystem of connectors built by Confluent, the Kafka community, or partners to save time and engineering resources they would have otherwise needed to build and test each connector themselves. Downloading the connector .jar file may seem easy, but the real challenge comes from managing the inner workings of both the Connect workers and the connector plug-ins running on top. The responsibilities of configuring, deploying, scaling, and ensuring no downtime for the connector now fall on your plate—and just like with Kafka, you could find yourself preoccupied with Connect-related issues and operations.
In other words, self-managing connectors isn’t free. It demands infrastructure costs for hosting the Connect workers, valuable engineering and DevOps time for scaling and maintenance, and responsibility for downtime and business disruption. These are time and resources that could otherwise be focused on more strategic projects rather than taking on the unnecessary operational burdens and risks of non-differentiating data integration activities.
That’s why we’ve invested heavily in building out our fully managed connector portfolio for Confluent Cloud—adding new connectors, networking options, productivity features, and streamlined configurations. Our 70+ fully managed connectors provide the speed, simplicity, and reliability you need when streaming data in and out of Kafka. They take just a few clicks to configure through the Confluent Cloud UI or CLI, then you’re off to the races without the ongoing ops burden.
Unlike other cloud-hosted Kafka services, Confluent Cloud connectors are fully managed across the entire stack so that you can leave lower-level infrastructure activities to the world’s foremost Kafka experts. Free yourself and your team to move up the stack and build streaming pipeline use cases and applications. You can also use Stream Designer, a visual pipeline builder, to make pipeline development even more seamless.
Let’s take a closer look at the benefits of using fully managed connectors, hear from a customer, and walk through a demo of how to configure and launch them through Stream Designer.
Our fully managed connectors help you breeze through real-time data integration and remove the ongoing operational burdens associated with self-managing connectors. It’s the fastest and easiest way to break data silos and stream your data across the organization. Confluent does the heavy lifting by taking on critical Connect activities like managing internal topics, worker configs, monitoring, and security. Just provide the connector configurations and everything runs for you, eliminating the need for hands-on management and maintenance post-launch. As your organization’s requirements and data stack evolve over time, you can easily add or replace connectors to keep your pipelines up to date.
Confluent offers the largest portfolio of fully managed Kafka connectors in the market, including connectors for popular systems like Amazon S3, Azure Blob Storage, Databricks Delta Lake, Google BigQuery, MongoDB Atlas, Snowflake, and more! Select the ones you need, and we’ll guide you through each step of the configuration and recommend default properties to speed you through to launch.
Confluent’s fully managed connectors also come with built-in productivity features like single message transforms (SMTs), exposed Connect logs, and data preview. SMTs enable you to perform lightweight data transformations, like masking and filtering, in flight within the connector itself, while exposed Connect logs provide contextual information to simplify debugging and troubleshooting. Data preview, uniquely available with Confluent Cloud connectors, lets you test a source connector’s output prior to launching the connector. This helps with iterative testing so that you can confidently launch connectors into production.
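To make the SMT idea concrete, here is a sketch of how a masking transform can be attached to a connector configuration. The transform shown (`MaskField`) is a standard Kafka Connect SMT; the transform alias and field name (`maskSsn`, `ssn`) are hypothetical placeholders, not values from the demo:

```json
{
  "transforms": "maskSsn",
  "transforms.maskSsn.type": "org.apache.kafka.connect.transforms.MaskField$Value",
  "transforms.maskSsn.fields": "ssn"
}
```

With this fragment added to a connector’s configuration, the named field is masked in each record before it is written to (or read from) Kafka, so sensitive values never leave the connector unredacted.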
If you’re looking to build streaming data pipelines quickly and efficiently, Confluent Cloud not only offloads your connector management but also provides Stream Designer, the visual builder for building streaming data pipelines. Stream Designer provides a point-and-click, graphical canvas that speeds you through building, testing, and deploying production-ready pipelines in minutes. Instead of configuring each component of the pipeline separately, you can visualize what you’re building as you build it, all within a unified, end-to-end view. Stream Designer also simplifies stream processing, a critical component of intelligent, streaming pipelines, by representing common ksqlDB operations like filter, join, and aggregate as pre-built blocks you can easily configure.
Stream Designer supports our entire portfolio of fully managed connectors along with their SMT capabilities. Pipelines on Stream Designer are built natively on Kafka, with pipeline definitions translating to ksqlDB code (i.e., SQL queries) under the hood. This gives you the flexibility to edit the SQL directly or import/export pipelines as code for easy transfer across environments. If you’re thinking about migrating from self-managed connectors to our fully managed ones, using Stream Designer to quickly launch connectors and build out the rest of your pipelines can make that process seamless.
Let’s look at how easy it is to launch fully managed connectors on Stream Designer.
This demo covers a simple but common analytical pipeline scenario where you want to share the latest changes from Salesforce with Google BigQuery for further analysis. We’ll build this out using Stream Designer and our fully managed Salesforce PushTopic source and Google BigQuery sink connectors.
Start by navigating to the Stream Designer page from the left menu in Confluent Cloud. Click “create a new pipeline” and select a ksqlDB cluster to use with the pipeline. There are a few different starting point options for creating a pipeline in Stream Designer, but for this demo, we’ll start by creating a source connector.
To get started, click on the connector tile to open the configuration panel. You can search for any of our fully managed connectors; in this case, we’ll select the Salesforce PushTopic source connector, which allows us to subscribe to create, update, delete, and undelete events that occur on Salesforce objects.
Next, provide a name for the Kafka topic to write the data to. We’ll use the name “accounts” because we are interested in changes to the account object. Then, fill out the following tabs with your Kafka and Salesforce credentials. On the configuration tab, choose the output format of your choice, the Salesforce object, and the Salesforce PushTopic to subscribe to. If you’d like to add any SMTs, drop down to the “advanced configuration” fields to do so. Finally, input the number of tasks, and configuration is complete!
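The settings entered in the UI correspond to a connector configuration roughly like the following sketch. The credential values are placeholders, and the exact property names may vary by connector version, so treat this as illustrative and check the connector’s documentation for the authoritative list:

```json
{
  "connector.class": "SalesforcePushTopicSource",
  "name": "SalesforcePushTopicSource_accounts",
  "kafka.api.key": "<kafka-api-key>",
  "kafka.api.secret": "<kafka-api-secret>",
  "kafka.topic": "accounts",
  "salesforce.username": "<salesforce-username>",
  "salesforce.password": "<salesforce-password>",
  "salesforce.password.token": "<salesforce-security-token>",
  "salesforce.object": "Account",
  "salesforce.push.topic.name": "<pushtopic-name>",
  "output.data.format": "JSON",
  "tasks.max": "1"
}
```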
At this point, the connector hasn’t been provisioned yet because we haven’t activated the pipeline. So let’s hover over the connector node and add a Kafka topic. We’ll give the topic the same name that we’ve provided in the connector configurations (“accounts”), then activate the pipeline to launch the connector and create the topic.
You can check the status of each of the pipeline components at the bottom of the block. Click on the “accounts” topic to see the messages flowing in from Salesforce.
Now, let’s add some stream processing to the pipeline. You can choose from a number of common ksqlDB operations like filter, join, and group by, but in our case, we’ll just add a filter to the Salesforce object type to filter for accounts. Make sure to add an output topic to write the results of the filter to.
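Because Stream Designer blocks translate to ksqlDB under the hood, the filter step above corresponds to a persistent query along these lines. The stream and column names here are illustrative placeholders, not the exact identifiers Stream Designer generates:

```sql
-- Illustrative sketch: keep only Account-object change events
CREATE STREAM accounts_filtered AS
  SELECT *
  FROM accounts
  WHERE object_type = 'Account'
  EMIT CHANGES;
```

The results of the filter land in the output topic backing `accounts_filtered`, which the sink connector can then read from.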
As the last step of the pipeline, we’ll add a Google BigQuery sink connector to send the accounts to our data warehouse. Configuring a sink connector follows the same steps as the source connector: select a topic to read from, provide your Kafka and GCP credentials, set the number of tasks, and save the configuration. Now, we’ll need to re-activate the pipeline to launch the sink connector and get data flowing through to BigQuery.
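As with the source connector, the sink settings map to a configuration sketch like the one below. The GCP values are placeholders, and the property names should be verified against the BigQuery sink connector’s documentation:

```json
{
  "connector.class": "BigQuerySink",
  "name": "BigQuerySink_accounts",
  "kafka.api.key": "<kafka-api-key>",
  "kafka.api.secret": "<kafka-api-secret>",
  "topics": "accounts_filtered",
  "keyfile": "<gcp-service-account-key-json>",
  "project": "<gcp-project-id>",
  "datasets": "<bigquery-dataset>",
  "input.data.format": "JSON",
  "auto.create.tables": "true",
  "tasks.max": "1"
}
```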
Finally, let’s step into our BigQuery project to verify that new records are coming in. There are 25 records of account updates. If we go back to Salesforce and make a change to an account, we can validate that our pipeline is working by re-running the count query and voila—we now have 26 records, reflecting the latest change in Salesforce.
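The verification step amounts to a simple count query in BigQuery; the project, dataset, and table names below are placeholders for your own:

```sql
-- Count the account-update records landed by the sink connector
SELECT COUNT(*) AS record_count
FROM `my-project.my_dataset.accounts`;
```

Re-running this query after making a change in Salesforce should show the count increment once the pipeline delivers the new event.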
Self-managing connectors brings major time and resource challenges and unnecessary downtime risks that shift your team’s focus away from more strategic projects and innovations. Confluent’s fully managed connectors alleviate these burdens as the fastest and easiest way to stream your data into and out of Kafka. Leveraging fully managed connectors also unlocks access to Stream Designer to further boost your overall pipeline development velocity.
What are you waiting for? Try out our fully managed connectors today!