In today's data-driven world, businesses are compelled to expand their data capabilities to cater to the evolving needs of their customers. In addition, with data being produced at an unprecedented rate, businesses are finding it imperative to share this data externally. However, they need to ensure the data is of the highest quality. To keep up with these challenges, businesses are seeking solutions that enable them to share trusted data externally with just a few clicks, without compromising security or reliability. As a result, data integration and management have become crucial components of modern-day business operations. Businesses are exploring new approaches to handle data that offer flexibility, scalability, and ease of use to deliver a seamless data-sharing experience.
Keeping the needs of modern businesses in mind, we are excited to introduce the latest features of Confluent Cloud. These new functionalities enable businesses to connect, process, and share their trusted data faster than ever before.
Here’s an overview of the latest features—read on for more details:
Join us to see the new features in action in the Q2 Launch demo webinar.
As data streaming becomes more ubiquitous and businesses grow their workloads, teams, and use cases on Apache Kafka, engineering leaders often feel increasing pressure from complicated and costly cluster scaling, intensifying platform availability and durability risks, and increasingly unpredictable end-to-end latency.
Over the past 15 years, many innovations have addressed similar growing pains in other infrastructure with cloud-native solutions, just as S3 is to NFS and Snowflake is to Teradata. A truly cloud-native service doesn’t just package open source software on Kubernetes; it takes advantage of cloud infrastructure’s scalability and versatility to deliver a better customer experience while abstracting away the complexity and burden of managing the cloud.
Therefore, we invested five million hours to craft a truly cloud-native experience for our customers and created Kora, the Apache Kafka engine built for the cloud. With serverless abstraction, automated operations, service decoupling, and global availability, Kora brings GBps+ elastic scaling, guaranteed reliability, and supercharged performance to Confluent Cloud, so that you can:
Scale up and down to handle any workload spike and retention requirement more than 10x faster and easier
Offload Kafka maintenance and operational burdens with 10x more availability, 99.99% uptime SLA, and built-in data durability
Speed up real-time analytics and customer experiences with predictable low latency, sustained across time
Kora still has Apache Kafka at its heart and is 100% compatible with the Kafka API. It is embedded in Confluent Cloud and powers 30,000+ Confluent Cloud clusters globally. For more information on what’s under the hood of Kora, check out our announcement blog post.
A data contract is a formal agreement between upstream and downstream components on the structure and semantics of data in motion. One critical part of enforcing data contracts is a set of rules or policies that ensure data streams are high quality, fit for consumption, and resilient to schema evolution over time.
Consider a scenario where a company collects customer data, including a social security number field. Even if the schema of a message is structurally correct, the social security number field may contain an invalid value, which presents a problem for downstream applications that use the social security number field to derive customer identity.
To address this problem, Confluent's Stream Governance suite now includes Data Quality Rules to better enforce data contracts, enabling users to implement customizable rules that ensure data integrity and compatibility and quickly resolve data quality issues. With Data Quality Rules, schemas stored in Schema Registry can be augmented with several types of rules, such as:
Domain Validation Rules: These rules validate the values of individual fields within a message based on a boolean predicate. Domain validation rules can be defined using Google Common Expression Language (CEL), which implements common and simple semantics for expression evaluation.
Event-Condition-Action Rules: These rules trigger follow-up actions upon the success or failure of a Domain Validation rule. Users can execute custom actions based on their specific requirements or leverage predefined actions to return an error to the producer application or send the message to a dead-letter queue.
Complex Schema Migration Rules: These rules simplify schema changes by transforming topic data from one format to another upon consumption. This enables consumer applications to continue reading from the same topic even as schemas change, removing the need to switch over to a new topic with a compatible schema.
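To make the first two rule types concrete, here is a minimal, self-contained Python sketch of a domain validation rule paired with a follow-up action. It is an illustration of the idea only, not the Schema Registry API: in Confluent Cloud the predicate would be a CEL expression attached to a schema, whereas the names used here (`DomainRule`, `RuleCheckedTopic`, `ssn_is_well_formed`) and the SSN format check are assumptions made for this example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical predicate for the SSN example: nine digits with optional
# dashes (AAA-GG-SSSS). A real rule would be a CEL expression on a schema.
_SSN = re.compile(r"\d{3}-?\d{2}-?\d{4}")


def ssn_is_well_formed(message: dict) -> bool:
    """Boolean predicate over a single field of the message."""
    return _SSN.fullmatch(str(message.get("ssn", ""))) is not None


@dataclass
class DomainRule:
    name: str
    predicate: Callable[[dict], bool]
    on_failure: str = "ERROR"  # "ERROR" rejects the write, "DLQ" diverts it


@dataclass
class RuleCheckedTopic:
    """In-memory stand-in for a topic whose writes are checked by rules."""
    rules: List[DomainRule]
    records: List[dict] = field(default_factory=list)
    dead_letters: List[Tuple[str, dict]] = field(default_factory=list)

    def produce(self, message: dict) -> bool:
        """Apply every rule; return True only if the message was accepted."""
        for rule in self.rules:
            if rule.predicate(message):
                continue
            if rule.on_failure == "DLQ":
                # Follow-up action: park the message in a dead-letter queue.
                self.dead_letters.append((rule.name, message))
                return False
            # Follow-up action: surface an error to the producer.
            raise ValueError(f"rule {rule.name!r} rejected the message")
        self.records.append(message)
        return True


topic = RuleCheckedTopic([DomainRule("ssn_format", ssn_is_well_formed, "DLQ")])
topic.produce({"name": "Ada", "ssn": "123-45-6789"})   # accepted
topic.produce({"name": "Bob", "ssn": "not-a-number"})  # diverted to the DLQ
```

The message for "Bob" is structurally valid (it has an `ssn` field of the right type) but fails the predicate, which is exactly the case that schema checks alone would miss.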
Combined with other Stream Governance features, Data Quality Rules allow organizations to deliver trusted data streams to downstream consumers and protect themselves from the significant impacts of poor-quality data.
For more information on Data Quality Rules and Data Contracts, please refer to our documentation.
Every organization has unique data architecture needs, which require building custom connectors or modifying existing ones to integrate home-grown systems, custom applications, and the long tail of less popular data systems with Kafka. However, teams typically have to self-manage these customized connectors and take on non-differentiated infrastructure responsibilities and risks of downtime.
We’re excited to introduce custom connectors that enable you to bring your connectors to the cloud so that you don’t have to manage Connect infrastructure. With custom connectors, you’re able to:
Quickly connect to any data system using your own Kafka Connect plugins without code changes
Ensure high availability and performance using logs and metrics to monitor the health of your connectors and workers
Eliminate the operational burden of provisioning and perpetually managing low-level connector infrastructure
Custom connectors join our portfolio of over 70 pre-built and fully managed connectors on Confluent Cloud to cover all data systems and apps for any streaming use case.
To get started, simply upload each custom plugin once in the console for any user in your org to access it from the connector catalog page and configure and provision connector instances. Custom connectors support custom single message transforms (SMTs) for on-the-fly data transformations, further tailoring the connector to fit your specific use case. After launching the connector, leverage built-in logs and metrics for diagnostics, monitoring, and debugging. You can view both connector and connect worker-level logs in a Kafka topic, accessible through the Kafka API / CLI, the logs page in the console, or a connector like the Elasticsearch Service Sink connector. You can also view connector task status, CPU, and memory usage from the metrics page to understand how your custom connectors are performing.
We share responsibilities with users to ensure that their custom connectors run successfully with high availability. Confluent takes on critical infrastructure activities, including resource provisioning, Connect cluster management, Schema Registry, monitoring, and security. Teams are responsible for providing and troubleshooting the connector plugin, versioning, patching, and overall connector management.
Custom connectors are generally available on AWS in five regions: us-east-1, us-east-2, us-west-2, eu-west-1, and eu-central-1. Learn more about how to write, package, upload, and run a custom connector by checking out our latest tutorial on Confluent Developer.
Businesses in a digital-first world not only need streaming data pipelines to connect data systems internally for informed decision-making, but also need to share real-time data externally with other business units, vendors, partners, and customers. Data sharing is a business necessity, but common methods like flat file sharing were designed for data at rest. Using them to share data in motion results in out-of-sync and stale data, as well as scalability challenges and security concerns.
That’s why we’ve built Stream Sharing, the easiest and safest way to share streaming data across organizations. Organizations leveraging Stream Sharing will be able to:
Easily exchange real-time data without delays in a few clicks directly from Confluent to any Kafka client
Safely share and protect your data with robust authenticated sharing, access management, and layered encryption controls
Trust the quality and compatibility of shared data by enforcing consistent schemas across users, teams, and organizations
Stream Sharing enables Confluent Cloud users with the right permissions to share their Kafka topics with any data recipient by simply entering the recipient’s email address. Data streams are shared across organizations using the open source Kafka Consumer API while retaining all of the robust security controls. Recipients need to create a Confluent Cloud account, and schema-enabled topics require using Confluent Cloud Schema Registry from a Stream Governance package. The service is generally available and provided at no extra cost, with either party able to revoke access at any time.
Following Confluent’s Immerok acquisition earlier this year, the early access program for our fully managed service for Apache Flink has now opened to select Confluent Cloud customers. The program will enable customers to try the service and help shape our product roadmap by partnering with our product and engineering teams.
By bringing Flink to Confluent Cloud, customers can take advantage of Flink's powerful and versatile stream processing framework while offloading its complex day-to-day operations to the world's foremost data streaming experts. Our Flink service will employ the same product principles you’ve come to expect for Kafka:
Cloud-native: Eliminate the operational burden of managing Flink with a fully managed, cloud-native service that is simple, performant, and scalable
Complete: Leverage Flink fully integrated with Confluent’s complete feature set (e.g., Stream Governance, RBAC), enabling developers to build stream processing apps quickly, reliably, and securely
Everywhere: Seamlessly process your data everywhere it resides with a Flink service that spans across the three major cloud providers (AWS, GCP, and Azure)
If you are interested in participating in the Flink Early Access Program, be sure to apply today!
Kafka REST Produce API: The Kafka REST Produce API in Confluent Cloud enables developers to produce new messages easily without the need for the Kafka client library. This cloud-native solution uses HTTP to interact with Kafka topics and messages, making it flexible and language-agnostic, and ideal for scaling data streaming workloads.
CLI AsyncAPI with Stream Governance: Confluent Cloud's Stream Governance suite provides tooling for obtaining and importing AsyncAPI specifications into the cloud, enabling developers and architects to programmatically define topics, schemas, tags, and more using the open source standard for event-driven architectures.
HITRUST certification: Confluent Cloud is now HITRUST-certified, which is a “Gold Standard” for the healthcare industry. The HITRUST Common Security Framework (CSF) is a certifiable framework that leverages internationally accepted standards to help healthcare organizations and their providers demonstrate their security and compliance.
Bring-Your-Own-Key (BYOK) encryption: BYOK enables self-managed key encryption for Dedicated Kafka clusters, ensuring data privacy and integrity, now available on Azure, AWS, and Google Cloud.
Static Egress IP addresses: Static Egress IP addresses allow customers to achieve better network security and reliability, through consistent, static IP addresses for egress traffic from their Kafka clusters. To learn more, see Use Static IP addresses on Confluent Cloud and Egress Static IP Addresses for Confluent Cloud Connectors.
Private DNS support: Private DNS support allows customers to simplify on-prem access through the most secure private networking options while avoiding security exceptions or any custom DNS implementation. This support is available on all three major cloud platforms—AWS, Azure, and Google Cloud.
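As a rough illustration of the Kafka REST Produce API mentioned in the list above, the Python sketch below builds (but does not send) an HTTP request that produces one JSON record using only the standard library. The endpoint path, request body shape, and Basic authentication reflect our understanding of the v3 REST API and should be checked against the current API reference; the host, the cluster ID `lkc-xxxxx`, the topic name, and the credentials are placeholders, and `build_produce_request` is a hypothetical helper.

```python
import base64
import json
import urllib.request


def build_produce_request(rest_endpoint, cluster_id, topic, value,
                          api_key, api_secret, key=None):
    """Build a POST request that produces one JSON record to a topic.

    Assumed endpoint shape: /kafka/v3/clusters/{cluster_id}/topics/{topic}/records
    """
    url = f"{rest_endpoint}/kafka/v3/clusters/{cluster_id}/topics/{topic}/records"
    body = {"value": {"type": "JSON", "data": value}}
    if key is not None:
        body["key"] = {"type": "JSON", "data": key}
    # HTTP Basic auth built from an API key and secret (placeholders below).
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_produce_request(
    "https://example.invalid",   # placeholder REST endpoint
    "lkc-xxxxx",                 # placeholder cluster ID
    "orders",                    # placeholder topic
    {"order_id": 42, "status": "NEW"},
    api_key="KEY",
    api_secret="SECRET",
)
# To actually send it against a real endpoint:
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read())
```

Because the request is plain HTTP and JSON, the same call can be made from any language or tool that can issue a POST, which is the point of not needing a Kafka client library.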
Ready to get started? Remember to register for the Q2 ʼ23 Launch demo webinar on May 31 where you’ll learn firsthand from our product managers how to put these new features to use.
And if you haven’t done so already, sign up for a free trial of Confluent Cloud. New sign-ups receive $400 to spend within Confluent Cloud during their first 30 days. Use the code CL60BLOG for an additional $60 of free usage.*