Change data capture (CDC) is used to copy data across relational databases, enabling essential backend operations like data synchronization, migration, and disaster recovery. And now, with stream processing, you can build CDC pipelines that power event-driven applications and trusted data products, with fresh, processed data integrated across legacy and modern, distributed systems.
See how Confluent brings Apache Kafka® and Apache Flink® together so you can build streaming CDC pipelines and power downstream analytics with fresh, high-quality operational data.
Most organizations already use log-based CDC to turn database changes into events.
Building CDC pipelines with Kafka and Flink lets you unify your CDC workloads and batch analytics and eliminate processing silos. Instead of waiting on batch processing, taking on the costs of redundant processing, or relying on fragile pipelines, this architecture allows you to process change data once, in flight, and serve both operational and analytical consumers from the same streams.
With serverless Apache Flink® on the Confluent data streaming platform, you can shift processing left—before data ingestion—to improve latency, data portability, and cost-effectiveness.
AppDev teams can build data pipelines that unlock timely action
This includes shift-left data warehouse and data lake ingestion for analytics, real-time search index building, ML pipelines, and SIEM optimization.
Analytics teams can prep and shape data to feed event-driven applications by triggering computations, state updates, or external actions
This includes applications built for GenAI solutions, fraud detection, real-time alerting and notifications, marketing personalization, and more.
With Confluent, you can process your CDC streams before you materialize them in your analytics estate. Simply filter, join, and enrich change data captured in your Kafka topics with Flink SQL, then materialize the resulting streams within both your operational and analytics estates.
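To make that concrete, here is a minimal Flink SQL sketch of the filter-join-enrich step. The `orders` and `customers` tables and their columns are illustrative assumptions, not part of the labs below; on Confluent Cloud for Apache Flink, CDC topics typically surface as queryable tables like these.

```sql
-- Hypothetical example: enrich an order CDC stream with customer attributes,
-- keeping only completed orders, and materialize the result as a new table
-- (backed by a Kafka topic) that operational and analytical systems can consume.
CREATE TABLE enriched_orders AS
SELECT
  o.order_id,
  o.amount,
  o.order_ts,
  c.customer_name,
  c.region
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id
WHERE o.status = 'COMPLETED';
```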
Confluent customers are using Flink to enhance existing CDC use cases like data synchronization and disaster recovery and unlock new real-time capabilities.
Explore the GitHub repo to learn how to implement real-time analytics for customer 360 and product sales analysis, or for daily sales trend analysis.
You’ll have 2 labs to choose from:
Product Sales and Customer360 Aggregation Lab
Clean and aggregate product sales data, ingest the enriched data to Snowflake or Redshift, and then create a data product for operational databases to consume.
Daily Sales Trends Lab
Validate payments, analyze sales patterns to identify daily trends, then materialize the Kafka topic as an Iceberg table in Amazon Athena for deeper insights.
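As a rough illustration of the kind of query the Daily Sales Trends lab builds toward, here is a hedged Flink SQL sketch of a daily windowed aggregation over validated payments. The `payments` table, its columns, and the validation filter are assumptions for illustration, not the lab's actual code.

```sql
-- Illustrative daily sales trend aggregation over a payments stream.
-- Assumes `payment_ts` is a watermarked event-time column so each
-- tumbling window can close once a day's data is complete.
SELECT
  window_start,
  window_end,
  product_id,
  COUNT(*)    AS completed_payments,
  SUM(amount) AS daily_revenue
FROM TABLE(
  TUMBLE(TABLE payments, DESCRIPTOR(payment_ts), INTERVAL '1' DAY))
WHERE payment_status = 'VALID'
GROUP BY window_start, window_end, product_id;
```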
“Adopting CDC has allowed us to unleash the power of real-time data and ultimately migrate away from batch data workloads to stream processing.”
“With Flink, we now have the opportunity to shift left and do a lot of early data transformations and computation on our data before it reaches Snowflake. This will optimize our data processing costs to increase the amount of data we have available.”
“With Confluent, we can now easily build the CDC pipelines we need to acquire data in real time rather than retrieving it in batches every 10 minutes, enabling us to detect fraud quickly.”
“The most difficult thing was we didn’t have enough internal resources to develop CDC and the streaming process. Now, we can easily build CDC systems…the developer team was able to decrease their workload while developing the streaming process.”
“With Confluent Cloud, we can now provide operational data in real time to any team that needs it. This is really powerful and significantly reduces our operational burden.”
Ready to start processing CDC data in real time with Flink? Get started on Confluent and implement a stream processing architecture ready for any environment.
Try Confluent Cloud for Apache Flink®—available on AWS, Google Cloud, and Microsoft Azure—to build applications leveraging Kafka + Flink with serverless, cloud-native cost efficiency and simplicity.
And with Confluent Platform for Apache Flink®, you can bring your existing Flink workloads to a self-managed data streaming platform, ready to deploy on-premises or in your private cloud.
A streaming approach allows you to "shift left," processing and governing data closer to the source. Instead of running separate, costly ELT jobs in multiple downstream systems, you process the data once in-stream with Flink to create a single, reusable, high-quality data product. This improves data quality, reduces overall processing costs and risks, and gets trustworthy data to your teams faster.
Apache Flink® is the de facto standard for stateful stream processing, designed for high-performance, low-latency workloads—making it ideal for CDC. Its ability to handle stateful computations allows it to accurately interpret streams of inserts, updates, and deletes to maintain a correct, materialized view of data over time. Confluent offers a fully managed, serverless Flink service that removes the operational burden of self-management.
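To see why that statefulness matters, consider a simple aggregate over a table backed by a CDC changelog. Because Flink keeps per-key state, an upstream update or delete retracts the old contribution and applies the new one, so the result keeps matching the source database. The `orders` table and its columns below are hypothetical.

```sql
-- Hypothetical: `orders` is backed by a CDC changelog (inserts, updates, deletes).
-- Flink maintains this aggregate statefully: when an order is updated or deleted
-- upstream, its previous contribution is retracted, keeping the totals consistent.
SELECT
  customer_id,
  COUNT(*)    AS open_orders,
  SUM(amount) AS open_order_value
FROM orders
WHERE status = 'OPEN'
GROUP BY customer_id;
```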
Data consistency is maintained by processing CDC events in-flight to filter duplicates, join streams for enrichment, and aggregate data correctly before it reaches any downstream system. Confluent's platform integrates Flink with Stream Governance, including Schema Registry, to define and enforce universal data standards, ensuring data compatibility, quality, and lineage tracking across your organization.
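For example, duplicate or replayed change events can be collapsed in-flight with Flink SQL's deduplication pattern, which keeps only the latest event per key. The `orders_raw` table and `updated_at` time attribute are illustrative assumptions.

```sql
-- Illustrative deduplication before downstream delivery: keep only the most
-- recent change event per order_id, ranked by the event-time attribute.
SELECT order_id, status, amount, updated_at
FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY order_id
           ORDER BY updated_at DESC) AS row_num
  FROM orders_raw)
WHERE row_num = 1;
```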
When your CDC pipeline is integrated with Confluent Schema Registry, it can automatically and safely handle schema evolution. This ensures that changes to the source table structure—like adding or removing columns—do not break downstream applications or data integrity. The platform manages schema compatibility, allowing your data streams to evolve seamlessly.
A fully managed service eliminates the significant operational complexity, steep learning curve, and high in-house support costs associated with self-managing Apache Flink®. With Confluent, you get a serverless experience with elastic scalability, automated updates, and pay-as-you-go pricing, allowing your developers to focus on building applications rather than managing infrastructure. In addition, native integration between Apache Kafka® and Apache Flink® and pre-built connectors allow teams to build and scale fast.
Confluent Cloud provides first-class support for Debezium, an open source distributed platform for change data capture. Pre-built connectors can automatically interpret the complex structure of Debezium CDC event streams, simplifying the process of integrating with Kafka and Flink.
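As one hedged example of what this looks like at the SQL level, open-source Flink can declare a Kafka topic carrying Debezium change events with the `debezium-json` format, so the before/after envelope is interpreted as a changelog. On Confluent Cloud, the pre-built connectors and the Flink integration take care of this wiring; the topic, server, and column names below are illustrative.

```sql
-- Open-source Flink SQL sketch: read a Debezium change event topic as a changelog table.
-- Topic, bootstrap server, and columns are illustrative placeholders.
CREATE TABLE customers (
  customer_id   BIGINT,
  customer_name STRING,
  region        STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'dbserver1.inventory.customers',
  'properties.bootstrap.servers' = 'localhost:9092',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'debezium-json'
);
```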