Change data capture (CDC) is a widely used technique for offloading data in real time from legacy systems to Kafka, making it available to downstream consumer applications. Unlike other solutions, CDC can guarantee both low latency and a very small footprint on the source system. However, when data is moved from a relational database to a distributed streaming platform, what is gained in throughput and latency is lost in strong consistency, and not all consumers are able to manage this loss by themselves. Different upstream solutions can be implemented to mitigate this problem, each preserving a different level of consistency.
In this talk we’ll:
- see what eventual consistency is and where strong consistency is lost while moving data from a database to Kafka
- describe different solutions to preserve consistency: working at the source level (e.g. outbox pattern and callback pattern), working on the Kafka topology, or working on an external storage (e.g. integration hub)
- analyze the pros and cons of all the presented solutions in terms of consistency guarantees and latency loss
Presenter
Matteo Cimini
Quantyca
Lead Software Engineer with 4+ years of experience in Data Science and Software Engineering.
Experience in designing cloud-based MLOps platforms and large-scale, multi-tiered, event-driven, distributed software solutions to drive real-world actionable business insights.
Presenter
Andrea Gioia
Quantyca
As the CTO at Quantyca, a premier Italian consulting firm in data management, and co-founder of blindata.io, a state-of-the-art SaaS data governance platform, I bring over 15 years of expertise in the dynamic world of data. I've seen it all, from BI and DWH projects to navigating the challenges of big data, AI, and cloud. My passion for technology and data has only flourished along the way. The practice of data engineering has been a bottleneck in the IT industry, but the ongoing revolution in data management, driven by the growing centrality of data in all areas of life, is changing this. I'm eager to contribute to this revolution and excited to see what the future holds for the world of data.