Change Data Capture (CDC) is a popular technique for extracting data from databases in realtime. However, many CDC deployments are static: e.g. a single connector is configured to ingest data for one or several tables.
At Goldsky, we needed a way to configure CDC for a large Postgres database dynamically: the list of tables to ingest is driven by customer-facing features and is constantly changing.
We started using Flink CDC connectors built on top of the Debezium project, but we immediately faced many challenges caused mainly by the lack of incremental snapshotting.
But even after implementing incremental snapshotting ourselves, we still faced an issue around using replication slots in Postgres: we couldn't use a single connector to ingest all tables (it's just too much data), and we couldn't create a new connector for every new set of tables (we'd quickly run out of replication slots). So we needed to find a way to maintain a fixed number of replication slots for a dynamic list of tables.
In the end, we chose a consistent hashing algorithm to distribute the list of tables across multiple Flink jobs. The jobs also required some customizations to support the incremental snapshotting semantics from Flink CDC.
We learned a lot about Debezium, Flink CDC and Postgres replication, and we're excited to share our learnings with the community!