Découvrez de nombreuses nouveautés à l'occasion de notre lancement du 2e trimestre 2023 : moteur Kora Engine, règles de qualité des données, et bien plus encore | S'inscrire pour une démo
Historically, Pinterest data warehouse ingestion and indexing services were implemented on batch ETL and Kafka streaming respectively. As the product side leans more toward real-time and near-realtime data to innovate and compete, teams work together to revamp the ingestion and processing stack in Pinterest.
In this talk, we plan to share our near-real-time ingestion system built on top of Apache Kafka, Apache Flink, and Apache Iceberg. We pick ANSI SQL as the common currency to minimize the ""lambda architecture"" learning curve of teams adopting fresh data near-realtime data.