We’ll beat any hyperscaler on cost. Learn more about our price guarantee. Learn more
Apache Flink aims to make stream processing easy and accessible for everyone. It should come as no surprise that a high level of abstraction puts more load on the core. Flink's SQL engine is the workhorse behind many on-prem and managed SQL platforms. Yet very few users know what is really going on under the hood when submitting a SQL query.
In this talk, we take a deep look into the internals of Flink SQL. Let's take the stack apart! We start with some SQL text and go all the way down to Flink's streaming primitives. I will go through the individual optimizer phases. You will learn how event-time operations are tracked when declaring a watermark, how state is managed when using different kinds of joins, and how changelog modes and upsert keys travel through topology when reading from a Change Data Capture connector.
After this talk, you may not be able to write an optimizer rule, but you should, at least, get a feeling for the power of a simple streaming SQL query.