국내 No.1 에너지 IT기업 ‘해줌’의 컨플루언트 클라우드 도입 스토리 | 알아보고 등록하기

Presentation

데이터 엔지니어를 위한 스트리밍 SQL: 차세대 혁신?

« Current 2022

SQL is the lingua franca of data analysis, but should we use it more as data engineers?

Modern tools like dbt make it easier to express transformations in SQL, but streaming is more complicated than batch. Streaming pipelines usually require higher SLAs and many CI/CD and observability practices, so data engineers prefer to use familiar languages like Python, Java and Scala along with many useful frameworks and libraries. Can SQL replace that?

I was very skeptical when I first heard the idea of using SQL for writing somewhat complex stream-processing data application a few years ago. How do you unit test it? How do you version it?

Over the years, Spark SQL streaming, Flink SQL, ksqlDB and similar tools have matured, now they easily support complex stateful transformations. However, developer experience is still questionable: it’s easy to write a SQL statement, but how do you maintain it over the years as a long-running application?

In this presentation, I hope to share the discoveries I made over the years in this area, as well as working practices and patterns I’ve seen.

Related Links

How Confluent Completes Apache Kafka eBook

Leverage a cloud-native service 10x better than Apache Kafka

Confluent Developer Center

Spend less on Kafka with Confluent, come see how