What Is Kafka?

Apache Kafka is an open-source distributed streaming system used for stream processing, real-time data pipelines, and data integration at scale. Originally created at LinkedIn in 2011 to handle real-time data feeds, Kafka quickly evolved from a messaging queue into a full-fledged event streaming platform capable of handling over 1 million messages per second, or trillions of messages per day.

Why Kafka?

Kafka has numerous advantages. Today, more than 80% of Fortune 100 companies across virtually every industry use Kafka for countless use cases, big and small. It is the de facto technology that developers and architects use to build the newest generation of scalable, real-time data streaming applications. While this can be achieved with a range of technologies available on the market, the main reasons Kafka is so widely used are listed below.

High Throughput

Capable of handling high-velocity and high-volume data, Kafka can handle millions of messages per second.

High Scalability

Scale Kafka clusters up to a thousand brokers, trillions of messages per day, petabytes of data, hundreds of thousands of partitions. Elastically expand and contract storage and processing.

Low Latency

Kafka can deliver this high volume of messages using a cluster of machines, with latencies as low as 2 ms.

Permanent Storage

Safely and securely store streams of data in a distributed, durable, reliable, fault-tolerant cluster.

High Availability

Extend clusters efficiently over availability zones or connect clusters across geographic regions, making Kafka highly available and fault tolerant with no risk of data loss.

How Kafka Works

Apache Kafka consists of a storage layer and a compute layer that together provide efficient, real-time data ingestion, streaming data pipelines, and storage across distributed systems. In short, this simplifies data streaming between Kafka and external systems, so you can easily manage real-time data and scale within any type of infrastructure.

Real-Time Processing at Scale

The core of a data streaming platform is the ability to process and analyze data as soon as it is generated. The Kafka Streams API is a powerful, lightweight library for on-the-fly processing that lets you aggregate, define windowing parameters, join data within a stream, and more. Best of all, it is built as a Java application on top of Kafka, so your workflow stays intact with no extra clusters to maintain.

๋‚ด๊ตฌ์„ฑ์ด ๋›ฐ์–ด๋‚œ ์˜๊ตฌ ์Šคํ† ๋ฆฌ์ง€

๋ถ„์‚ฐ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์—์„œ ์ผ๋ฐ˜์ ์œผ๋กœ ์ฐพ์„ ์ˆ˜ ์žˆ๋Š” ๋ถ„์‚ฐ ์ปค๋ฐ‹ ๋กœ๊ทธ์˜ ์ถ”์ƒํ™”์ธ Apache Kafka๋Š” ๋‚ด๊ตฌ์„ฑ์ด ๋›ฐ์–ด๋‚œ ์Šคํ† ๋ฆฌ์ง€๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. Kafka๋Š” '์ •๋ณด ์†Œ์Šค' ์—ญํ• ์„ ํ•˜์—ฌ, ๋‹จ์ผ ๋ฐ์ดํ„ฐ ์„ผํ„ฐ ๋‚ด ๋˜๋Š” ์—ฌ๋Ÿฌ ๊ฐ€์šฉ์„ฑ ์˜์—ญ ์ „๋ฐ˜์—์„œ ๊ฐ€์šฉ์„ฑ์ด ๋›ฐ์–ด๋‚œ ๋ฐฐํฌ๋ฅผ ์œ„ํ•ด ์—ฌ๋Ÿฌ ๋…ธ๋“œ์— ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„์‚ฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๋ฐœํ–‰, ๊ตฌ๋…

์ค‘์‹ฌ์— ์œ„์น˜ํ•œ ์ž‘๊ณ  ๋ณ€๊ฒฝ ๋ถˆ๊ฐ€๋Šฅํ•œ ์ปค๋ฐ‹ ๋กœ๊ทธ๋ฅผ ํ†ตํ•ด ๊ตฌ๋…์„ ์ˆ˜ํ–‰ํ•˜๊ณ  ์›ํ•˜๋Š” ์ˆ˜์˜ ์‹œ์Šคํ…œ ๋˜๋Š” ์‹ค์‹œ๊ฐ„ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์œผ๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐœํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ฉ”์‹œ์ง• ํ์™€ ๋‹ฌ๋ฆฌ Kafka๋Š” ๊ณ ๋„๋กœ ํ™•์žฅ ๊ฐ€๋Šฅํ•œ ๋‚ด๊ฒฐํ•จ์„ฑ ๋ถ„์‚ฐ ์‹œ์Šคํ…œ์œผ๋กœ, Uber์—์„œ์˜ ์Šน๊ฐ ๋ฐ ์šด์ „์ž ๋งค์นญ ๊ด€๋ฆฌ, British Gas์˜ ์Šค๋งˆํŠธ ํ™ˆ์„ ์œ„ํ•œ ์‹ค์‹œ๊ฐ„ ๋ถ„์„ ๋ฐ ์˜ˆ์ธก ์œ ์ง€๋ณด์ˆ˜ ์ œ๊ณต, LinkedIn์—์„œ์˜ ์ˆ˜๋งŽ์€ ์‹ค์‹œ๊ฐ„ ์„œ๋น„์Šค ์‹คํ–‰๊ณผ ๊ฐ™์€ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์— ๋ฐฐํฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ๊ณ ์œ ํ•œ ์„ฑ๋Šฅ์œผ๋กœ ์ธํ•ด ํ•˜๋‚˜์˜ ์•ฑ์—์„œ ์ „์‚ฌ์  ์šฉ๋„๋กœ ํ™•์žฅํ•˜๊ธฐ์— ์ ํ•ฉํ•ฉ๋‹ˆ๋‹ค.
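The commit-log model described above can be sketched as a toy in plain Python. This is not Kafka's client API; the `CommitLog` and `Subscriber` classes and the sample ride events are hypothetical names used only for illustration. The point it shows is that records are appended once and never changed, while each subscriber tracks its own read position (offset), so any number of readers can consume the same data independently.

```python
class CommitLog:
    """Toy append-only log: records are appended and never modified."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset (position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Subscriber:
    """Reader that remembers its own offset, independent of other readers."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_records=10):
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = CommitLog()
for event in ["ride_requested", "driver_assigned", "ride_completed"]:
    log.append(event)

analytics = Subscriber(log)
billing = Subscriber(log)

first = analytics.poll(max_records=2)  # analytics reads the first two records
everything = billing.poll()            # billing still sees all three
rest = analytics.poll()                # analytics resumes where it left off
print(first, everything, rest)
```

Because reading does not remove anything from the log, a new subscriber can be added later and replay the data from offset 0.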

What is Kafka Used For?

Kafka is commonly used to build real-time streaming data pipelines and real-time streaming applications, and today there are hundreds of Kafka use cases. Any company that relies on or works with data can find numerous benefits.

Data Pipelines

In the context of Apache Kafka, a streaming data pipeline means ingesting the data from sources into Kafka as it's created and then streaming that data from Kafka to one or more targets.

Stream Processing

Stream processing includes operations such as filters, joins, maps, aggregations, and other transformations that enterprises use to power many use cases. Kafka Streams is a stream processing library built for Apache Kafka that enables enterprises to process data in real time.
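As a rough illustration of the filter, map, and aggregation operations mentioned here, the following sketch chains them with plain Python generators. It is not Kafka Streams code; the event fields (`user`, `type`, `amount`) and function names are assumptions invented for the example.

```python
from collections import Counter


def only_purchases(events):
    """Filter: keep purchase events and drop everything else."""
    for event in events:
        if event["type"] == "purchase":
            yield event


def to_cents(events):
    """Map: add an integer amount in cents to each event."""
    for event in events:
        yield {**event, "amount_cents": round(event["amount"] * 100)}


def total_by_user(events):
    """Aggregate: sum the purchase amount per user."""
    totals = Counter()
    for event in events:
        totals[event["user"]] += event["amount_cents"]
    return dict(totals)


# Hypothetical input events.
events = [
    {"user": "ann", "type": "purchase", "amount": 12.50},
    {"user": "bob", "type": "view", "amount": 0.0},
    {"user": "ann", "type": "purchase", "amount": 7.25},
    {"user": "bob", "type": "purchase", "amount": 3.00},
]

totals = total_by_user(to_cents(only_purchases(events)))
print(totals)
```

Each stage pulls records one at a time from the stage before it, so the same chain would work on an unbounded source as long as the final aggregation emits results incrementally rather than waiting for the input to end.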

Streaming Analytics

Kafka provides high throughput event delivery, and when combined with open-source technologies such as Druid can form a powerful Streaming Analytics Manager (SAM). Druid consumes streaming data from Kafka to enable analytical queries. Events are first loaded in Kafka, where they are buffered in Kafka brokers before they are consumed by Druid real-time workers.

Streaming ETL

Real-time ETL with Kafka combines several components and features: Kafka Connect source and sink connectors, which consume data from and produce data to other databases, applications, or APIs; Single Message Transforms (SMTs), an optional Kafka Connect feature; and Kafka Streams for continuous, real-time data processing at scale.
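The shape of such a pipeline (source, per-record transforms, sink) can be sketched in plain Python. This is only a conceptual model: real connectors and SMTs are configured in Kafka Connect rather than written this way, and every name below (`source_records`, `drop_field`, `lowercase_field`, `run_pipeline`) is hypothetical.

```python
def source_records():
    """Stand-in for a source connector reading rows from an external system."""
    yield {"id": 1, "email": "Ann@Example.com", "password": "secret"}
    yield {"id": 2, "email": "bob@example.com", "password": "hunter2"}


def drop_field(field):
    """Build a per-record transform that removes one field."""
    def transform(record):
        return {k: v for k, v in record.items() if k != field}
    return transform


def lowercase_field(field):
    """Build a per-record transform that lowercases one string field."""
    def transform(record):
        return {**record, field: record[field].lower()}
    return transform


def run_pipeline(source, transforms, sink):
    """Apply each transform to every record, in order, then hand it to the sink."""
    for record in source:
        for transform in transforms:
            record = transform(record)
        sink.append(record)


sink = []  # stand-in for a sink connector writing to a target system
run_pipeline(source_records(), [drop_field("password"), lowercase_field("email")], sink)
print(sink)
```

The transforms here look at one record at a time and keep no state, which mirrors the "single message" idea; anything that needs to combine several records (joins, aggregations) belongs in a stream processing step instead.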

Event-Driven Microservices

Apache Kafka is the most popular tool for microservices because it solves many of the issues of microservices orchestration while enabling attributes that microservices aim to achieve, such as scalability, efficiency, and speed. It also facilitates inter-service communication while preserving ultra-low latency and fault tolerance.

Apache Kafka in Action

Companies Using Kafka

Airbnb
Netflix
Goldman Sachs
LinkedIn
Microsoft
New York Times
Intuit

To Maximize Kafka, You Need Confluent

Founded by the original developers of Kafka, Confluent delivers the most complete distribution of Kafka, adding community and commercial features designed to improve the streaming experience of both operators and developers in production, at massive scale.

You love Apache Kafka®, but not managing it. Confluent's cloud-native, complete, and fully managed service goes above and beyond Kafka so your best people can focus on what they do best: delivering value to your business.

Cloud-Native

We've re-engineered Kafka to provide a best-in-class cloud experience, for any scale, without the operational overhead of infrastructure management. Confluent offers the only truly cloud-native experience for Kafka, delivering the serverless, elastic, cost-effective, highly available, and self-serve experience that developers expect.

Complete

Creating and maintaining real-time applications requires more than just open source software and access to scalable cloud infrastructure. Confluent makes Kafka enterprise ready and provides customers with the complete set of tools they need to build apps quickly, reliably, and securely. Our fully managed features come ready out of the box, for every use case from POC to production.

Everywhere

Distributed, complex data architectures can deliver the scale, reliability, and performance that unlock previously unthinkable use cases, but they are incredibly complex to run. Confluent's complete, multi-cloud data streaming platform makes it easy to get data in and out of Kafka using Kafka Connect, manage the structure of data using Confluent Schema Registry, and process it in real time using ksqlDB. Confluent meets our customers everywhere they need to be, powering and uniting real-time data across regions, clouds, and on-premises environments.

Try It Now

By integrating historical and real-time data into a single source of truth, Confluent makes it easy to build an entirely new category of modern, event-driven applications, gain a universal data pipeline, and unlock powerful new use cases with full scalability, security, and performance.

Try it free today with $400 in free credits, usable for four months after you create a new account. No payment is required.

Apache Kafka is a popular tool among developers because it is easy to get started with and provides a powerful event streaming platform complete with four APIs: Producer, Consumer, Streams, and Connect.

Developers often begin with a single use case. That might mean using Apache Kafka as a message buffer to protect a legacy database that cannot keep up with today's workloads, using the Connect API to keep that database in sync with an accompanying search indexing engine, or using the Streams API to process data as it arrives and surface aggregations directly in the application.

In short, Apache Kafka and its APIs make it simple to build data-driven apps and manage complex back-end systems. With Kafka, you can rest assured that your data is always fault-tolerant, replayable, and available in real time. It helps you build quickly by providing a single event streaming platform to process, store, and connect your apps and systems with real-time data.