You may be familiar with the primary constructs of Apache Kafka®, including topics and partitions. But what is under the hood? The commit log is the most crucial building block of Kafka. The log is a logical, ordered sequence of records, stored on disk as a series of segments (files); it both stores the records and serves as the basis for replicating data between nodes. Developers will benefit from understanding the internals of Apache Kafka® so they can reason better about topics, partitions, log retention, and related behaviour. Operators will find the same understanding helpful for sizing a cluster and knowing its capabilities.
In this presentation, we will take a deep dive into the internals of Kafka's log mechanisms. We will look in detail at the structure of the commit log and its segments, how topic partitions are arranged on disk, and how log retention works under the compact and delete cleanup policies. Attendees will take home knowledge of the commit log structure and code examples of how to analyse and debug the commit log.
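As a small taste of the on-disk layout, the sketch below lists the segments of one partition directory. It relies on two naming conventions: a partition directory is named `<topic>-<partition>`, and each segment file is named after its base offset, zero-padded to 20 digits, with a `.log` extension. The function names and the example path are hypothetical and chosen for illustration only; this is a minimal sketch that reads file names and sizes, not record contents.

```python
import re
from pathlib import Path

# Segment data files are named by base offset, zero-padded to 20 digits.
SEGMENT_RE = re.compile(r"^(\d{20})\.log$")
# Partition directories are named <topic>-<partition>; topic names may contain dashes.
PARTITION_RE = re.compile(r"^(?P<topic>.+)-(?P<partition>\d+)$")


def parse_partition_dir(name):
    """Split a partition directory name into (topic, partition number)."""
    match = PARTITION_RE.match(name)
    if match is None:
        raise ValueError(f"not a partition directory name: {name!r}")
    return match.group("topic"), int(match.group("partition"))


def list_segments(partition_dir):
    """Return (base_offset, size_in_bytes) for each .log segment, oldest first."""
    segments = []
    for path in Path(partition_dir).iterdir():
        match = SEGMENT_RE.match(path.name)
        if match:
            segments.append((int(match.group(1)), path.stat().st_size))
    return sorted(segments)


if __name__ == "__main__":
    # Hypothetical path: point this at a partition directory under your broker's log.dirs.
    partition_dir = Path("/var/lib/kafka/data/orders-0")
    if partition_dir.is_dir():
        topic, partition = parse_partition_dir(partition_dir.name)
        print(f"topic={topic} partition={partition}")
        for base_offset, size in list_segments(partition_dir):
            print(f"  segment base_offset={base_offset} size={size} bytes")
```

The segment with the highest base offset is the active one that the broker is currently appending to; older segments are the units on which the retention policies operate.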