KIP-405 introduced tiered storage in Apache Kafka. The design separates compute from storage, which lets brokers focus largely on serving producer and consumer requests rather than managing storage beyond their local disks. The important caveat is that tiered storage must still maintain the same consistency semantics and data lineage as local storage. This talk dives into the internals of tiered storage and how we achieve those semantics, covering scenarios such as new brokers being bootstrapped, brokers suffering hard failures, and out-of-sync brokers becoming leaders.
We will also talk about how the deletion lifecycle is managed without leaking any segments in tiered storage, whether data is removed under retention policies or because a topic or partition is deleted.