Are you a Kafka Streams application developer who needs a faster, more efficient way to reproduce a bug or issue from past events? Do you need to test a new algorithm patch around a specific point in time? Do you have a standard methodology for investigating these issues, or does each team member devise an ad hoc solution? How long does your simulation take?
This challenge cannot be solved using changelog topics alone, because a compacted changelog topic is only guaranteed to retain the latest value for each key and therefore doesn't capture the full change history. The Derivatives Data team at Bloomberg fills this gap with periodic snapshots of state. To make the snapshotted state accessible to our replay system, we built a query service that uses Kafka Streams Interactive Queries, along with gRPC-based coordination, to fetch the distributed snapshot state from the different snapshotting instances.
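To make the limitation concrete, here is a minimal sketch in plain Python (no Kafka dependency) of why a compacted changelog cannot answer "what was the state at offset N?", while periodic snapshots can return the most recent captured state at or before that point. Every name in it (`compact`, `take_snapshots`, `state_at`, the sample keys and values, the snapshot interval) is hypothetical and chosen for illustration. It is a simplified model, not the Bloomberg implementation and not how Kafka performs compaction internally.

```python
from typing import Dict, List, Tuple

# One change event: (offset, key, value). Illustrative data only.
Event = Tuple[int, str, int]


def compact(log: List[Event]) -> List[Event]:
    """Simplified model of log compaction: keep only the newest record per key."""
    latest: Dict[str, Event] = {}
    for event in log:
        latest[event[1]] = event  # a later record for the same key replaces the earlier one
    return sorted(latest.values())


def take_snapshots(log: List[Event], every: int) -> Dict[int, Dict[str, int]]:
    """Copy the full state after every `every` events, keyed by the last applied offset."""
    state: Dict[str, int] = {}
    snapshots: Dict[int, Dict[str, int]] = {}
    for offset, key, value in log:
        state[key] = value
        if (offset + 1) % every == 0:
            snapshots[offset] = dict(state)
    return snapshots


def state_at(snapshots: Dict[int, Dict[str, int]], target_offset: int) -> Dict[str, int]:
    """Return the most recent snapshot taken at or before `target_offset`."""
    eligible = [offset for offset in snapshots if offset <= target_offset]
    if not eligible:
        raise LookupError(f"no snapshot at or before offset {target_offset}")
    return snapshots[max(eligible)]


log: List[Event] = [
    (0, "a", 100),
    (1, "b", 50),
    (2, "a", 101),
    (3, "a", 102),
    (4, "b", 51),
    (5, "a", 103),
]

compacted = compact(log)                  # intermediate values of "a" and "b" are gone
snapshots = take_snapshots(log, every=2)  # full state captured at offsets 1, 3 and 5
```

In this model, `compacted` holds only the final record for each key, so the values that `a` held at offsets 0, 2 and 3 are unrecoverable from it. `state_at(snapshots, 3)` still returns the state as it stood after offset 3. The snapshot interval bounds how far a requested point can be from the nearest captured state.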
In this talk, we will cover:
• An overview of the system architecture
• Deep dive into the mechanics of snapshots with Kafka Streams, state-store changelogs, and a query service to serve replay requests
• How two modes, normal and replay, are used in the Kafka Streams application runtime
• Some use cases that benefit from this replay system