Lessons from building a stream-first metadata platform

For data-driven enterprises, the most important objective is unlocking the value of their data. To that end, data scientists are increasingly turning to data discovery tools (also known as data catalogs) that help them locate the right dataset or insight and use it correctly. But are all data catalogs the same? In this talk, I describe how a stream-first architecture was a critical design element in the implementation of our data catalog. We follow the evolution of LinkedIn DataHub’s architecture over the past few years, from a simple search tool to a streaming metadata platform that drives productivity and governance workflows across the company. Join this talk to learn:

  • How different data discovery / catalog tools are architected and the tradeoffs in each kind of architecture
  • How streaming architectures can benefit metadata
  • How event-driven metadata architectures can supercharge data productivity and governance workflows at your company


Shirshanka Das