Moving at the Scale of Notion: How Confluent Powers 100M+ Users Daily

  • Notion realized huge time savings and tripled its productivity with Confluent

  • Notion’s event logging system processes the real-time messages of over 100 million users

  • Data streaming helped Notion overcome scaling limitations, optimize costs, and unlock new product capabilities

  • Notion transformed its AI capabilities with the Confluent data streaming platform

The Notion Story: Creative Collaborations

Notion is the connected workspace for your docs, notes, projects, and knowledge—with AI integrated throughout. It allows teams to write, plan, organize, and collaborate, all in one place. Built-in AI helps users generate new content and answers questions about information across their workspace and connected apps like Slack, Jira, and Google Drive.

Notion helps people manage everything from simple tasks like event planning to more complex initiatives like managing projects or even running whole organizations. Users can personalize any project to fit the way their teams work.

The Notion engineering team aims to deliver transformative user experiences more efficiently by providing platforms and frameworks that enable the entire Notion team to unlock new product capabilities.

The Challenge: Moving at the Scale of Notion

“Notion grew by millions of members in 2021, driven by increasing demand for remote collaboration tools during the pandemic. But we realized that our legacy messaging architecture that powered event logging wasn’t going to scale with this kind of rapid growth,” said Ekanth Sethuramalingam, Engineering Lead at Notion. “At the same time, we needed to optimize costs to ensure long-term sustainability.”

With the goal of building better product features faster, the engineering team needed a data solution that could enable more real-time, online indexing for some of its core product use cases. But without a data lake and an event-driven platform, scaling was impossible.

The team found that some of the off-the-shelf data tools were too expensive and restrictive to meet their needs. Their workflow involved sending events to third-party tools through custom connectors that routed data to a handful of destinations, an unsustainable model that was too expensive and too difficult to maintain at scale.

While Notion was growing rapidly, the team recognized an opportunity to upgrade its real-time analytics, AI capabilities, and technology integrations to better support its users. They wanted a scalable, high-performing architecture that could deliver insights in real time—without requiring the team to divert focus from building the core product experience. With a lean engineering team focused on innovation, streamlining infrastructure and operations became key to accelerating product development and delivering even more value to users.

By moving to the Confluent data streaming platform, they gained more control over those events without the operational overhead. “We wanted to be able to transform the data into the shape we needed by building our own connectors to other locations that weren’t supported by our existing tools,” says Adam Hudson, Senior Software Engineer. 

“The Confluent data streaming platform enables us to move at the scale of Notion, meaning that it can process the real-time messaging of over 100 million users generating many, many pages per day. As a result, we’re often approached by other teams within Notion who see the work we’re doing and ask, how can we get on board with this?” says Hudson. 

The Solution: Managing Infrastructure vs. Driving Innovation

The goal became clear: make Notion’s data move faster by developing a more scalable, event-driven architecture to support critical product features like real-time analytics, Notion AI search and generation, and seamless integrations. With a lean, high-impact engineering team focused on delivering differentiated value, managing complex infrastructure in-house wasn’t the right tradeoff. “We believe in solving problems that are unique and differentiated for Notion,” said Sethuramalingam. “But having our team manage infrastructure was out of the question. We needed something managed.”

Enter Confluent, the data streaming platform that enabled Notion to transition to a fully managed, event-driven architecture built on Apache Kafka®. Confluent Cloud gave the team the ability to scale clusters up and down effortlessly, increase storage retention on demand, and offload the burden of managing Kafka, all of which helps keep costs in check.

Confluent’s seamless integration with Amazon Web Services (AWS) was also a huge plus. Since Notion already ran on AWS, having a best-in-class managed Kafka service natively integrated with AWS made adoption straightforward. “With its powerful, fully managed connectors, we effortlessly stream data into destinations like Amazon S3, making real-time event logging scalable and efficient,” said Sethuramalingam.
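
As an illustration of how this kind of pipeline is typically wired up, the following is a minimal sketch of a Kafka Connect S3 sink configuration, expressed in Python. The property names follow the open source, self-managed S3 sink connector; Confluent Cloud’s fully managed S3 sink exposes a similar but not identical set of options, and the connector name, topic, bucket, and region shown here are hypothetical placeholders rather than Notion’s actual setup.

    # Hypothetical Kafka Connect S3 sink configuration, expressed as a Python dict
    # and registered via the Connect REST API. Topic, bucket, and region are
    # placeholders; Confluent Cloud's fully managed S3 sink uses similar settings.
    import json

    import requests  # pip install requests

    s3_sink_config = {
        "name": "s3-event-logging-sink",                 # hypothetical connector name
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "2",
            "topics": "product-events",                  # hypothetical event topic
            "s3.region": "us-west-2",
            "s3.bucket.name": "example-event-logs",      # placeholder bucket
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            "flush.size": "1000",                        # records per S3 object
        },
    }

    # Register the connector with a (hypothetical) self-managed Connect cluster.
    resp = requests.post(
        "http://localhost:8083/connectors",
        headers={"Content-Type": "application/json"},
        data=json.dumps(s3_sink_config),
    )
    resp.raise_for_status()

Once a connector like this is running, every event published to the topic is batched and delivered to S3 without any custom routing code, which is the operational simplification described above.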

Another big advantage of Confluent was its pre-built connectors for Snowflake, Amazon S3, PostgreSQL, and other critical data systems, which allowed Notion to seamlessly move data where it was needed—whether for real-time processing, analytics, or AI applications. “Confluent has a vast library of pre-built connectors that allow us to quickly prototype before deciding on a final path,” explained Sethuramalingam. “That flexibility means we can experiment faster and bring new features to market more efficiently.” In addition, features like Schema Registry and stream processing enabled Notion to capture, enrich, and analyze user events in real time without worrying about the underlying infrastructure.
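
To make the Schema Registry piece concrete, here is a minimal sketch, assuming the confluent-kafka Python client, of producing an Avro-encoded user event whose schema is registered and validated by Schema Registry. The event fields, topic name, and credentials are illustrative placeholders, not Notion’s actual event model.

    # Minimal sketch: produce an Avro-encoded event validated against Schema Registry.
    # Requires `pip install "confluent-kafka[avro]"`. Schema, topic, and credentials
    # below are illustrative placeholders.
    from confluent_kafka import Producer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroSerializer
    from confluent_kafka.serialization import MessageField, SerializationContext

    SCHEMA_STR = """
    {
      "type": "record",
      "name": "UserEvent",
      "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "event_type", "type": "string"},
        {"name": "timestamp_ms", "type": "long"}
      ]
    }
    """

    schema_registry = SchemaRegistryClient({
        "url": "<SCHEMA_REGISTRY_URL>",
        "basic.auth.user.info": "<SR_API_KEY>:<SR_API_SECRET>",
    })
    avro_serializer = AvroSerializer(schema_registry, SCHEMA_STR)

    producer = Producer({
        "bootstrap.servers": "<BOOTSTRAP_SERVER>",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": "<API_KEY>",
        "sasl.password": "<API_SECRET>",
    })

    topic = "user-events"  # hypothetical topic name
    event = {"user_id": "user_123", "event_type": "page_edited", "timestamp_ms": 1700000000000}

    # The serializer registers (or looks up) the schema and enforces compatibility,
    # so downstream consumers and connectors can rely on a stable event shape.
    producer.produce(
        topic,
        key=event["user_id"],
        value=avro_serializer(event, SerializationContext(topic, MessageField.VALUE)),
    )
    producer.flush()

Because every producer and consumer agrees on the registered schema, events can be enriched and analyzed downstream without ad hoc parsing logic.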

With the Confluent data streaming platform, the team found a solution that helps them stream, process, and govern their data effortlessly. The solution eliminates operational overhead, provides built-in scalability, and enables them to focus on innovation rather than infrastructure management.

A Reliable, Scalable Backbone for Notion AI 

In Notion, AI powers two core capabilities that the team refers to as "the find" and "the do." "The find" enables users to search for content and get precise answers through retrieval-augmented generation (RAG), leveraging all the information within their own workspace and all the apps connected to it. "The do" focuses on content generation: using large language models (LLMs) to create new content, and even new insights, based on existing workspace data. These AI features empower users to work smarter and faster, but they require a real-time, scalable infrastructure to function seamlessly.

With the Confluent data streaming platform as the backbone of Notion AI, Notion can ensure that changes made within the application are instantly reflected in its vector database for RAG—real-time synchronization that is crucial for keeping Notion’s search and content generation capabilities up to date. "Confluent's platform allows us to stream changes as they happen, ensuring that our AI tools always provide the most relevant and timely information," says Sethuramalingam. 
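
As a rough illustration of that pattern, the sketch below shows a consumer, again assuming the confluent-kafka Python client, that reads page-change events and upserts fresh embeddings into a vector index. The topic name, event shape, and the embed/upsert helpers are hypothetical placeholders standing in for whatever embedding model and vector database a given team actually uses.

    # Illustrative sketch only: keep a vector index in sync with page changes for RAG.
    # Topic name, event shape, and the embed()/upsert_vector() helpers are
    # hypothetical placeholders, not Notion's actual pipeline.
    import json

    from confluent_kafka import Consumer

    def embed(text: str) -> list:
        """Placeholder: call an embedding model and return a vector."""
        raise NotImplementedError

    def upsert_vector(doc_id: str, vector: list, metadata: dict) -> None:
        """Placeholder: write the vector into the vector database backing RAG."""
        raise NotImplementedError

    consumer = Consumer({
        "bootstrap.servers": "<BOOTSTRAP_SERVER>",
        "group.id": "rag-index-sync",           # hypothetical consumer group
        "auto.offset.reset": "earliest",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": "<API_KEY>",
        "sasl.password": "<API_SECRET>",
    })
    consumer.subscribe(["page-changes"])         # hypothetical topic of page edits

    try:
        while True:
            msg = consumer.poll(timeout=1.0)
            if msg is None or msg.error():
                continue
            change = json.loads(msg.value())
            # Re-embed the updated page and upsert it so search stays current.
            upsert_vector(
                doc_id=change["page_id"],
                vector=embed(change["text"]),
                metadata={"workspace_id": change["workspace_id"]},
            )
    finally:
        consumer.close()

Because the change events arrive as a continuous stream rather than a periodic batch export, the index can reflect edits within moments of a user making them.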

Transformative Results in Data Streaming

Notion’s adoption of the Confluent data streaming platform has been nothing short of transformative. In just over a year, the engineering team has migrated to an event-driven architecture that has revolutionized how they build and deliver new features. This shift has dramatically accelerated innovation, reducing time to market and helping Notion outpace established work tools like Slack.

With Confluent powering its real-time data processing, Notion has built a scalable, flexible data infrastructure that supports a wide range of mission-critical use cases, including:

  • Real-Time Analytics and AI: Notion now processes and enriches data instantly, enabling advanced features like generative, AI-powered Autofill.

  • Seamless Product Integrations: Real-time notifications and integrations, such as Slack, run smoothly thanks to Confluent’s data streaming infrastructure.

  • Effortless Data Management: Events are processed, stored, and analyzed in Snowflake and Amazon S3, enabling smarter decision-making and analytics.

  • Faster Innovation: With infrastructure management and Kafka maintenance offloaded to Confluent, Notion’s data platform team can now dedicate 100% of their time to bringing new features to market faster.

  • Time Savings: Notion realized huge time savings and tripled its productivity with Confluent.

With a powerful, cost-efficient, and flexible data streaming platform in place, Notion is poised to keep innovating—delivering new AI-driven product features and expanding its use of generative and ambient AI with Confluent. 

Notion is especially excited about what’s ahead, as Confluent’s latest innovations continue to remove complexity and unlock more value. Emerging capabilities like autoscaling, freight clusters, and Tableflow—along with deeper integration of Flink SQL—make it even easier to tap into real-time data without worrying about infrastructure. Freight clusters, for example, store data directly in S3 and scale automatically as needed, giving Notion a more cloud-native, cost-efficient architecture. This allows teams to shift focus from operational overhead to high-value innovation, accelerating their ability to deliver differentiated, AI-powered experiences to users.

By leveraging Confluent’s technology on AWS infrastructure, Notion achieved the flexibility, scalability, and cost efficiency to support its growth. “With just the click of a button, we can add more capacity and continue to scale,” Sethuramalingam noted. “We love the fact that Confluent exists. The platform offers us a way to innovate on top of the data streaming technology and provides value for our users, solving problems that are differentiated for Notion, and helping Notion developers so that we can move fast with innovation.”

