Why Mercari’s Marketplace App Needs Real-Time Data
Today, Mercari, Inc. is Japan's largest consumer-to-consumer (C2C) marketplace operator. Since its establishment in 2013, the company has continued to grow through the development of the Mercari app, which has approximately 23 million monthly users.
In recent years, the company has further expanded its business domain by launching:
The Merpay mobile payment service
Mercoin, a service that allows users to purchase bitcoin with sales proceeds earned within Mercari
Mercari Hallo, which allows users to find on-demand work on their smartphones
These community-powered apps and services all serve to further Mercari’s central mission and vision: empowering people to maximize their resources while building a circular economy at global scale. In service of these goals, Mercari generates a vast amount of data on a daily basis—from financial transactions and inventory changes to customer data, online user actions, and more.
Mercari’s Data Platform team is responsible for building and operating the infrastructure needed to effectively utilize such data. Guruprasath Thavamani, Engineering Manager on the Data Platform team, explained:
“Mercari collects a wide variety of data from multiple microservices and databases for market analysis, marketing, fraud prevention, and other use cases. Our team is working to build and operate the best infrastructure for this purpose.”
Guruprasath Thavamani
Engineering Manager, Data Platform, Mercari
Learn how adopting Confluent helped Mercari more easily implement real-time change data capture (CDC) to power more efficient business operations.
CDC pipelines allow organizations to capture new, deleted, and modified data in environments like databases, data lakes, and data warehouses. This was an essential capability for Mercari and its Data Platform team, which is responsible for ensuring the integration and availability of data across systems and applications.
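To make the idea concrete, here is a minimal sketch of how a downstream consumer might apply a stream of change events to keep its own copy of a table in sync. The event shape (`op`, `key`, `row`) and the helpers `apply_change` and `replay` are hypothetical and chosen only for illustration; they are not Mercari's schema or a Confluent API.

```python
from typing import Any, Dict, Iterable

# Hypothetical change-event shape, for illustration only:
#   {"op": "insert" | "update" | "delete", "key": <primary key>, "row": <dict or None>}
ChangeEvent = Dict[str, Any]


def apply_change(replica: Dict[Any, Dict[str, Any]], event: ChangeEvent) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # In this sketch, inserts and updates both carry the full new row.
        replica[key] = dict(event["row"])
    elif op == "delete":
        # Deleting a missing key is a no-op, so replayed deletes are harmless.
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op!r}")


def replay(events: Iterable[ChangeEvent]) -> Dict[Any, Dict[str, Any]]:
    """Build a replica by applying events in the order they were captured."""
    replica: Dict[Any, Dict[str, Any]] = {}
    for event in events:
        apply_change(replica, event)
    return replica


events = [
    {"op": "insert", "key": 1, "row": {"item": "camera", "status": "listed"}},
    {"op": "insert", "key": 2, "row": {"item": "jacket", "status": "listed"}},
    {"op": "update", "key": 1, "row": {"item": "camera", "status": "sold"}},
    {"op": "delete", "key": 2, "row": None},
]
replica = replay(events)
print(replica)
```

Because each event describes a single row-level change, the consumer never has to rescan the source table; it only needs to see the events in order.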
As Mercari expanded, its product and service teams built more applications in-house using Kubernetes for microservices orchestration and BigQuery for data warehousing.
The Data Platform team chose Kafka to implement CDC pipelines because the architecture of the distributed data store is ideal for capturing and processing high-throughput data streams. Additionally, the Kafka ecosystem had tools the team could use to improve availability (e.g., data replication, partitioning, data persistence) and facilitate flexible connectivity and data transformation between Kafka and a variety of data sources (e.g., Kafka Connect, Kafka Streams).
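As a rough illustration of why partitioning suits CDC workloads, the sketch below routes change events to partitions by record key, so that every change to the same row lands in the same partition and keeps its relative order. The `partition_for` helper, the sample keys, and the use of CRC32 are assumptions made for this example; Kafka's own default partitioner uses a different hash function (murmur2 in the Java client).

```python
import zlib
from typing import Dict, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index.

    CRC32 is a stand-in hash for this sketch, not the hash Kafka itself uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical change events: (primary key, description of the change)
events: List[Tuple[str, str]] = [
    ("item-1", "listed"),
    ("item-2", "listed"),
    ("item-1", "price changed"),
    ("item-1", "sold"),
    ("item-2", "delisted"),
]

NUM_PARTITIONS = 3
partitions: Dict[int, List[Tuple[str, str]]] = {p: [] for p in range(NUM_PARTITIONS)}
for key, change in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, change))

# All changes for one key sit in one partition, in their original order.
item_1_partition = partitions[partition_for("item-1", NUM_PARTITIONS)]
item_1_changes = [change for key, change in item_1_partition if key == "item-1"]
print(item_1_changes)
```

Different keys can be processed in parallel across partitions, while the history of any single row stays ordered.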
Despite these advantages, the Data Platform team soon ran into productivity roadblocks in their development timelines.
“We lacked the resources, such as specialist engineers to operate Apache Kafka in-house, so we needed a system that would minimize the burden of implementation, construction, and operation.”
— Tomoyuki Nakamura, Senior Software Engineer, Data Platform, Mercari
Migrating to Confluent to Ease Kafka Operations and Accelerate Development
The solution to Mercari’s operational challenges and development roadblocks was Confluent Cloud, a data streaming platform powered by a cloud-native Kafka engine. Combined with built-in integration, governance, and stream processing, this fully managed Kafka service gave the Mercari team everything they needed to advance their real-time capabilities.
Using Confluent Cloud, the Data Platform team has been able to quickly build and deploy CDC pipelines and accelerate progress on operational use cases like fraud detection and marketing analytics—all without having to worry about unexpected downtime or manually rebalancing Kafka clusters.
That ease of use, and the freedom it gives the Data Platform team, was the deciding factor in adopting Confluent. Tomoyuki Nakamura recalls, “When I saw how quickly Kafka clusters could be built and rebuilt in Confluent Cloud, I was amazed at how easy it was.”
“Confluent makes it easy to build and operate an Apache Kafka environment using the Web UI and Schema Registry. Indeed, thanks to Confluent, the burden of building and operating the CDC infrastructure has been greatly alleviated, allowing us to focus on development work that helps drive the business of our business units.”
— Guruprasath Thavamani, Engineering Manager, Data Platform, Mercari
Since then, Mercari has used Confluent to power a number of real-time use cases, including more sophisticated fraud detection.
Ryohei Nagao of the Trust & Safety Engineering team shared the impact this migration has had on his team’s work: “Our team monitors and responds to prohibited activities on a daily basis to ensure that our customers can shop safely and securely. In response to such circumstances, Mercari needed to build a system for real-time fraud detection, especially as the number of users increases and services have expanded in recent years.”
He went on to explain, “With Confluent, we can now easily build the CDC pipelines we need to acquire data in real time rather than retrieving it in batches every 10 minutes, enabling us to detect fraud quickly.”
Confluent has also enabled Mercari to achieve greater customer satisfaction. For the marketing department, data from multiple in-house microservice databases is streamed through Confluent to BigQuery, Mercari’s data warehousing solution for analytics, in real time. Notifications are then sent to customer relationship management (CRM) services downstream.
Previously, Mercari retrieved customer transaction data in batches once every 10 minutes. Depending on when a batch ran, incentives and coupons that customers should have received were sometimes not applied. With Confluent, customer and transaction data is updated in real time, so incentives and coupons are applied at the most appropriate time for Mercari's customers.
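The timing problem can be shown with a little arithmetic. The sketch below assumes batch runs fire at fixed 10-minute boundaries and that a coupon has a fixed expiry time; both the scheduling model and the coupon scenario are hypothetical simplifications, not a description of Mercari's actual system.

```python
import math

BATCH_INTERVAL_S = 600  # the 10-minute batch interval described above


def seconds_until_visible(event_time_s: int, interval_s: int = BATCH_INTERVAL_S) -> int:
    """Delay before an event is picked up by the next batch run.

    Assumes runs fire at t = 0, interval, 2 * interval, ... and that a run
    sees every event written at or before its start time.
    """
    next_run_s = math.ceil(event_time_s / interval_s) * interval_s
    return next_run_s - event_time_s


def coupon_applied(purchase_time_s: int, coupon_expiry_s: int) -> bool:
    """Hypothetical rule: the coupon applies only if the purchase becomes
    visible to downstream systems before the coupon expires."""
    visible_at_s = purchase_time_s + seconds_until_visible(purchase_time_s)
    return visible_at_s <= coupon_expiry_s


# A purchase just after a batch run waits almost the full interval.
print(seconds_until_visible(601))

# Same coupon (expires at t = 900 s), two purchase times 11 seconds apart.
print(coupon_applied(590, 900))
print(coupon_applied(601, 900))
```

Under these assumptions, two purchases made seconds apart get different outcomes purely because of where they fall relative to a batch boundary. With streaming, the delay no longer depends on a batch schedule.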
What's Next for Mercari and Confluent
By adopting Confluent, Mercari has been able to use real-time data without adding to its development timelines or increasing its operational burden.
“With Confluent, we have not experienced a single outage due to failure since its introduction, and it has continued to operate stably. Such high reliability is its biggest selling point.”
— Tomoyuki Nakamura, Senior Software Engineer, Data Platform, Mercari
As a result, the Mercari team plans to expand its usage of Confluent Cloud and is now working on integration with TiDB Cloud, which is a potential candidate for a new database environment.
The Data Platform team also plans to start using Confluent’s managed Apache Flink® service to bring stream processing use cases that are already underway overseas to Japan. Guruprasath Thavamani shared, “We are looking forward to offering Apache Flink's managed services in Japan, which are already available overseas. I believe that using them will drive further operational efficiency and automation.”