Apna Scales to 30M Users with Confluent’s Data Streaming Platform

"Using Confluent’s out-of-the-box tools really helped us to focus on building new things, rather than solving problems in-house that have already been solved for us at scale by Confluent."

Suresh Khemka

Head of platform engineering and infrastructure, Apna

Apna is at the heart of a digital hiring startup success story that continues to unfold in India, home to the world’s second-largest workforce of over 470 million people. Founded in 2019, Apna is a job marketplace and community designed to provide employment, networking, and upskilling opportunities for India’s blue- and gray-collar workers. The Apna platform currently connects 30 million job seekers to 400,000 employers across 75+ cities—and those numbers are steadily growing!

With the help of an algorithm designed by the company’s software developers, Apna’s hiring platform is able to match people to relevant jobs based on their knowledge, experience, and interests. Beyond hiring, Apna also provides users with a space to share knowledge relevant to their industry—an opportunity which has historically been scarce for India’s blue- and gray-collar professionals. “We provide peer networking to people across all verticals. From plumbers to carpenters to beauticians, you name it—they can all share tips and knowledge that will help others in their network grow their skills,” said Suresh Khemka, head of platform engineering and infrastructure at Apna.

The Challenge: Limited scaling with cumbersome monolithic architecture

When Apna first launched its platform, its job matching and networking services were all powered by a monolithic back-end architecture. This approach quickly became unsustainable as the number of platform users skyrocketed in a matter of months. “We had reached the zenith of vertical scaling with our monolithic architecture, both with respect to application and data store. It was a huge setup and a pain to manage. Development had also started to lag because of the bottleneck created by that single, monolithic structure,” explained Ravi Singh, principal architect at Apna.

After investigating options for a new approach to its data infrastructure, Apna’s architecture team, led by Singh, decided to pursue an event-driven architecture powered by Apache Kafka, an open-source distributed data streaming technology used for real-time data pipelines, data integration, and stream processing. Singh’s team surmised that event-driven systems would provide the scalability, responsiveness, and agility needed to deliver the hiring platform’s services and react to information in real time. The question now became: Did the company want to self-manage its new Kafka architecture or enlist the help of a fully managed Kafka service to ease the transition from a monolithic back end to an event-driven one that readily supported microservices?

“As a growing company, we didn’t want to invest a lot of our time or developer resources into the critical task of managing and scaling up Kafka. The learning curve required would have distracted the team from the core mission of connecting people to opportunities,” said Singh. With this in mind, the answer seemed clear. To keep its developers focused on building opportunities for the platform’s users rather than on scaling and managing open-source Kafka, Apna opted for a managed Kafka service.

The Solution: Powering Kafka with Confluent Cloud for an agile, event-driven architecture

To achieve the scalability needed to serve its growing user base and free developers from overseeing Kafka operations, Apna chose Confluent Cloud, a fully managed, cloud-native Kafka service. Since implementing Confluent Cloud in a GCP environment, Apna has largely transitioned away from its monolithic roots and is powering several critical microservices with real-time data streams in the cloud. A few of the company’s highest-impact use cases include:

Job matching

Using an algorithm and real-time analytics, this feature harnesses data that job seekers provide about their skills and experience to intelligently match them with relevant job postings.

Job searching

Allows job seekers to search for the latest job postings using a variety of criteria such as title, company, skills, salary, and location.

Application tracking

Allows employers to search for candidates who have applied to job postings and sort them using multiple attributes.

Community feed

Allows users to build a professional network and interact with their connections in vertical communities.

Data lakehouse

Apna used Confluent Cloud to build a data lakehouse, which generates 20 TB of data per month on Kafka. With this new architecture, Apna can run faster, more advanced analytics on its platform data and deliver customized user experiences in real time. The company uses Confluent’s Schema Registry for schema management and the MongoDB Kafka connector for change data capture in its lakehouse environment.
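The change data capture setup described above can be pictured with a small sketch. It is not Apna's actual configuration: the connector name, connection URI, database, collection, topic prefix, and Schema Registry URL are all hypothetical placeholders, and the property names are those of the self-managed MongoDB Kafka source connector (a fully managed Confluent Cloud connector may use different keys). The function only assembles a Kafka Connect style payload and does not contact any service.

```python
from typing import Dict


def mongo_cdc_source_config(
    name: str,
    connection_uri: str,
    database: str,
    collection: str,
    schema_registry_url: str,
    topic_prefix: str = "cdc",  # hypothetical prefix, not from the article
) -> Dict[str, object]:
    """Assemble a Kafka Connect payload for a MongoDB change-stream source."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri": connection_uri,
            "database": database,
            "collection": collection,
            # Output topics are named <prefix>.<database>.<collection>.
            "topic.prefix": topic_prefix,
            # Look up the full document on updates, then publish only that
            # document instead of the whole change-stream event.
            "change.stream.full.document": "updateLookup",
            "publish.full.document.only": "true",
            # Emit schema-bearing records and serialize them as Avro so the
            # schemas are tracked in Schema Registry.
            "output.format.value": "schema",
            "output.schema.infer.value": "true",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "value.converter.schema.registry.url": schema_registry_url,
        },
    }


def cdc_topic(topic_prefix: str, database: str, collection: str) -> str:
    """Return the topic the source connector would write change records to."""
    return f"{topic_prefix}.{database}.{collection}"


# Placeholder values for illustration only.
payload = mongo_cdc_source_config(
    name="jobs-cdc-source",
    connection_uri="mongodb://mongo.example.internal:27017",
    database="marketplace",
    collection="job_postings",
    schema_registry_url="https://schema-registry.example.internal",
)
topic = cdc_topic("cdc", "marketplace", "job_postings")
```

In a self-managed Kafka Connect deployment, a payload of this shape would typically be submitted to the Connect REST API; a lakehouse sink would then read from the resulting topic.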

Business Results: Faster development, unparalleled reliability, and room for growth

Since adopting Confluent Cloud and migrating away from a monolithic back end, Apna has been seeing the benefits of an event-driven architecture capable of fueling its platform services with real-time data. Some of the highest-impact outcomes include:

Faster time to market for solutions

Building custom connectors from scratch can slow solution development. To mitigate this, Apna takes advantage of Confluent’s selection of more than 120 pre-built connectors, which are designed to integrate Kafka with other applications and data systems without requiring new code.

“We are now able to develop new products and services faster because of how easily Confluent’s connectors allow us to pipe data from our process build databases into Kafka, then load it into our data lake. We’ve been able to build solutions within 3-3.5 months. Previously, it would have taken us at least double that amount of time. Using Confluent’s out-of-the-box tools really helped us to focus on building new things, rather than solving problems in-house that have already been solved for us at scale by Confluent,” explained Singh.

Top-notch reliability

Kafka outages and downtime can severely impact Apna’s platform functionality and the user experience. But with Confluent Cloud’s cloud-native elasticity and 99.99% uptime SLA, Apna can rest easy knowing there will be minimal disruption to their Kafka environment.

“Since moving from a monolithic architecture to an event-driven architecture with Confluent, our queueing system is incredibly stable, and the reliability and availability of the Confluent solution is top-notch,” said Singh.

Scalability for future growth

With Confluent Cloud, Apna is able to scale to support their flourishing user base while offloading the burden of Kafka operations to Confluent’s managed service.

“In an effort to scale, we tried multiple options like Pub/Sub in GCP and other messaging solutions, but we always encountered limitations in terms of capabilities. It was at this point that we decided Kafka is the right technology, but we wanted to enlist the help of Confluent as a managed service provider to prevent complications that can arise with self-managed Kafka. With Confluent, we don’t need to ask ourselves, ‘How do we ensure enough throughput to meet our needs?’ The elastic scaling we’ve seen with Confluent Cloud has helped us a lot,” said Khemka.

Higher developer productivity leads to innovation

With less focus needed for Kafka operations and management, Apna’s developers are able to spend more time innovating. For example, Apna has been able to dedicate time saved on Kafka maintenance to the development of a new learning assessment feature for platform users. This feature aggregates data about users’ self-reported proficiency in a variety of skills and makes intelligent recommendations for learning courses designed to improve those skills. Upon completion of these courses, users are presented with certificates they can share with employers and the community.

“By removing the need to manage Kafka systems in-house, we’re reducing the cognitive load on our developers and allowing them to spend their time building critical business solutions and automating other tasks. With Confluent doing all the heavy lifting for us in terms of Kafka infrastructure maintenance, we’re able to increase the efficiency and productivity of our devs, while providing a better developer experience,” explained Khemka.

The Future: Increased Confluent adoption and improvements to the platform experience

Apna anticipates that as it continues to grow, so will the number of Confluent Cloud users within the company. “The number of Confluent users at Apna has quadrupled in the last year since we onboarded the service. And that’s largely due to the fact that we realized the solution works so well. It’s stable, scalable, and offers reliable performance. Those are all the characteristics we need as we continue our journey from monolithic architecture to microservices and event-driven use cases,” said Khemka.

The Indian company also predicts that Kafka, managed by Confluent Cloud, will soon be at the heart of all of their data systems. “In a few months, Kafka and Confluent Cloud will be like a central nervous system for all of our data. This is very important for us because we want to be able to reliably access and use real-time data to fuel key use cases like our job matching and recommendation features. Our data is our key differentiator,” explained Khemka.

Apna is also focused on developing new community features and partnerships that will help connect job seekers with the training and upskilling resources they need to increase their eligibility for roles in their vertical. The company plans to seamlessly scale Kafka using Confluent Cloud as their use cases multiply. “Data consumption will continue to grow as we build additional platform services and leverage data streaming more and more with Confluent Cloud. We’re really focused on improving our jobs marketplace as a whole. Better upskilling opportunities, better targeting by vertical, better job matching, better everything. It’s an ongoing process that we undertake continuously,” concluded Khemka.

