Migrate from Kafka services to Confluent | Download Step-by-Step Guide

Building a Real-Time Service Marketplace with Confluent Cloud

Written By

As the world’s population continues to increase, so does the need for new infrastructure, buildings and housing, automatically increasing the demand for tradespeople. Whether it's an electrician, a plumber, or an HVAC technician, most of us have engaged a tradesperson at some point for our house, office, or other building and have been through the pain of finding the right person or company to do the job with best-in-class delivery. 

Ever imagined what pains a tradesperson goes through? Searching for a new job, managing multiple job schedules, generating the right quotations, ordering the right materials, and sending customer invoices are a few to name.  

In today's dynamic and interconnected world, the need for efficient and seamless communication between customers and tradespeople has become even higher. This requires innovation as well as seamless orchestration of real-time interactions in order for trade businesses to optimize ROI and efficiency. Whether you're a homeowner looking for skilled professionals or a tradesperson seeking new opportunities, harnessing the power of real-time event streaming can revolutionize how these connections are established and maintained.

Imagine a real-time service marketplace where job postings, bids, updates, and feedback flow effortlessly, ensuring that every interaction is timely, relevant, and impactful. This is where Confluent comes into play as a pivotal tool in architecting a robust and responsive tradesperson ecosystem. The data being collected is not only used for daily operations but its by-product, analytics, can be used to empower trade teams at every part of the job cycle, from making quick decisions to coaching and delivering quality jobs.

This blog outlines the challenges and solutions implemented by a popular service marketplace, detailing how Confluent Cloud, a fully managed Apache Kafka® solution, is used as a centralized streaming platform for event-driven processing. We will explore some of the advanced techniques like data integrity, security, and real-time analytics used in the platform, which enabled the creation of a scalable, responsive, and reliable ecosystem for tradespeople and their clients.

Challenges with the legacy platform

Traditionally, the service platform was a big monolith built on legacy architecture, which struggled with scalability and performance, especially as the user base grew and more tradespeople and customers joined. Outdated technologies lacked the flexibility and features required to meet modern user expectations, like real-time communication between tradesperson and customer, sharing geo-location, or a feedback and rating system to help customers make informed decisions. 

Job scheduling is the core of the platform, but there were issues creating an efficient and accurate algorithm that matches the right tradesperson with the right job, something with a high degree of complexity. It requires considering factors like location, skills, availability, and customer preferences. Handling last-minute changes, cancellations, and rescheduling adds another layer of difficulty.

The legacy nature of the payments system involved a lot of manual operations. There were concerns around security, efficiency, and compliance with regulations. The platform was not capable of handling various payment methods and integration with third-party services, such as payment gateways, mapping services, and background check providers. Ensuring smooth integration and maintaining compatibility with evolving APIs was a technical challenge.

Finally, maintenance and updates became complex with this large service marketplace, including security patching to adhere to user privacy protocols.

Addressing these technical challenges requires a skilled development team and careful planning to create a successful and user-friendly platform. Regular updates and improvements are necessary to keep the platform competitive in terms of features and technical infrastructure.

Requirements for a real-time service marketplace 

As a web-based solution, the service marketplace allows any customer to post a job for any location, to be done at a time of their convenience. On average, the marketplace gets nearly 50k job requests every day, globally. From job postings, bid placements, quotes, and offers, to status updates and feedback loops, there are millions of events generated every day within the platform, all orchestrated through Kafka's event-driven architecture.

1. Job Postings

Customers can post job requests in real time. Tradespeople should be promptly notified about new opportunities, ensuring timely responses.

  • Real-Time Decision-Making: Tradespeople need to quickly respond to job postings in order to win contracts. The platform should have low latency to ensure swift responses.

  • Scalability: As the number of job requests increases, the platform should scale seamlessly to handle the data volume, ensuring jobs are not left unattended. 

2. Quotes and Payments

Customers expect transparency when it comes to money and negotiations. A dynamic service marketplace should instantaneously reflect updated prices, apply discounts and marketing promo codes, generate quotations, provide clarity on profit margins, and track payments. Modernization of the finance function is the baseline that helps automate manual tasks. 

  • Data Enrichment and Transformation: The platform should have the capability to provide data enrichment by generating events to dynamically modify prices and track payments for a particular job, which enhances the user experience.

3. Job Scheduling 

There’s a popular saying in the trade business: “The bigger the project, the higher its chances of failing due to scheduling errors.” The scheduling engine must be efficient in taking both customer and tradesperson data into account while generating events specific to appointments, leads, etc. 

  • Data Accuracy: This is a basic expectation from the platform. Examples include: Aligning various team members to avoid delays, ensuring their availability matches with the customer’s availability, and scheduling jobs efficiently so that tradespeople can do more in a day.

  • Data Integrity: This ensures superior quality of service. Otherwise, if a contractor were double-booked or jobs were set in the wrong order (e.g., a painter is scheduled before the brick wall is built), delays or failure to deliver would shatter customer trust.

The scheduling engine also has features such as live voice, chat, and customizable online booking services that make it easy for service contractors to deliver convenient, reliable customer experiences and book more qualified jobs – anytime, anywhere.

4. Real-Time Status Updates

Today’s customers expect real-time updates. They track their packages, rides, and even pizzas via mobile apps. Isn’t it time they track their service professionals, too? A highly valuable feature is the ability to track the GPS location of the technician’s truck to know their estimated time of arrival, including SMS alerts to confirm the appointment, ask questions, etc.

  • Handling Real-Time Events: The platform should provide real-time visibility by sharing GPS location* and instant messaging to confirm or reschedule appointments. This provides a better customer experience, and for the trade teams, it optimizes route planning and workforce coverage.

*Location-based marketing and pricing are other features that can be added in future.

5. Feedback and Insights

Feedback loops are essential for quality assurance. The platform should also have the ability to aggregate job history, bids, and feedback to provide valuable insights for both customers and tradespeople. Tradespeople can use this data to coach their team and improve the overall experience.

  • Data-Driven Insights: Feedback and retrospective are vital to assess the project, ensure completion, and derive any lessons learned and best practices to be applied to future projects, enabling better decision-making and improved user experiences.

Lastly, resiliency is key for providing continuous operation – for customers to request new jobs,  for the scheduling engine to schedule and assign jobs, and for tradespeople to track their latest schedules. The platform must also be able to handle a growing user base and increase event traffic while maintaining high performance and availability along with ease of deployment.

Technical solution deep dive

An earlier version of the service marketplace was a monolith. To meet modern and real-time business requirements, it is being broken into independent and self-contained microservices that allow for easier deployment, testing, and maintenance. 

Today, Kafka is the de facto technology for event-driven microservices Confluent Cloud offers a fully managed Kafka service that provides infinite storage and enterprise-ready event streaming features such as data lineage, schemas, and advanced security — all while being more than 10x more elastic than open-source Kafka. 

To support a growing number of business use cases on this real-time service marketplace, the organization decided to move from open-source Kafka to fully managed Confluent Cloud. Workloads can be seamlessly migrated from OSK to Confluent Cloud using Confluent’s Replicator with little to no downtime or duplicate processing.

Confluent Cloud solution

The service marketplace is currently built on hundreds of SQL Server databases, so real-time datastore integration and data consolidation are needed. Data is read from these databases to Kafka topics using services like HVR/Qlik. Confluent provides a MySQL CDC Source Connector, though CDC is not a requirement in this case. Data enrichment is done using KSQL queries, which is then stored in MongoDB using Confluent’s fully managed MongoDB Sink Connector.

Kafka's publish-subscribe model enables seamless communication between customers and tradespeople. As soon as a customer posts a new job, the producer instantly sends a message to a topic. Multiple consumers can subscribe to this topic and process the event, and notifications are instantly sent to several tradespeople about the new jobs. High fanout ratio is needed in this architecture, something which is inherently supported on Confluent Cloud.

key:
{
  job_id:'jobId'
}
value:
{
 id:'jobId',
 customer_id: 'customerId',
 service_option: ['electrician','painting','carpenter','plumbing'.....],
 contractor_id: 'contractorId'
 job_location: 'address'
 job_startDate: date,
 job_targetEndDate: date,
 total_price: jobValue,
 currency: 'USD/AUD/INR....'
 ...
}

Value example
key: {'job_id': '34219d2ss3'} 
value: {'id': '34219d2ss3', 'customer_id': '102', 'service_option': 'plumbing', 'contractor_id': '2204', 'job_location': '183 Maple Drive', 'job_startDate': '08/18/2023', 'job_targetEndDate': '08/20/2023','total_price': 1000, 'currency': USD}

key: {'job_id': '74598e2wx4'}
value: {'id': '74598e2wx4', 'customer_id': '103', 'service_option': 'carpenter', 'contractor_id': '2203', 'job_location': '64 David Court', 'job_startDate': '01/25/2024', 'job_targetEndDate': '02/20/2024', 'total_price': 25000, 'currency': AUD}

Different topics are defined for different types of events (e.g., job status, quotes, payments, feedback, and sending notifications to customers). During the lifecycle of a job, hundreds of events are generated and processed using Confluent’s multi-AZ standard clusters on Azure, for example. The elastic nature of Confluent clusters, backed with infinite retention, alleviates operational concerns by reducing the time and cost to rebalance, expand or shrink clusters, and move data from databases to Kafka topics for processing, ensuring a smooth user experience with low-latency. Multi-AZ clusters provide resiliency and disaster recovery capabilities.

Different components of the platform use different data formats or versions. Ensuring seamless data serialization and compatibility between producers and consumers is crucial to prevent data compatibility issues when evolving the platform. Stream Governance on Confluent Cloud ensures stream quality and maintains a steam catalog across teams and different versions of producers and consumers.

Historical data, when combined with real-time information, increases accuracy and overall quality of customer experiences. Both real-time and historical data retained in topics can be analyzed to provide data-driven insights (e.g., avg. number of jobs completed, delays, manpower used) to drive timely decision-making, best practices, and greater customer satisfaction.

Financing, accounting, and R&D microservices receive frequent updates, generating a high volume of real-time event streaming data. These microservices write records to Kafka compacted topics to keep the latest state of the financial proceedings and store these records until payment settlement. Upon completion of payments, a tombstone message* is sent to the topic for its permanent deletion from the topic. (*A tombstone is a message with the same key and null payload, produced to the same topic and partition to delete previous record of this key from the topic.)

One microservice in the customer’s platform writes GPS data to Kafka, performs stream processing, and sinks to MongoDB using Confluent’s fully managed MongoDB Sink Connector. Using Stream-table joins of ksqlDB, truck location is tracked in real time and customers can view this on their mobile devices.

-- Create Table of JobLocation
CREATE TABLE JOBLOCATION (
    ID VARCHAR PRIMARY KEY,
    CUSTOMERNAME VARCHAR,
    EMAIL VARCHAR,
    ADDRESS VARCHAR,
    DEST_LAT DOUBLE,
    DEST_LONG DOUBLE,
    REPORTING_TIME BIGINT
) WITH (
    KAFKA_TOPIC = 'joblocation',
    VALUE_FORMAT = 'JSON',
    KEY_FORMAT = 'KAFKA',
    PARTITIONS = 6
);

-- Create Vehicle Stream
CREATE STREAM VEHICLES (
    ID VARCHAR KEY,
    JOB_ID VARCHAR,
    STATE VARCHAR,
    LAT DOUBLE,
    LONG DOUBLE
) WITH (
    KAFKA_TOPIC = 'tradie_truck_location',
    VALUE_FORMAT = 'JSON',
    KEY_FORMAT = 'KAFKA',
    PARTITIONS = 6
);

-- Create a truck tracking table for vehicle location and ETA.
CREATE TABLE TRUCK_TRACKER WITH (
    KAFKA_TOPIC = 'truck_tracker',
    PARTITIONS = 6
) AS
SELECT
    J.ID JOB_ID,
    LATEST_BY_OFFSET(V.ID) VEHICLE_ID,
    LATEST_BY_OFFSET(V.LAT) LAT,
    LATEST_BY_OFFSET(V.LONG) LONG,
    LATEST_BY_OFFSET(J.DEST_LAT) DEST_LAT,
    LATEST_BY_OFFSET(J.DEST_LONG) DEST_LONG,
    LATEST_BY_OFFSET(ROUND(GEO_DISTANCE(CAST(V.LAT as DOUBLE), CAST(V.LONG as DOUBLE), CAST(J.DEST_LAT as DOUBLE), CAST(J.DEST_LONG as DOUBLE), 'KM'), 2)) DISTANCE_FROM_DESTINATION,
    LATEST_BY_OFFSET(ROUND(GREATEST(ABS(V.LAT - O.DEST_LAT), ABS(V.LONG - O.DEST_LONG)) / (0.5 / 10 / 10) * 2, 2)) ETA_SECONDS
FROM VEHICLES AS V
JOIN JOBLOCATION AS J
ON ((V.JOB_ID = J.ID))
GROUP BY J.ID
EMIT CHANGES;

-- Fetch latest location data from truck tracking table
SELECT * from TRUCK_TRACKER;

Azure Cosmos DB (Postgres) was used to store change feeds. Now, streaming data pipelines can be built to ingest data from this DB to Kafka topics using Confluent’s fully managed PostgreSQL CDC Source Connector. ksqlDB transforms the raw CDC events from Azure CosmosDB into domain/business events suitable for downstream writes to MongoDB and Snowflake. An audit microservice can consume the transformed data for auditing purposes.

Lastly, the most critical piece of the business, the scheduling engine, uses Kafka as the backbone for messaging between microservices. All customer data, tradesperson data, contracts, and events specific to appointments, leads, etc. generate a lot of records that need to be updated in real time in order to align customer and tradesperson mapping correctly. Confluent’s infrastructure facilitates the movement of data between the frontend and backend microservices.

With the breadth of the solution, Confluent’s Terraform provider is used for deployment in CI/CD pipelines managing all components of Confluent, including clusters, topics, schemas, and connectors.

This modern streaming architecture has transformed a previously monolithic platform into a real-time service marketplace where security, scalability, flexibility, and maintainability are at its core, with greater decentralization of data management and responsiveness to changing business needs.

Conclusion: Unleashing developer productivity and driving revenue

One of the success measures for Confluent Cloud in this use case is to ensure that records are processed in order for each tenant and the chosen platform alleviates any noisy neighbor concerns where high volume from one tenant can cause processing delays for other tenants. (Read more: How to maintain message ordering and no message duplication). The organization has successfully validated that application performance has substantially enhanced using Confluent Cloud.

Remember that building a real-time service marketplace involves designing a complex distributed system. It's crucial to thoroughly plan the architecture, consider fault tolerance, and properly handle data consistency and integrity to ensure a reliable and efficient platform. Confluent Cloud’s operational excellence has reduced developers’ effort in managing infrastructure and they can now focus on building new features which keeps the product ahead of the competition.

Kafka's event streaming capabilities enable efficient microservice communication and data flow. Building a service marketplace platform using Confluent Cloud has helped this organization create a scalable and real-time means of connecting more customers with tradespeople while driving greater productivity and revenue. 

Resources: 

Ready to harness the power of your real-time data? 

  • Arpita Agarwal is a Senior Solutions Engineer at Confluent. Arpita comes with a wealth of experience from Enterprise software companies and Start-ups (all acquired) where she has architected and developed AI/ML based products from scratch. Most of her recent products were heavily based on Kafka, which she has been using since 2017 to build high-performance platforms with large scale data. She uses her in-depth knowledge of application architecture and Kafka in the current role, helping customers in ANZ build event-streaming platforms using Confluent.

Did you like this blog post? Share it now