Will LaForest, Field CTO at Confluent, recently sat down with Anne Steptoe and Srdjan Pejic from Wealthsimple, a firm that’s disrupting the traditional investment management landscape in Canada.
Wealthsimple focuses on providing its clients with smart, simple investing solutions, as well as retail banking and crypto services. As the company looked for new ways to provide better, faster services, Wealthsimple turned to data streaming in order to democratize data access across its organization and accelerate transaction processing.
Find out how Wealthsimple is using data streaming, powered by Confluent, to deliver a better experience for its clients in this Q&A recap.
Will: Let’s talk about some of the real-time use cases that your teams work on today. What are your roles, and how are you using Confluent’s Apache Kafka® service to implement real-time capabilities today?
Anne: As VP of Infrastructure at Wealthsimple, I oversee our IT Operations, Infrastructure, and Developer Platform Engineering teams. Together, our mission is to make sure that employees have the right standards, tools, and technologies they need to do their best work, which includes building, running, and scaling the underlying infrastructure for our back-end systems and real-time products.
Srdjan: I’m a senior software developer on the Book of Record team, which handles ingesting and managing transaction data from various services, as well as maintaining a general ledger of all transactions within Oracle.
It’s our responsibility to make sure that all of the real-time transaction data is compliant, accurate, and ready for downstream teams to use within their products or services. For example, the numbers that you see in the Wealthsimple app first come through my team.
These kinds of Wealthsimple features—managed investment accounts, stock trades, crypto, and banking services—are all built using microservices that sync their respective transaction data into a general ledger.
Many producers from various products and services continuously send data into a single Kafka topic. Then a single, big consumer reads and publishes that data to the general ledger. As a result, we can provide a stream of real-time data to synchronize statement processes across multiple customer-facing products and back-end services.
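To make the fan-in pattern Srdjan describes concrete, here is a minimal sketch in which several services write to one shared stream and a single consumer applies every event to a ledger. It is an illustration only: an in-memory `Queue` stands in for the Kafka topic, and the `Transaction` fields, service names, and account IDs are hypothetical, not Wealthsimple's actual data model or code.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from queue import Queue


@dataclass(frozen=True)
class Transaction:
    source: str       # hypothetical producing service, e.g. "trading"
    account_id: str   # hypothetical account identifier
    amount: Decimal   # positive = credit, negative = debit


def produce(topic: Queue, source: str, account_id: str, amount: str) -> None:
    """Any service can append an event to the one shared topic."""
    topic.put(Transaction(source, account_id, Decimal(amount)))


def consume_into_ledger(topic: Queue) -> dict:
    """A single consumer drains the topic in order and updates balances."""
    ledger = defaultdict(Decimal)  # Decimal() is zero, so new accounts start at 0
    while not topic.empty():
        tx = topic.get()
        ledger[tx.account_id] += tx.amount
    return dict(ledger)


# In-memory stand-in for the Kafka topic (an assumption for this sketch).
topic = Queue()
produce(topic, "banking", "acct-1", "100.00")
produce(topic, "trading", "acct-1", "-40.25")
produce(topic, "crypto", "acct-2", "15.00")

balances = consume_into_ledger(topic)
print(balances)
```

Funneling every write through one ordered stream means the ledger sees a single sequence of events, which is what lets downstream products agree on the same balances.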
Will: Clearly, the connectivity piece is really important to Wealthsimple. How many of your existing data streams are being read by multiple consumers?
Srdjan: Quite a few, actually. Most of the data products we use keep us up to date with the price balance at any given point in time. That data is actually fed through Kafka from the general ledger.
So, if your app makes a balance observation, and then the user adds some money to their disposable account, that data then goes to the ledger, comes back out, and is reconciled so that you have that money ready and available for trading.
Will: Given that sort of dependency on real-time data, what made you want to start with Confluent versus self-managing Kafka at Wealthsimple?
Anne: One of our core values is “Ship it,” so it’s really important that our product engineering teams are able to move quickly. Three years ago, when we began evaluating whether to go with Confluent or open-source Kafka, we realized that we really didn’t want to deal with managing Kafka clusters.
Choosing Confluent allowed us to focus on our core competencies in building great financial services and have our engineers work on actually building features and products rather than trying to train or hire full-time employees to manage Kafka.
Will: Could you talk about some of the Confluent-specific capabilities and features that are important to Wealthsimple?
Srdjan: The first one that really attracted us was Schema Registry. One of the main reasons for having Kafka and Confluent as part of the infrastructure was that producers and consumers need to have an API contract. And one of the best ways to create an API contract is using a schema-based approach.
Having schemas that could be managed as they evolved was really important to us. Then, connecting producers through a schema-based API allows you to easily add features to or remove features from a product while maintaining consistent data quality and standardization.
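As a rough illustration of what "schemas that could be managed as they evolved" can mean in practice, the sketch below checks whether a new schema version can still read data written with the previous one. It is a deliberately simplified model of a backward-compatibility rule (new fields need defaults, existing fields keep their type), not the Schema Registry API or its full rule set. The `Field` type, the `backward_compatible` helper, and the example field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

_NO_DEFAULT = object()  # sentinel: distinguishes "no default" from default=None


@dataclass(frozen=True)
class Field:
    type_name: str
    default: Any = _NO_DEFAULT

    @property
    def has_default(self) -> bool:
        return self.default is not _NO_DEFAULT


Schema = Dict[str, Field]


def backward_compatible(old: Schema, new: Schema) -> List[str]:
    """Return the reasons `new` cannot read data written with `old`.

    Simplified rules (an assumption of this sketch):
    - a field added in `new` must carry a default value;
    - a field present in both must keep the same type;
    - a field dropped from `new` is fine, since readers just ignore it.
    """
    problems = []
    for name, field in new.items():
        if name not in old:
            if not field.has_default:
                problems.append(f"new field '{name}' has no default")
        elif old[name].type_name != field.type_name:
            problems.append(f"field '{name}' changed type")
    return problems


v1 = {"account_id": Field("string"), "amount": Field("string")}
v2 = {**v1, "currency": Field("string", default="CAD")}  # safe addition
v3 = {**v1, "memo": Field("string")}                     # no default: unsafe

print(backward_compatible(v1, v2))
print(backward_compatible(v1, v3))
```

A check like this, run before a producer ships a new version, is one way a schema can act as a contract: consumers keep working while fields are added or removed.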
Will: It sounds like Schema Registry became the API for all your event-driven services today. Can you talk more about what you were able to do with data streaming before making the transition to Confluent?
Anne: For one of our key microservices, we used to rely on polling to get data. At that time, we could only process 2,000 order fills per minute, which really limited the company’s ability to provide competitive trading services for our clients.
We then moved that microservice to Kafka a few months before the GameStop craze in 2021. Once we moved to Confluent and used event streaming for that microservice, we were able to process more than 18,000 order fills in a minute. That was obviously a huge advantage for us when it came to serving our clients’ needs.
Will: As far as the technology side, what are some of the most impactful benefits that you realized with Confluent?
Srdjan: What’s really great is that using Confluent has allowed us to scale topics really easily for those use cases that need it. We have a few topics that are used for multiple use cases and products, and therefore have to handle a high load. With Confluent, it’s fairly easy to scale those out very quickly.
And being able to leverage Kafka’s replayability has been amazing for us. We have the assurance of knowing that the data is there without us having to do anything or worry about it during incidents and high-stress situations where we need to quickly scale up or down.
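The replayability mentioned here can be pictured with a small sketch: because records stay in an append-only log, a consumer that lost its state can re-read from an earlier offset and rebuild it. The `Log` class below is an in-memory stand-in written for illustration, not a Kafka client, and the amounts (in cents) are made up.

```python
from typing import List


class Log:
    """Append-only record log; reading never removes records."""

    def __init__(self) -> None:
        self._records: List[int] = []

    def append(self, amount_cents: int) -> int:
        """Store a record and return its offset."""
        self._records.append(amount_cents)
        return len(self._records) - 1

    def read_from(self, offset: int) -> List[int]:
        """Return all records at or after the given offset."""
        return self._records[offset:]


def rebuild_balance(log: Log, from_offset: int = 0) -> int:
    """Recompute a balance by replaying records from an offset."""
    return sum(log.read_from(from_offset))


log = Log()
log.append(10000)   # deposit of 100.00
log.append(-4025)   # debit of 40.25
log.append(1500)    # deposit of 15.00

first_pass = rebuild_balance(log)
# Simulate an incident: consumer state is lost, so replay from offset 0.
replayed = rebuild_balance(log, from_offset=0)
print(first_pass, replayed)
```

Since reading does not consume the data, the second pass arrives at the same balance as the first.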
Anne: Also, the level of support has been incredibly impactful. If we have any issues, we can take advantage of Confluent’s responsive managed service rather than trying to fix things and manage Kafka ourselves.
Will: I know your teams at Wealthsimple are long-time users of data streaming. Can you give us a sense of some of the upcoming projects that you are looking at tackling next with data streaming, as well as some of the lessons you’ve learned along the way?
Anne: We still need to move some of our Amazon SQS use cases into Kafka. That’s something our engineering teams are working on now. Additionally, we have multiple products that we’ll be launching where we need real-time fraud analytics, so that’s something we’ll be looking to Confluent to help us implement.
Srdjan: Also, we’re aiming to get everyone at Wealthsimple to ship their data to the general ledger through real-time data streams, so we’re able to get data in and out very quickly. That’s the mandate of my team. As a growing tech company, taking in and iterating on data quickly is essential, and getting more teams to use real-time data streams to share and consume data is a big part of that.
We’ve built up expertise in showing teams how to leverage this strong API created with Confluent. So now, it’s just a matter of getting more teams to publish data to the general ledger.
Will: Any specific pearls of wisdom or lessons learned from your time using data streaming that you think others would really benefit from hearing?
Anne: I think we should have done more training on how to use Confluent, so we could have used it correctly and made the most of its features from the beginning. We’ve improved a lot over the past three years, and we’ve established some guardrails so teams are set up to use the platform in the best way.
Srdjan: And from a development point of view, while I think Schema Registry is really good as an API guideline, I think there needs to be room for evolution over time. Confluent users need to really spend time investigating and learning the ins and outs of data governance.
So I recommend digging into the documentation on Confluent’s website and then diving in to learn by trial and error the best ways to organize schemas to govern data for your organization.
There’s so much more to learn about data streaming and the use cases that are transforming the financial services industry. To hear from data streaming experts, industry thought leaders, and innovators, register to attend Current 2023: The Next Generation of Kafka Summit, September 26-27 in San Jose, CA.