Over the last decade, financial services companies have doubled down on using real-time capabilities to differentiate themselves from the competition and become more efficient. This trend has had a huge impact on customer experience in banking especially, and home mortgage company Mr. Cooper has built a powerful, multicloud streaming platform to capitalize on the power of Apache Kafka® and its ecosystem of integration and processing tools.
At last week’s stop in Dallas, Data in Motion Tour 2023 attendees heard from Noble Job, Vice President of Enterprise Architecture and Data Engineering, on why Mr. Cooper has made data streaming the default platform for all real-time needs for its various business applications.
Noble sat down with Confluent’s Addison Huddy to talk about how his engineering organization has enabled real-time data processing across Mr. Cooper’s many business systems.
Learn how Mr. Cooper has become a streaming-first organization—and Noble’s vision for the future of data streaming in financial services—in this Q&A recapping their chat.
Addison: One of the things that impressed me about Mr. Cooper is taking this approach of being “streaming-first.” That’s a very difficult step to take as an engineering organization. What do you think it means to be a “streaming-first” organization and what does that look like at Mr. Cooper?
Noble: At Mr. Cooper, we recognize that customers expect prompt notifications when making payments or submitting mortgage applications. However, within our industry, it is common for such notifications to arrive hours later, resulting in high call volumes, increased call center costs, and reduced customer satisfaction.
Regarding mortgage origination, we acknowledge numerous opportunities exist for reducing closing costs and time through automation and real-time decision-making capabilities.
We assessed our data engineering needs as part of our commitment to enhancing the customer experience and leveraging automation and data-driven decision-making. This evaluation revealed that deploying real-time streaming and engineering required more than traditional ETLs and batch processing.
Consequently, we made significant investments to establish a world-class real-time streaming platform, which serves as the default eventing platform, and the central nervous system, across our origination systems.
To facilitate process improvements and enable automation across our loan origination system, we have relied on our real-time streaming platform to promote better data-driven decisions in real time. If we were to rely on batch processing, business events would take considerably longer to reach downstream systems, hindering our business processes' ability to make strategic decisions based on real-time data.
Addison: Having the ability to publish some data and then having multiple downstream systems have real-time notification—that's Kafka in a nutshell. There's also the ability to do a ton of stream processing on top of it, connecting all the different systems together. How else are you using Kafka’s pub-sub and streaming processing capabilities today?
Noble: Mr. Cooper currently employs stream processing extensively in its operations. Over 300 services are running in production that support Tier 1 systems, with nearly 100 managed and self-managed connectors enabling seamless data integration across systems.
To facilitate stream processing, we leverage Kafka Streams and ksqlDB to create several listening fields, with many streaming processors performing real-time change comparison between events. This approach enables us to compare data generated at different times, calculate deltas in real time, and create numerous downstream business events for screening purposes.
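The change comparison Noble describes can be illustrated with a minimal sketch. It is a plain-Python, in-memory stand-in, not Kafka Streams or ksqlDB code and not Mr. Cooper's implementation. The function name `field_deltas`, the event shape, and the sample loan fields are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (key, payload). Real events would come from a topic.
Event = Tuple[str, Dict[str, Any]]
Delta = Tuple[str, Dict[str, Tuple[Any, Any]]]


def field_deltas(events: Iterable[Event]) -> Iterator[Delta]:
    """Yield (key, {field: (old, new)}) each time a key's payload changes."""
    last_seen: Dict[str, Dict[str, Any]] = {}  # latest payload per key
    for key, payload in events:
        previous = last_seen.get(key)
        if previous is not None:
            # Compare every field present in either version of the record.
            changed = {
                field: (previous.get(field), payload.get(field))
                for field in sorted(set(previous) | set(payload))
                if previous.get(field) != payload.get(field)
            }
            if changed:
                yield key, changed  # a derived "change" event
        last_seen[key] = payload


# Illustrative sample data only.
loan_events = [
    ("loan-1", {"status": "submitted", "rate": 6.5}),
    ("loan-2", {"status": "submitted", "rate": 7.0}),
    ("loan-1", {"status": "approved", "rate": 6.5}),
    ("loan-2", {"status": "submitted", "rate": 6.75}),
]
deltas = list(field_deltas(loan_events))
```

In a real deployment the per-key state would typically live in a stream processor's state store rather than a Python dictionary, and each yielded delta would be written to a downstream topic.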
Our stream processing approach allows for faster derivation of business insights from streaming data due to the significant volume of processors performing real-time screening.
We don’t use Confluent Kafka as a queue, but rather as a stream that takes in many events from different sources, performs simple to complex stream processing, and derives many streams, which are then consumed by different applications, reports, and database systems.
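The difference between a queue and a stream can be sketched with a toy in-memory log. This is an illustration under stated assumptions, not a Kafka client: the `EventLog` class, its methods, and the group names are hypothetical. The point it shows is that reading does not remove events, so each consumer group tracks its own position and sees the full stream.

```python
from typing import Dict, List


class EventLog:
    """Toy append-only log; each consumer group keeps its own read offset."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._offsets: Dict[str, int] = {}  # group name -> next index to read

    def publish(self, event: str) -> None:
        self._events.append(event)  # events are retained, never popped

    def poll(self, group: str) -> List[str]:
        start = self._offsets.get(group, 0)
        batch = self._events[start:]
        self._offsets[group] = len(self._events)  # advance this group only
        return batch


log = EventLog()
log.publish("payment-received")
log.publish("application-submitted")

notifications = log.poll("notifications")  # sees both events
reporting = log.poll("reporting")          # also sees both events

log.publish("loan-approved")
notifications_next = log.poll("notifications")  # only the new event
```

With a queue, the second reader would have found nothing left to consume. Here, adding another downstream application only means adding another group name.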
Addison: Processing the data as it comes through is so necessary because the real-time systems you’re describing have very low latency requirements. If you were to stream all that data out to another system, rather than processing it in flight, you’re going to disappoint the customer.
This whole movement of transforming operational and analytical data into multiple types of insights is really being driven by the high expectations that customers have today. What are some of the data streaming capabilities you’re looking forward to implementing? And what impact will these capabilities have on the customer experience that Mr. Cooper delivers?
Noble: Data streaming can significantly impact the customer experience, and Mr. Cooper recognizes this as a primary focus of its engineering organization. As a highly competitive industry, the mortgage sector demands that companies like Mr. Cooper improve their processes to serve customers better and become the most loved home mortgage company.
One example of how data streaming can enhance the customer experience is by triggering events that order all necessary verifications with external partners when a person applies for a loan. The corresponding data can be returned to Mr. Cooper's systems and processed in real time. This connected ecosystem with Kafka and Confluent Cloud enables the company to focus on customer efficiency.
As more business teams want access to raw data as soon as it comes in, we are building an event ingestion and discovery platform with Confluent and other ecosystems within the data space to allow internal users to access data and see how the market fluctuates and how customers respond to different products and services.
To accelerate the adoption of data streaming, our real-time streaming platform engineering team is considering using more Confluent-managed connectors, which will increase the number of producer and consumer systems. As the company continues to mature its data streaming practice, it is also looking at how to govern the use of streaming data across the organization.
Addison: You touched on something that I think is really important that we’ve realized as a product team. Kafka is an amazing platform for enabling event-driven processes at scale—you need that storage and messaging capability that the open-source platform offers. But you also need extreme data governance, especially when handling financial data. We're really excited about developing more connectors, making the integration experience better, and continuing to improve on the new security features.
You've also talked about creating really advanced architectures within your Kafka cluster. Could you share more about these architectures and how they’re allowing you to share topics between clusters?
Noble: At the beginning of last year, we moved some of our data-focused products to the Google Cloud Platform. Google has been a strategic partner for us, so we decided to move our streaming clusters from Azure to GCP as part of that.
We wanted to complete this migration without bringing any downtime to other systems. We worked with Confluent to use Cluster and Schema Linking to migrate the business events, consumer groups, and schemas without downtime. And it was done so much more quickly than we ever expected.
Cluster Linking has also allowed us to achieve a significantly faster time to market. When we have hundreds of topics to migrate and exchange events, we want to avoid building some transformation or extraction logic to share that data with other systems or partner organizations. Instead, Cluster Linking allows us to scalably and reliably move sets of data copies from one company to another while also ensuring that our mission-critical Tier 1 systems don’t experience unexpected downtime.
Addison: I still remember writing the product requirements document (PRD) for that feature more than three and a half years ago. Now, the vision we had for Cluster Linking is catching up with reality as we’re seeing more customers realize value from using it.
I'm a big believer that more and more organizations will be multicloud. I’ve seen massive companies become multicloud overnight because when you’re using massive technologies, multicloud is somewhat of an inevitability. You have people realizing, “Oh no, the entire stack is on AWS and we’re on GCP. What do we do?”
A lot of what you just talked about, having this strong separation between generating and managing a cluster within a given cloud provider – that’s a good call out for people facing this kind of challenge.
With that said, I would describe Mr. Cooper as pretty far along the maturity curve of data streaming. You've been able to do that in a relatively short period of time. What advice do you have for the data streaming community at large on how to move your organization to a streaming-first approach?
Noble: It starts with a change in mindset. Because batch processing has been the norm for so long, when you present the idea of data streaming, many people will say, "Oh, I'm fine with receiving the data 'x' minutes or hours later."
That thinking can be detrimental for organizations that want to achieve maturity with data streaming. In such cases, I've always recommended that organizations identify a data streaming use case that will solve a challenging problem for the business and then find a business partner to help you implement that use case the right way.
Being able to tell other teams, "I can give you the data to solve your customer or business problem in seconds," that's powerful. Once your internal users get a taste of that—getting access to data when it arrives and having the data move through ten different systems or reports in seconds—they'll come to you and ask for more. That's what we did at Mr. Cooper.
Leadership constantly challenges teams to improve processes, efficiencies, and the customer experience. Build a minimum viable product for a high-priority use case and prove how the TCO and ROI compare to what you're doing today. That's how you can demonstrate exactly what the business will gain from real-time data.
Addison: A data streaming platform lends itself so well to a distributed model, but we often see organizations. What can data engineering teams do so that they still have the control necessary to govern data effectively without becoming a blocker?
Noble: For Mr. Cooper, we’ve focused on building a streaming platform that is centrally governed but in a federated partnership model with application engineering teams. To that end, we have a streaming center of excellence (CoE) at Mr. Cooper, with my distinguished engineers and architects running the show. We also include representatives from other business units and our application teams because they are the ones who have to leverage the platform.
I always encourage tech leaders to build a streaming CoE. Once you have this group organized, you can quickly identify and rectify the problems getting in the way of real-time data consumption and processing, prioritize projects, and get the most value from your data streaming platform.
A few key steps that I would recommend to other organizations looking to scale a real-time platform are to establish clear guidelines and standards, implement an event catalog and governance process, foster a culture of collaboration with different application teams, monitor and optimize performance, and enable an observability and monitoring process.
See how innovators like Noble Job and organizations like Mr. Cooper are realizing the value of real-time data with stream processing: sign up to attend the Data in Motion Tour 2023 in a city near you.
To learn more about how data streaming transforms financial services, visit our financial services solutions page.