When you are focused on protecting more than 500 million people worldwide, you’re always exploring avenues that can get you to the next frontier of empowering individuals against cyber threats.
So when leadership realized that the existing monolithic architecture was restricting McAfee from scaling its operations and driving innovation (monoliths inevitably become more complex over time), I was brought in, in February 2021, to lead a replatforming strategy. This meant modernizing our existing on-prem infrastructure and migrating to a cloud-native microservices architecture.
Modern businesses are aware of the many benefits this approach can drive, including better scalability, improved fault isolation, reusability across the business, and faster time to market.
Decoupling these microservices is key to success because partitioning an application into independent services lets a business isolate failure zones. An event-driven approach to microservices provides a simple and powerful way to achieve this decoupling. It lets each system and each area scale independently, builds in resiliency, and ensures higher availability.
Event-driven microservices also allow us to react to events in real time—allowing us to instantaneously detect security breaches and protect our customers. Today, we see our event-driven microservices architecture as the central nervous system of our company—helping us power all other initiatives.
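To make the decoupling idea concrete, here is a minimal in-process sketch in Python. It is not McAfee's implementation: the `EventBus` class, the `threat.detected` topic, and the handler names are all hypothetical, and in a real deployment a broker such as Kafka would sit between separately deployed services. The point it illustrates is that the producer knows nothing about its consumers, and one failing consumer does not stop the others from reacting to the event.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-memory stand-in for an event broker (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> List[str]:
        """Deliver the event to every subscriber; return names of handlers that failed."""
        failed: List[str] = []
        for handler in self._subscribers[topic]:
            try:
                handler(event)
            except Exception:
                # A failing consumer must not affect the producer or other consumers.
                failed.append(handler.__name__)
        return failed


alerts: List[str] = []
audit_log: List[dict] = []


def alert_on_threat(event: dict) -> None:
    # Reacts as soon as the event arrives; the severity threshold is an assumption.
    if event["severity"] >= 8:
        alerts.append(f"ALERT user={event['user_id']} type={event['type']}")


def broken_consumer(event: dict) -> None:
    raise RuntimeError("simulated outage in one service")


def record_audit(event: dict) -> None:
    audit_log.append(event)


bus = EventBus()
bus.subscribe("threat.detected", alert_on_threat)
bus.subscribe("threat.detected", broken_consumer)
bus.subscribe("threat.detected", record_audit)

failed = bus.publish(
    "threat.detected",
    {"user_id": "u-1", "type": "phishing", "severity": 9},
)
```

After the publish call, the alerting and audit handlers have both run even though the middle consumer raised, which is the fault isolation described above in miniature.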
Although replatforming sounds like a great idea, the initiative itself comes with huge hurdles requiring buy-in from all parts of the business. Based on my experience, five key strategies can get buy-in from leadership and convince them that migrating to an event-driven microservices architecture is the right choice for your organization.
While replatforming means you spend money upfront, ultimately, you'll be saving hundreds of millions of dollars for the company over the next three to five years.
Our transition from monolith to event-driven microservices is saving us significantly in hosting costs alone, for example.
But leadership isn’t going to wait years to see savings, which is why it’s important to show the business immediate benefits with small wins along the way. Focus on the business capabilities a microservices architecture will help drive, including increasing business velocity (accelerating the speed of delivering new features, functionalities, and products to market) and achieving high availability of systems (because downtime can result in lost customers).
For example, we built a new small business product recently in about three months—which would have taken us nine to 12 months to build with our old system. Wins like this will help leadership gauge the long-term benefits of implementing such a transition.
To sum it up, delineate how embracing a microservices architecture will ensure faster time to market, reduce the cost of maintenance, and help your business keep up with technology changes.
At McAfee we do a lot of our work through partners, including internet service providers (ISPs), retailers, and original equipment manufacturers (OEMs). Enabling a partner used to take us weeks or even months just to get started, even when the work didn't require significant customization.
We needed to show the business how moving to event-driven microservices would help us standardize, which ultimately improves efficiency, and shorten partner onboarding time.
Second, as a company, we were trying to shift our focus from device centricity to user centricity. This required a shift in mentality: Instead of thinking about protecting our users' devices, we needed to think about protecting each user as a person, including their identity online and everything else. So, we needed to build our business case to demonstrate how our old systems were hindering us from making that shift, and how this new architecture would help drive the right business model.
Back in 2012, I was brought into Walmart to drive a similar initiative: getting Walmart.com e-commerce out of the monolith and into microservices. For a business the size of Walmart, it had to work on day zero. And it did, when we turned on the system and started taking traffic in 2014.
You cannot disrupt the business and say something might not work. That's not an option. And that holds true for any business, irrespective of its size.
The foundation and principles in these scenarios are centered around cloud-native microservices architectures that allow for the isolation of concerns (reducing the blast radius of any change and reducing regression cycles), scalability, supporting high velocity of change, testability/automation, availability, and monitorability.
Microservices, when done correctly, as in event-driven architectures, drive decoupling, break dependencies, and allow for independent scaling of different parts of the software ecosystem. You will encounter problems and hiccups along the way, but with these small, decoupled building blocks, you will be able to fix those issues quickly and roll forward. Most fixes will involve either extending existing APIs or introducing missed functionality through additional microservices.
If you take a shortcut and take on technical debt, that's OK, provided you come back and fix it quickly, and provided that debt is isolated so the rest of the system does not have to change when you retire it.
In our microservices transformation journey, we also undertook a significant modernization of the legacy stack. We changed out pretty much everything: the language the tech stack was written in, the level of automation in provisioning and deployment (CI/CD, infrastructure as code), and the move to containerization and orchestration, all to maximize the use and benefits of cloud-native stacks and architectures.
My initial step as we undertook this journey was to make the business realize that modernizing the technology stack was no longer an option but a necessity.
Another important aspect of the transition from monoliths to microservices is to think of it in phases. Phasing gives you confidence-building checkpoints and deliverables along your transformation journey, proves business benefit early, and sustains support for the work still to come by constantly reinforcing what you set out to achieve.
New business initiatives/deliverables on the new architecture allow you to prove the architecture, and help with the rest of the transformation.
Think about your migration from the old to the new system: it has to be seamless to the end customer. Build bridges to the old system first, then knock those bridges down over time, removing the crutches one by one until the migration is complete.
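One common way to build such bridges is a routing facade, often called the strangler pattern. This post does not say which mechanism McAfee used, so treat the following Python sketch as one possible shape under that assumption. `MigrationRouter`, `legacy_monolith`, `billing_service`, and the capability names are hypothetical.

```python
from typing import Callable, Dict, Set

Service = Callable[[dict], str]


def legacy_monolith(request: dict) -> str:
    # Stand-in for the old system, which still handles everything by default.
    return f"legacy handled {request['capability']}"


def billing_service(request: dict) -> str:
    # Stand-in for a newly built microservice.
    return f"billing-service handled {request['capability']}"


class MigrationRouter:
    """Facade in front of both systems; callers never see which one answers."""

    def __init__(self, legacy: Service) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Service] = {}

    def migrate(self, capability: str, service: Service) -> None:
        """Knock down one bridge: route this capability to the new service."""
        self._migrated[capability] = service

    def rollback(self, capability: str) -> None:
        """Restore the bridge if the new service misbehaves."""
        self._migrated.pop(capability, None)

    def handle(self, request: dict) -> str:
        service = self._migrated.get(request["capability"], self._legacy)
        return service(request)

    def fully_migrated(self, capabilities: Set[str]) -> bool:
        """True once every listed capability is served by a new service."""
        return capabilities.issubset(self._migrated)


router = MigrationRouter(legacy_monolith)
before = router.handle({"capability": "billing"})
router.migrate("billing", billing_service)
after = router.handle({"capability": "billing"})
untouched = router.handle({"capability": "licensing"})
```

Because the caller only ever talks to the router, each capability can move (or move back) without the end customer noticing, and the legacy system can be retired once `fully_migrated` holds for every capability.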
Apart from running platform engineering and protection technology, I also lead the architecture team at McAfee. A big part of that role is selecting the appropriate technology stacks, guiding design decisions, and helping decide what we buy, and when we buy instead of build.
We are heavy users of managed services in the public clouds (we use both AWS and GCP), we use a lot of open source, and we license the best fit in the platform jigsaw puzzle of solutions that we stitch together. We also have a technology or solution-agnostic approach through our platform layer that allows us to swap out these underlying IaaS and PaaS building blocks without impacting the services and applications.
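A platform layer of this kind usually comes down to services depending on a small interface rather than on a vendor SDK. The Python sketch below shows the idea with a hypothetical `EventPublisher` interface and two toy backends. None of these names come from McAfee's platform, and a real backend would wrap an actual client library instead of a list.

```python
from typing import Dict, List, Protocol


class EventPublisher(Protocol):
    """The only eventing surface that services are allowed to depend on."""

    def publish(self, topic: str, payload: bytes) -> None: ...


class InMemoryPublisher:
    """Toy backend A: groups payloads by topic."""

    def __init__(self) -> None:
        self.sent: Dict[str, List[bytes]] = {}

    def publish(self, topic: str, payload: bytes) -> None:
        self.sent.setdefault(topic, []).append(payload)


class LinePublisher:
    """Toy backend B: a different storage shape behind the same interface."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def publish(self, topic: str, payload: bytes) -> None:
        self.lines.append(f"{topic}:{payload.decode()}")


class EnrollmentService:
    """Application code; it never imports or names a concrete backend."""

    def __init__(self, publisher: EventPublisher) -> None:
        self._publisher = publisher

    def enroll(self, user_id: str) -> None:
        self._publisher.publish("user.enrolled", user_id.encode())


backend_a = InMemoryPublisher()
backend_b = LinePublisher()

# Swapping the building block is a change at construction time only.
EnrollmentService(backend_a).enroll("u-42")
EnrollmentService(backend_b).enroll("u-42")
```

The service code is identical in both cases, which is what lets an underlying IaaS or PaaS component be replaced without touching the services and applications built on top of it.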
I really encourage all of you to take a serious look at managed services. AWS provides much of what we need as managed services: databases such as Amazon RDS and Amazon DynamoDB, plus managed Prometheus, Amazon CloudWatch, and more. We also use managed SaaS solutions outside of our public cloud providers, such as Auth0 for customer identity and access management, and Confluent.
I am going to use Confluent to highlight the benefits of using managed services. Previously, we had different flavors of eventing, including open source Kafka, Amazon Kinesis, and Azure Event Hubs, and our goal was to consolidate. We chose to go with Confluent, for several reasons.
First, I had an architect on the team who had prior experience with using Confluent.
Second, I did not want to spend our engineering cycles trying to run and manage Kafka. Confluent Cloud was a perfect fit for us because it reduced the number of people we needed to manage Kafka, and it reduced the Kafka skills gap that I was seeing. Confluent allowed my engineers to focus on value-added tasks and actually build the protection capabilities that we offer as a company.
Third, we found the right fit with Confluent, specifically with the level of governance and its security posture. And most importantly, it enables us to take scalability to the next level.
Go through a decision tree for every piece of technology you intend to use, as we did, and document it. Doing so clarifies the why and lets you come back later to justify or revisit the decision.
Want to learn more about how McAfee is driving a successful monolith to event-driven architecture transition? Then watch the webinar to take a deep dive into best practices for a successful transformation—and get a sneak peek into our future roadmap.