Every 2 seconds, another person becomes a victim of identity theft, and the number of online account takeovers is constantly increasing. In this talk we'll show how stream processing was used to combat this for Tesco, one of Europe's largest retailers. The massive scale of e-commerce makes it an attractive target for malicious users. We implemented a risk-management platform built around Kafka and the Confluent Platform to detect and prevent attacks, including those that come through the website's authentication page.

We'll present how this project evolved over 2 years to its current state in production, together with some of the challenges we encountered along the way. Because the project went through a couple of phases, we will compare alternative designs, summarize their pros and cons, and relate them to well-known techniques such as Event Sourcing. We'll discuss the architecture and integration with external systems, before moving on to a detailed examination of the stream processors' implementation and key internals such as co-partitioning of data. We'll also cover the role of the stack components we used, including Kafka Connect and Schema Registry, as well as the deployment platform, Kubernetes. Throughout the talk we will highlight the key factors to consider when designing data pipelines and stream processing platforms.