According to the World Bank, around 56% of the world’s population currently lives in cities. By 2050, it’s predicted this will rise to 70% (with the urban population more than doubling in the same time frame). This acceleration toward urbanization is putting city infrastructure under enormous strain. Congested highways, unreliable public transport, failing waste management systems, and sporadic urban planning are just a few consequences of our collective influx into cities.
Some cities, however, have turned to technology in order to deal with the challenges of growing urban populations. These have become known as “smart cities,” and they leverage advancements in sensor technologies, cloud computing, and data infrastructure to better serve residents, visitors, and businesses.
In this blog, we’ll focus on the data infrastructure element of the smart city revolution. We’ll start by looking at some real-world examples of smart city applications and their impact, before considering the challenges they pose in terms of data infrastructure. We’ll then narrow the focus to the topic of data streaming, demonstrating how Confluent addresses these challenges and enables city administrations to harness real-time data for better public services.
While this blog contains some technical details, we hope it’s equally informative for nontechnical readers with an interest in the role of real-time data in smart city applications.
At the core of all smart city applications is data. With the proliferation of new sensor technologies to capture data, and new ways to process and transmit that data, a myriad of smart city applications have emerged to improve various aspects of city life. Here are a few examples.
The fluid movement of people and goods is fundamental to the well-being of a city, and data plays an increasingly important role in facilitating it. Intelligent Transportation Systems (ITS) capture data via connected video cameras, radio frequency identification (RFID) tags, and other automatic identification and data capture (AIDC) technologies (to name just a few sources) in order to deliver a wide range of use cases.
An accurate, real-time view of traffic flows, for instance, can be used as the basis of smart traffic light systems or route optimization platforms. These tools respond to traffic conditions (i.e., quantity of traffic, incidents, inclement weather, etc.) as they occur, and modify driver behavior via instructions or recommendations to ensure that travel continues to flow as seamlessly as possible. Similar technologies have also been applied to parking (e.g., barrierless parking lots), toll road management (e.g., automated payments), and public transportation management (e.g., real-time fleet tracking and route planning).
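To make the idea concrete, here’s a minimal sketch of the kind of rule an adaptive signal controller might apply, extending a green phase in proportion to the queue detected by roadside sensors. The timings and thresholds are illustrative assumptions, not a real ITS algorithm.

```python
# A toy adaptive-signal rule: longer detected queues earn longer green
# phases, capped at a maximum. All values here are illustrative.
def green_phase_seconds(queued_vehicles: int,
                        base: float = 20.0,
                        per_vehicle: float = 1.5,
                        max_green: float = 60.0) -> float:
    """Return the green-phase duration for the current signal cycle."""
    return min(max_green, base + per_vehicle * queued_vehicles)

print(green_phase_seconds(4))   # 26.0 seconds for a short queue
print(green_phase_seconds(40))  # 60.0 seconds (capped) at peak congestion
```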
In Wales, Japan, and Germany, advances in data collection and transfer have led to trials of a new form of public transport called “demand-responsive transportation (DRT).” With DRT, travelers request a ride to a certain destination via a mobile app and are matched with fellow passengers traveling to a similar place (in the same way as UberX Share). This can help to reduce the number of underutilized public transport vehicles on the road while enabling travelers to get exactly where they need to be.
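As a rough illustration of the matching idea behind DRT, the sketch below groups ride requests whose destinations fall within a small radius so they can share one vehicle. The coordinates, radius, and vehicle capacity are all illustrative assumptions.

```python
# A toy destination-matching pass for demand-responsive transportation:
# riders heading to nearby destinations are pooled into shared vehicles.
import math

requests = [
    ("ana",   (51.48, -3.18)),   # (rider, destination lat/lon) - made up
    ("bryn",  (51.49, -3.17)),
    ("carys", (51.62, -3.94)),
]

def match_shared_rides(requests, radius_km=2.0, capacity=4):
    groups = []
    for rider, dest in requests:
        for group in groups:
            anchor = group[0][1]
            # ~111 km per degree of latitude; crude distance, fine for a sketch
            if math.dist(anchor, dest) * 111 <= radius_km and len(group) < capacity:
                group.append((rider, dest))
                break
        else:
            groups.append([(rider, dest)])
    return groups

for group in match_shared_rides(requests):
    print([rider for rider, _ in group])  # ['ana', 'bryn'] then ['carys']
```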
Smart city applications like these, enabled by various technical innovations, are making urban travel faster, cheaper, and safer, while reducing its environmental impact.
The same advances have also led to smart city applications aimed at reducing the energy consumption of public infrastructure. “Smart street lighting,” for example, reduces the demand for electricity by activating only when needed (i.e., when lights detect a person or vehicle in the vicinity) and in proportion to the luminosity of natural light (i.e., the intensity of light changes according to whether it’s bright or cloudy). It achieves this by collecting data via light, motion, and weather sensors on each lamp, and processing that data at the edge in order to respond to conditions in the moment.
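Here’s a minimal sketch of the decision logic a single lamp might run at the edge, assuming it can read a motion flag and an ambient-light level. The thresholds are illustrative, not taken from any specific deployment.

```python
# A toy edge-processing rule for a smart streetlamp: stay off in daylight,
# hold a dim standby level when nothing is nearby, and scale brightness
# inversely with natural light when motion is detected.
def target_brightness(motion_detected: bool, ambient_lux: float) -> float:
    """Return lamp output as a fraction of full power."""
    if ambient_lux > 80.0:          # bright daylight: lamp off
        return 0.0
    if not motion_detected:
        return 0.2                  # dim standby level
    # Darker, more overcast conditions get proportionally more illumination.
    return min(1.0, 0.4 + (80.0 - ambient_lux) / 100.0)

print(target_brightness(motion_detected=True, ambient_lux=10.0))  # 1.0 at night
```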
The implications of such systems are significant. In Barcelona, for example, the city council has reduced its urban lighting energy consumption by 30% as a result of its smart lighting system (which has now been operational for over a decade).
Similar efficiency gains are being achieved in waste management organizations. LoRaWAN-based smart sensors, for instance, can measure the fill level and content weight of individual waste bins, and communicate this to a centralized database. This data can then be processed in order to automatically optimize the routes of waste collection trucks, ensuring that they only collect bins that are full. This has the effect of reducing unnecessary bin collections (saving time and money), while also reducing the chance of overflowing bins on streets. In San Francisco, a smart bin trial led to an 80% reduction in the number of overflowing trash cans.
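The planning step can be sketched in a few lines: keep only the bins whose reported fill level crosses a threshold, then visit them in a simple nearest-neighbor order. The field names, coordinates, and 80% threshold below are assumptions; a production system would use a proper routing engine.

```python
# A toy collection planner over smart-bin telemetry: filter full bins,
# then order pickups greedily by distance from the depot.
import math

bins = [
    {"id": "bin-17", "fill_pct": 92, "lat": 37.77, "lon": -122.42},
    {"id": "bin-18", "fill_pct": 35, "lat": 37.78, "lon": -122.41},
    {"id": "bin-19", "fill_pct": 81, "lat": 37.76, "lon": -122.43},
]

def plan_route(bins, depot=(37.75, -122.44), threshold=80):
    to_collect = [b for b in bins if b["fill_pct"] >= threshold]
    route, pos = [], depot
    while to_collect:
        nxt = min(to_collect, key=lambda b: math.dist(pos, (b["lat"], b["lon"])))
        route.append(nxt["id"])
        pos = (nxt["lat"], nxt["lon"])
        to_collect.remove(nxt)
    return route

print(plan_route(bins))  # ['bin-19', 'bin-17'] - bin-18 is skipped as not full
```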
IoT innovation is also being applied further downstream at waste sortation centers. It’s now possible to combine terahertz and infrared sensors with HD camera feeds on sortation lines in order to correctly identify the material, quality, and age of waste items (degraded materials have less recyclable value). Once items are identified via data generated and processed “at the edge” by these devices, they’re then physically diverted by jets of pressurized air onto the relevant waste sortation belt. Such a process, enabled by the flow of real-time data from different devices, significantly reduces manual intervention in sortation (again, saving time and money) while increasing the amount of material that is ultimately recycled.
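A simplified sketch of that classify-and-divert step, assuming fused sensor readings arrive per item (the feature names, material classes, and thresholds are all illustrative):

```python
# A toy classify-and-divert step for a sortation line: map sensor
# signatures to a material class, then pick the belt for that class.
def classify_item(terahertz_sig: float, infrared_sig: float) -> str:
    """Map sensor signatures to a material class (toy rules)."""
    if terahertz_sig > 0.7:
        return "PET_plastic"
    if infrared_sig > 0.5:
        return "paper"
    return "residual"

BELT_FOR_MATERIAL = {"PET_plastic": 2, "paper": 3, "residual": 0}

def divert(item_id: str, terahertz_sig: float, infrared_sig: float) -> None:
    material = classify_item(terahertz_sig, infrared_sig)
    belt = BELT_FOR_MATERIAL[material]
    # In a real line, this is where the air-jet actuator would fire.
    print(f"{item_id}: {material} -> belt {belt}")

divert("item-001", terahertz_sig=0.82, infrared_sig=0.1)  # PET_plastic -> belt 2
```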
Learn how to build streaming data pipelines →
It’s clear that IoT devices and the data they capture are fundamental to smart city applications. Perhaps less clear, though, is the role of their underlying data infrastructure. An organization's choice of message broker and data processing framework, in particular, can have a significant impact on the functionality of a smart city application. While each solution will have its own unique challenges, these are the most common when considering IoT applications as a whole.
Smart city applications rely on large volumes of heterogeneous data, processed in real time, in order to function. Integrating this data can be challenging for certain technologies and protocols. MQTT (Message Queuing Telemetry Transport), the standard messaging protocol for IoT applications, is perfect for lightweight communication between thousands of clients on unreliable networks, but it wasn’t designed for data integration. It needs to be paired with a technology that was, one that also supports stream processing. Without stream processing, data integration is pushed to batch jobs on downstream datastores, which prevents heterogeneous data sources from being integrated and processed for applications that require “soft real time” responses (i.e., within milliseconds or seconds).
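To make the integration gap concrete, here’s a minimal sketch of the common bridge pattern: a paho-mqtt subscriber forwards sensor messages into Kafka via a confluent-kafka producer, where they become available for stream processing. The broker addresses, topic names, and payload shape are placeholders.

```python
# A minimal MQTT-to-Kafka bridge sketch. Requires paho-mqtt (>= 2.0) and
# confluent-kafka; all endpoints and topics below are illustrative.
import json
import paho.mqtt.client as mqtt
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})  # placeholder

def on_message(client, userdata, msg):
    # Re-key each MQTT message by sensor ID so Kafka partitioning preserves
    # per-sensor ordering for downstream stream processing.
    payload = json.loads(msg.payload)
    producer.produce(
        topic="city.sensors.raw",                      # hypothetical topic
        key=str(payload.get("sensor_id", "unknown")),
        value=json.dumps(payload),
    )
    producer.poll(0)  # serve delivery callbacks without blocking

mqttc = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
mqttc.on_message = on_message
mqttc.connect("mqtt.example.city", 1883)               # hypothetical broker
mqttc.subscribe("city/sensors/#")
mqttc.loop_forever()
```

In practice, this bridging is usually handled by a managed connector rather than hand-rolled code, as shown later in this post.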
The generation of data within smart city applications can fluctuate greatly. During peak commute times, for example, smart sensors connected to transport systems generate significantly more data than during off-peak times. Message brokers need to scale elastically in order to meet this variable demand. When brokers do fail, solid disaster recovery mechanisms need to ensure that applications continue without disruption. And finally, brokers need to be deployable in a number of environments: at the edge, in data centers, and across public (where possible) and private clouds. This deployment flexibility helps to prevent data silos across an organization and allows engineering teams to choose the most appropriate environment for each application.
A growing number of organizations are addressing these challenges by adopting data streaming: the practice of continuously processing data as it is generated or received and making it available for real-time triggers or analysis. Apache Kafka® is the de facto technology for data streaming and is used by over 70% of Fortune 500 companies.
Confluent, which is based on Apache Kafka and powered by the Kora engine, is the enterprise-ready data streaming platform that is cloud-native, complete, and available everywhere. Confluent is used alongside other technologies to deliver various smart city applications.
One such technology is MQTT. Working together, Confluent and MQTT can stream, process, and govern smart sensor data from thousands (or hundreds of thousands) of clients in real time in order to deliver smart city applications. Below is a high-level example that demonstrates how.
In this example, different sources of data are integrated and processed in real time in order to provide emergency vehicles with navigation recommendations based on current road and traffic conditions.
Data generated by CCTV cameras is transmitted via 5G to the organization’s application gateway hosted on a private or public cloud, alongside crowd-sourced information (i.e., incident reports), weather reports, the current position of the emergency vehicle, and the requested destination. This data is then streamed to Confluent Cloud via an MQTT Source connector.
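For illustration, the connector configuration might look something like the following (shown as a Python dict). The property names follow Confluent’s fully managed MQTT Source connector, but treat the whole block as a placeholder and consult the connector documentation for the authoritative list.

```python
# A hedged sketch of an MQTT Source connector configuration for Confluent
# Cloud. Every value below is a placeholder.
mqtt_source_config = {
    "connector.class": "MqttSource",
    "name": "city-traffic-mqtt-source",           # hypothetical connector name
    "kafka.auth.mode": "KAFKA_API_KEY",
    "kafka.api.key": "<API_KEY>",
    "kafka.api.secret": "<API_SECRET>",
    "mqtt.server.uri": "ssl://gateway.example.city:8883",  # hypothetical gateway
    "mqtt.topics": "city/traffic/#",
    "kafka.topic": "traffic.events.raw",          # hypothetical Kafka topic
    "tasks.max": "1",
}
# In practice, this configuration would be submitted as JSON via the
# Confluent CLI or the Connect API.
```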
Once the data is on Confluent Cloud, Apache Flink is used to integrate and process the different streams of data in order to calculate current levels of congestion across the road network and detect whether a vehicle should be redirected based on those conditions. If so, a route calculation service determines the best route given current traffic conditions and transmits it back in near real time to the vehicle’s on-board navigation system.
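As a sketch of what that processing could look like, the Flink SQL below (shown as a Python string) computes per-segment average speeds over one-minute tumbling windows and flags vehicles whose current segment is congested. The table names, columns, and the 15 km/h threshold are assumptions.

```python
# A hedged sketch of the Flink logic. On Confluent Cloud this statement
# would typically be run in a Flink SQL workspace; it is shown here only
# to illustrate the shape of the processing.
congestion_sql = """
SELECT
  v.vehicle_id,
  v.segment_id,
  c.avg_speed_kmh,
  c.avg_speed_kmh < 15 AS should_reroute          -- hypothetical threshold
FROM vehicle_positions AS v
JOIN (
  -- Per-segment average speed over one-minute tumbling windows
  SELECT segment_id,
         AVG(speed_kmh) AS avg_speed_kmh
  FROM TABLE(
    TUMBLE(TABLE traffic_observations, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
  GROUP BY segment_id, window_start, window_end
) AS c
ON v.segment_id = c.segment_id
"""
```

Rows where `should_reroute` is true would then be handed to the route calculation service described above.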
This infrastructure allows emergency vehicles to reach their destinations in the shortest possible time, increasing the chances of a positive outcome. It also reduces the load on emergency call centers by automating the navigation process, providing recommendations based on information captured, collated, and processed via Confluent’s data streaming platform.
From a technical perspective, there are multiple advantages of building an adaptive navigation system on Confluent, including:
Broad integration – Confluent integrates with a wide range of technologies. Aside from supporting different IoT protocols, Confluent offers over 80 fully managed connectors, helping to future-proof the platform as new smart city technologies emerge.
Low latency – Powered by the Kora engine, Confluent streams and processes data from diverse sources in “soft real time” (i.e., at a sufficiently low latency for most smart city applications). This is critical for use cases, like adaptive navigation, that require responses within seconds or milliseconds.
Resilience – Apache Kafka is fault tolerant by design; its distributed architecture allows it to elastically scale to handle large volumes of data and avoid downtime. Data can be replicated across multiple brokers, meaning that applications aren’t disrupted in the event of a failure. Confluent provides 10x the resiliency of open source Apache Kafka, with a 99.99% uptime SLA and geo-replication via Cluster Linking.
As more of us live in urban areas, the role of smart city applications will only become more important. Driven by advances in sensor and networking technologies, these applications will touch almost every aspect of our lives, from how we travel around a city to how we access public services.
At the center of this transformation is data streaming. As more data is generated by a growing range of IoT sensors, data streaming will provide the means for processing it in real time, powering the applications which regulate urban life.
As a cloud-native, complete data streaming platform which is available “everywhere” (on-premises and in every major cloud provider), Confluent is the foundation of smart city applications.
To learn more about Confluent’s role in IoT applications, please see the following resources: