
What is a Distributed System?

A distributed system, also known as distributed computing, is a collection of independent components located on different machines that communicate in order to operate as a single unit.

In this complete introduction, learn how distributed systems work, some real world examples, basic architectures, the benefits and disadvantages, and common solutions for distributed messaging/streaming.

Founded by the original creators of Apache Kafka, Confluent is a complete data streaming platform for real-time data integration, stream processing, and analytics that connects 120+ data sources.

What is a Distributed System, and How Does it Work?

Distributed System Definition

A distributed system, also known as distributed computing and distributed databases, is a set of independent components located on different machines that exchange messages with each other to accomplish common tasks.

In this way, the distributed system appears to the end user as a single interface or a single computer. The goal is for the overall system to maximize resources and information while preventing failures. If one system fails, the overall availability of the service is not affected.

Today, data is more distributed than ever, and modern applications no longer run in isolation. The vast majority of products and applications rely on distributed systems.

How Distributed Systems Work

The most important functions of distributed computing are:

  • Resource sharing - hardware, software, or data can be shared
  • Openness - how openly the software is designed to be developed and shared
  • Concurrency - multiple machines can process the same function at the same time
  • Scalability - how computing and processing capacity grows when the system is extended to many machines
  • Fault tolerance - how easily and quickly failures in parts of the system can be detected and recovered from
  • Transparency - how much access one node has to locate and communicate with other nodes in the system

Modern distributed systems have evolved to include autonomous processes that might run on the same physical machine, but interact by exchanging messages with each other.
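As a minimal sketch of message passing, assume a single worker that only talks to the rest of the program through two queues. The `worker` and `run_demo` names are hypothetical, and a Python thread stands in here for an autonomous process that could just as well run on another machine.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical node: squares each number it receives until told to stop."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel meaning "no more work"
            break
        outbox.put(message * message)


def run_demo(values):
    """Send values to the worker as messages and collect its replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    node = threading.Thread(target=worker, args=(inbox, outbox))
    node.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)
    node.join()
    return [outbox.get() for _ in values]
```

The two sides share no variables directly. Everything they know about each other arrives as a message, which is the property that lets the same design be spread across machines.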

Examples of Distributed Systems


The earliest example of a distributed system appeared in the 1970s, when Ethernet was invented and LANs (local area networks) were created. For the first time, computers could send messages to other systems with a local IP address. Peer-to-peer networks evolved, and email and then the Internet as we know it continue to be the biggest, ever-growing examples of distributed systems. As the Internet moved from IPv4 to IPv6, distributed systems have evolved from “LAN”-based to “Internet”-based.

Telecommunication networks

Telephone and cellular networks are also examples of distributed networks. Telephone networks have been around for over a century and started as an early example of a peer-to-peer network. Cellular networks are distributed networks with base stations physically distributed across areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.

Distributed Real-time Systems

Many industries use real-time systems that are distributed locally and globally. Airlines use flight control systems, Uber and Lyft use dispatch systems, manufacturing plants use automation control systems, and logistics and e-commerce companies use real-time tracking systems.

Parallel Processing

There used to be a distinction between parallel computing and distributed systems. Parallel computing was focused on how to run software on multiple threads or processors that accessed the same data and memory. Distributed systems meant separate machines with their own processors and memory. With the rise of modern operating systems, processors and cloud services these days, distributed computing also encompasses parallel processing.

Distributed artificial intelligence

Distributed artificial intelligence uses large-scale computing power and parallel processing to learn from and process very large data sets using multiple agents.

Distributed Database Systems

A distributed database is a database spread across multiple servers and/or physical locations. The data can either be replicated or duplicated across systems.
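To make replication concrete, here is a minimal sketch that assumes an in-memory key-value store where every write is copied to all online replicas. The `ReplicatedStore` class and its methods are hypothetical and not taken from any real database.

```python
class ReplicatedStore:
    """Hypothetical key-value store that copies every write to all replicas."""

    def __init__(self, replica_count: int) -> None:
        self.replicas = [dict() for _ in range(replica_count)]
        self.online = [True] * replica_count

    def write(self, key, value) -> int:
        """Apply the write to every online replica; return how many accepted it."""
        accepted = 0
        for index, replica in enumerate(self.replicas):
            if self.online[index]:
                replica[key] = value
                accepted += 1
        return accepted

    def read(self, key):
        """Return the value from the first online replica that holds the key."""
        for index, replica in enumerate(self.replicas):
            if self.online[index] and key in replica:
                return replica[key]
        raise KeyError(key)
```

For example, after `store = ReplicatedStore(3)` and `store.write("user:1", "Ada")`, setting `store.online[0] = False` still lets `store.read("user:1")` succeed from another replica. The sketch leaves out how a replica that was offline catches up on writes it missed, which is a large part of the work in a real system.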

Most popular applications use a distributed database and need to be aware of whether the distributed database system is homogeneous or heterogeneous.

A homogeneous distributed database means that each system has the same database management system and data model. Such databases are easier to manage, and their performance scales by adding new nodes and locations.

Heterogeneous distributed databases allow for multiple data models and different database management systems. Gateways translate the data between nodes. These setups usually arise as a result of merging applications and systems.

Distributed System Architecture

Distributed systems must have a network that connects all components (machines, hardware, or software) together so they can transfer messages to communicate with each other.

  • That network could be addressed by IP, run over cables, or even sit on a circuit board.
  • The messages passed between machines contain forms of data that the systems want to share, such as databases, objects, and files.
  • How reliably messages are communicated, including whether each one is sent, received, and acknowledged, and how a node retries on failure, is an important feature of a distributed system.
  • Distributed systems were created out of necessity as services and applications needed to scale and new machines needed to be added and managed. In the design of distributed systems, the major trade-off to consider is complexity vs performance.
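The acknowledgement and retry behavior mentioned in the list above can be sketched as follows. This is an illustration under stated assumptions, not a real transport: `send_with_retry` and `make_flaky_channel` are hypothetical names, and the channel is a fake that drops a fixed number of messages before acknowledging one.

```python
def send_with_retry(send, message, max_attempts=3):
    """Call send(message) until it returns True (an acknowledgement).

    `send` is a hypothetical transport function supplied by the caller.
    Returns the number of attempts used, or raises if none was acknowledged.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


def make_flaky_channel(failures_before_success):
    """Build a fake channel that drops the first N sends, then acknowledges."""
    state = {"remaining": failures_before_success}
    delivered = []

    def send(message):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return False  # message lost, so no acknowledgement comes back
        delivered.append(message)
        return True

    return send, delivered
```

With `send, delivered = make_flaky_channel(2)`, calling `send_with_retry(send, "hello")` succeeds on the third attempt. In practice a sender usually also waits between attempts, and the receiver has to tolerate duplicates, because a message can arrive even when its acknowledgement is lost.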

To understand this, let’s look at types of distributed architectures, pros, and cons.

Types of Distributed System Architectures:

Distributed applications and processes typically use one of four architecture types below:


Client-server: In the early days, distributed systems architecture consisted of a server acting as a shared resource, such as a printer, database, or web server. It had multiple clients (for example, users behind computers) that decided when to use the shared resource, how to use and display it, how to change data, and when to send it back to the server. Code repositories like Git are a good example, where the intelligence is placed on the developers committing changes to the code.

Today, distributed systems architecture has evolved with web applications into:

  • Three-tier: In this architecture, the clients no longer need to be intelligent and can rely on a middle tier to do the processing and decision making. Most of the first web applications fall under this category. The middle tier can be thought of as an agent, possibly stateless, that receives requests from clients, processes the data, and then forwards it on to the servers.
  • Multi-tier: Enterprise web services first created n-tier or multi-tier systems architectures. This popularized application servers that contain the business logic and interact with both the data tiers and the presentation tiers.
  • Peer-to-peer: There is no centralized or special machine that does the heavy lifting and intelligent work in this architecture. All the decision making and responsibilities are split up among the machines involved, and each can take on client or server roles. Blockchain is a good example of this.
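The three-tier split described above can be sketched in a few lines. The class names, the account data, and the withdrawal rule are all hypothetical; the point is only that the client stays thin, the middle tier holds the decision making without keeping state between requests, and the data tier only stores values.

```python
class DataTier:
    """Hypothetical storage server holding account balances."""

    def __init__(self):
        self._balances = {"alice": 100}

    def get_balance(self, user):
        return self._balances[user]

    def set_balance(self, user, amount):
        self._balances[user] = amount


class MiddleTier:
    """Stateless business logic: validates a request, then calls the data tier."""

    def __init__(self, data_tier):
        self._data = data_tier

    def withdraw(self, user, amount):
        balance = self._data.get_balance(user)
        if amount <= 0 or amount > balance:
            return {"ok": False, "balance": balance}
        self._data.set_balance(user, balance - amount)
        return {"ok": True, "balance": balance - amount}


def thin_client(middle_tier, user, amount):
    """The client only sends a request and formats the reply for display."""
    reply = middle_tier.withdraw(user, amount)
    status = "accepted" if reply["ok"] else "rejected"
    return f"{status}: balance is {reply['balance']}"
```

Because `MiddleTier` keeps nothing between calls, several copies of it could sit in front of the same data tier, which is one reason this layout scales more easily than placing the logic in every client.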

Pros and Cons of Distributed Systems

Advantages of Distributed Systems:

The ultimate goal of a distributed system is to enable scalability, performance, and high availability of applications. Major benefits include:

  • Unlimited horizontal scaling (new machines can be added whenever required).
  • Low latency (machines that are geographically closer to users can serve them faster).
  • Fault tolerance (if one server or data center goes down, others can continue to serve users).
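Horizontal scaling and fault tolerance can be illustrated with a small request router. This is a sketch under simple assumptions: `LoadBalancer` is a hypothetical class, servers are plain names, and health is tracked by hand in a set instead of by real health checks.

```python
import itertools


class LoadBalancer:
    """Hypothetical round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._cursor = itertools.count()

    def add_server(self, name):
        """Horizontal scaling: a new machine joins the rotation."""
        self.servers.append(name)

    def route(self):
        """Return the next healthy server, or raise if none is available."""
        for _ in range(len(self.servers)):
            candidate = self.servers[next(self._cursor) % len(self.servers)]
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy servers")
```

Marking a server as down with `balancer.down.add("a")` sends later requests to the remaining machines, and `balancer.add_server("c")` adds capacity without touching the clients.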

Disadvantages of Distributed Systems:

Every engineering decision has trade-offs. Complexity is the biggest disadvantage of distributed systems. There are more machines, more messages, and more data being passed between more parties, which leads to issues with:

  • Data Integration & Consistency

Synchronizing the order of changes to data and application state in a distributed system is challenging, especially when nodes are starting, stopping, or failing.

  • Network and Communication Failure

Messages may not be delivered to the right nodes, or may arrive in the wrong order, which leads to a breakdown in communication and functionality.

  • Management Overhead

More intelligence, monitoring, logging, and load balancing functions need to be added for visibility into the operation and failures of the distributed system.
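One common way to cope with messages that arrive out of order or more than once is to number them at the sender and reorder them at the receiver. The text above does not prescribe this technique. The sketch below assumes sequence numbers start at 0, and `OrderedReceiver` is a hypothetical class.

```python
class OrderedReceiver:
    """Hypothetical receiver that releases messages in sequence-number order."""

    def __init__(self):
        self.next_seq = 0  # sequence number expected next
        self.pending = {}  # out-of-order messages waiting for earlier ones

    def receive(self, seq, payload):
        """Accept one message; return the payloads that are now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate of something already seen, ignore it
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

If message 1 arrives before message 0, the receiver holds it back and then releases both in order once message 0 shows up. The cost is extra state and bookkeeping on every node, a small instance of the complexity trade-off described in this section.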

How Distributed Streaming Platforms Can Help

Confluent is the only data streaming platform for any cloud, on-prem, or hybrid cloud environment. Connect 120+ data sources with enterprise grade scalability, security, and integrations for real-time visibility across all your distributed systems.
