Hands-on Workshop: ZooKeeper to KRaft Without the Hassle | Secure Your Spot
Command Query Responsibility Segregation (CQRS) is a design pattern that separates the concerns of writing data (commands) from reading data (queries). The application is separated into two distinct data models that can be independently designed and optimized to provide the best performance for both reads and writes. It is often combined with event sourcing—where the commands are translated into stored events, which then asynchronously feed optimized tables for querying.
CQRS is often used in mission-critical systems where performance and scalability are essential. Because the command side is often implemented as a log of events, it can be extremely valuable in industries that are subject to auditing requirements.
That’s why many companies in financial services, the public sector, and other highly regulated industries have adopted Apache Kafka® and Confluent to power their organizations using event-driven communication and real-time data integration. And on Confluent Cloud, developers can build applications that transmit events between any system, in any programming language, all on a single platform.
When building a data model for a system, it is usually necessary to balance the needs of both reads and writes. Most applications tend to be read-heavy so the majority of the traffic will come in the form of reads. However, optimizing for those reads often has a detrimental effect on the performance of the writes.
In a banking system, it is common to store data as a series of transactions, essentially deposits and withdrawals. However, users tend to be more interested in the calculated balance of the account, rather than the individual transactions. When a user goes to the bank to deposit money, typically they want to see the current balance of the account after they make the deposit. They don’t really care about all of the transactions that came before. There are exceptions of course, so those transactions are important, but for the general day-to-day banking it’s the balance that is interesting.
On the withdrawal side, whether using a bank machine, a debit machine, or even a teller, the most important thing is whether the account has the funds available to cover the transaction. Here, again, the balance is what matters, not the individual transactions.
For scenarios where the balance is needed, the ideal situation is if it is stored as a row in a table somewhere. For example, a table containing the account ID and the current balance. This would make it a very quick lookup.
However, when performing the actual transaction, the details need to be written into the database including the date, a transaction ID, the amount, etc. The ideal way to do this would be to have a transactions table that contains a row for each transaction.
This creates a conflict. When reading data, the calculated balance is usually the desired format. When writing data, it’s better stored as individual transactions.
Now, there are ways to accommodate both. They could both be stored in separate tables but updated in a single database transaction. This would ensure they were accurate and allow both tables to be written in the ideal way. However, transactions aren’t free. They create contention in the database which reduces the performance and scalability. We’ve made a compromise, essentially slowing down and complicating the writes to provide faster reads.
Alternatively, the transactions could be stored without the balance. When the balance is requested, a sum can be calculated to get the result. However, this is slower than just doing a lookup in an indexed table. It compromises the speed of the reads in to accommodate the writes.
Of course, there are other possible implementations, but each implementation is making a tradeoff. It either complicates the writes to accommodate the reads, or it complicates the reads to accommodate the writes.
Command-Query Responsibility Segregation (CQRS) is designed to allow both the reads and writes to exist in their ideal format, without compromising either.
CQRS divides the system into two types of operations, commands and queries.
When a user of the system wants to modify its state, they issue a command. The commands are often processed immediately, but they can also be queued for later processing. In the banking example, these commands might be something like “DepositMoney” or “PayBill”.
Queries, on the other hand, never modify the state. They are issued to access the existing state. This could be something like “GetMostRecentTransaction” or “GetCurrentBalance”.
In CQRS, the system can be divided into separate data models for both commands and queries. Commands operate on something known as the write model, while queries operate on the read model. It is common to have many different read models, each looking at the data from a different perspective. This allows the system to fulfill a variety of different queries.
In the banking example, the write model would consist of a series of transactions. These transactions take the form of commands issued to the model and result in a new row being added to the transactions table. There could be a read model that would include a table of account balances. It might include another read model recording the most recent transaction.
This doesn’t necessarily solve the original problem. Each model could be stored in a separate table and they could all be updated as part of a single database transaction, but that complicates the writes and creates additional database contention.
Typically, in a CQRS system, the contention is eliminated using asynchronous updates of the read models.
The most common way to perform an asynchronous update of the read models is through the use of events and event sourcing.
It turns out that for the vast majority of systems, the most efficient way to write data is in the form of an event log. As each command enters the system, it is translated into an event. The “PayBill” command can be translated into a “BillPaid” event (or something more generic such as “MoneyWithdrawn” or “MoneyTransfered”). This event contains the details of the command and can be written to the database in an append-only fashion. Append-only writes are very fast and efficient in most databases, requiring a minimal amount of locking or transactional logic. As long as the insert only touches a single row in a single table, there’s very little to prevent the system from writing as fast as possible.
The technique of recording changes as a series of events in a log is known as event sourcing, and while it’s not strictly required for doing CQRS, it is quite common to see the two of them used together.
To create the read models, a separate process (or series of processes) can consume the event log and perform asynchronous processing, updating the read models as required. When the banking transactions are written to the transaction log, a separate process can read those transactions and use them to calculate a balance. The balance can then be written to a separate table. This is known as a projection. Each read model is a projection of the data that was written into the write model.
Read models can be fully optimized for each individual query. Often, this involves a significant amount of denormalization of the data. The intent is to get the data into a form that is as close to what is needed to fulfill the query as possible. Ideally, all of the data could be read from a single row in a single table, avoiding expensive joins or complex queries.
In fact, it isn’t even necessary to use the same database for the read models. Each model can be placed into a database that is ideal for its individual needs, or the data could even just be stored in memory. This makes the reads as fast as possible.
CQRS provides several key benefits to the system.
Because both the read and write models can be independently optimized, they can be used in the most efficient way possible. This can include reducing the number of table joins, duplicating data, and even relying on alternative databases if required.
CQRS reduces the amount of contention in the database. This can help with scalability. However, it also opens up some new options for scaling. The separation of reads and writes allows for each piece to be deployed and scaled independently. It is possible to create separate microservices to support each of the models if required. Furthermore, even the databases can be separated if necessary.
When using CQRS, each of the individual models tends to be relatively simple and independent. The write model doesn’t need to know anything about the read models, and the read models can be independent from each other. This isolation allows the models to be updated fairly easily, without impacting the rest of the system.
CQRS enables different storage and optimization strategies for reads and writes. The write side could use a NoSQL database, while the read side could use a more traditional RDBMS or in-memory datastore. They can be deployed as a single unit or as individual services. They can even be written in different programming languages if that provides a benefit.
CQRS allows the design of the system to focus more on the business logic and behavior, rather than being encumbered by data access concerns.
While event sourcing and CQRS are not top-level patterns, they are not intended to be used to build an entire business. Instead, these architectural patterns should be isolated to the places that matter most to provide concrete business value. But if these patterns are valuable, why shouldn’t they be used everywhere? The answer is that there are numerous challenges with CQRS, and those challenges might outweigh the benefits in some cases.
A system based on CQRS is arguably more complicated. Rather than a single model, it has multiple models that have to be dealt with. However, each individual model is simpler than a traditional relational model, as are the queries associated with them. Essentially, complexity has been moved from the database into the application. For those used to dealing with relational databases, the transfer of complexity can be difficult to adapt to.
CQRS introduces asynchronous updates between the read and write model. This eventual consistency creates challenges that developers may not be used to dealing with. Every business domain has asynchronous processes built into them. In many traditional systems, the eventually consistent aspects are ignored, patching over them with transactions and strong consistency. CQRS makes the asynchronous aspects explicit so they aren’t as big of a surprise. However, it takes some getting used to.
CQRS allows a lot of flexibility in how systems are built. Models can be deployed as separate services, they can target different databases, and can even use different programming languages. This introduces additional operational overhead. It is important to note that just because this additional flexibility is available, doesn’t mean it should be used. It is usually safer to start with a basic CQRS system and only separate pieces as the business requires it.
CQRS will be unfamiliar to most developers. This means there is a learning curve associated with its use. Make sure the development team understands the concepts and is ready to adopt them before using it.
Whether CQRS is worth it depends on what you need. For example, CQRS is great for a stock trading platform where performance, scalability, and regulatory compliance are paramount.
Here are some factors to consider about when to use it.
CQRS is ideal for systems requiring high throughput and low latency, particularly when read and write operations have divergent performance requirements.
If the business domain involves intricate logic that significantly differs between read and write operations, CQRS can simplify the model.
Systems that require detailed audit trails and compliance with regulatory standards benefit from CQRS due to the clear separation and logging of state changes. In financial services, maintaining a detailed event log of transactions supports auditability and regulatory compliance.
While there are many benefits to CQRS, there are also situations when it doesn’t make sense. For example, CQRS is not a good fit for a small blogging site with moderate traffic and straightforward create, read, update, and delete (CRUD) operations.
Here are some other indications that CQRS is not the right fit.
For simpler applications with straightforward business logic, the added complexity of CQRS may not be justified. A small e-commerce platform with basic inventory and order management might find a monolithic architecture more maintainable.
Applications that do not experience high traffic or where scalability is not critical may not benefit from CQRS. An internal company portal with limited users and low data change frequency might not need CQRS.
If the operational burden and maintenance cost are prohibitive, it may be prudent to stick with traditional approaches. Start-ups or small teams with limited resources might prefer a simpler architecture until scaling becomes necessary.
Still need help deciding? Use this table below.
Criteria | High Justification | Low Justification |
---|---|---|
Performance and Scalability | High transaction volume, low latency needed (example: stock trading platform) | Moderate traffic, simple performance needs (example: small blogging site) |
Domain Complexity | Complex read/write logic (example: healthcare system) | Simple and uniform business logic (example: basic to-do list application) |
Audit and Compliance Needs | Strict audit and compliance requirements (example: financial services platform) | Minimal audit requirements (example: internal company portal) |
Operational Complexity Tolerance | Team capable of managing complexity (example: large enterprise team) | Team prefers simplicity (example: startup with small team) |
Development Team Scalability | Large team with resources (example: enterprise-level project) | Small team (example: small project) |
Many CQRS-based systems use asynchronous events to update the read models. These read models might live in the same microservice, but they often don’t. Confluent provides tools to transmit events between multiple systems written in different programming languages. This includes tools to manage event streams and ensure they remain compatible throughout the system.