Build Predictive Machine Learning with Flink | Workshop on Dec 18 | Register Now
Ever dealt with a misbehaving consumer group? Imbalanced broker load? This could be due to your consumer group and partitioning strategy!
Once, on a dark and stormy night, I set myself up for this error. I was creating an application to demonstrate how you can use Apache Kafka® to decouple microservices. The function of my “microservices” was to create latte objects for a restaurant ordering service. It was set up a little like this:
I wanted to implement this in Kafka by using consumers, each reading from a common coffee topic, but with their own partition. Now this was a naive approach. Why?
Well, let’s say it’s pumpkin spice latte season. In the United States, that’s a period of time lasting from August to December during which the pumpkin spice coffee flavor undergoes overwhelming demand at cafes. This would mean the pumpkin spice-flavored coffee order events would dramatically increase.
Since I had each consumer in the same group for one coffee topic, every coffee ‘microservice’ consumed only from a single partition, ignoring Kafka’s mechanism for parallelism. Eventually, I can expect a hot spot in my pumpkin spice partition, which means that some of my brokers would be slow to distribute their pumpkin spice—and others would not be used to their full potential.
How do we solve this issue? Instead of having every consumer in one group, and organizing the different latte flavors by message key, I could send each latte flavor to its own multithreaded topic, and the load for latte orders would then be balanced across partitions, like so:
Now, technically, I could keep all the consumers in the same group, consuming from different topics. But in this case, it’d be best practice to give each consumer its own group. That way, I could scale these topics separately by adding more consumers to a specific group.
Let’s step back a little bit and consider consumer group strategy in general.
One of the more foundational concepts of consumer groups is that each consumer, once assigned to a group, shares the workload. You can read more about this in our blog post about configuring consumer Group IDs, but basically, when each consumer has the same Group ID, they cannot read from the same partition:
Now, you don’t always need to reach for a single consumer group, like I had done at first in my latte project… think through your use case! Sometimes, you may have two consumers (each in different consumer groups, allowing for parallel processing) for two different services, like a customer address service and a customer delivery notification service, that would need to read from the same partition in the same topic. These two consumers in different groups reading from the same topic can pick up reading from different offsets, which they would not be able to do if Kafka used a queue rather than a persistent log.
However, if you do want your consumers to read from the same topic, you need to consider a couple of things: 1) the number of consumers in your group and 2) the number of partitions. This consideration is important because the consumers will share the workload as equally as possible among themselves. Sometimes making this decision is easy—I’m writing a demo app in Kafka Streams with 1 consumer and 1 partition right now because I just need to show developers how to use a .process()
function in Kafka Streams. But in most use cases, this decision takes careful consideration.
Considerations in favor of fewer consumers in relation to a high number of partitions include things like how high you want your consumer throughput to be since each partition is given one thread per consumer. You might also want room to increase parallelism later on.
Considerations against having fewer consumers to a higher number of partitions include things like higher unavailability in the case of unclean failure since leader election will take longer.
If you’re interested in reading more about consumer strategy in relation to partitions, Jun Rao has an excellent blog post on the subject.
Let’s take a look at what consumer group configuration looks like in JS, Java, and Python.
Here’s the example of consumer group configuration in KafkaJS that we saw above:
In this example, the group.id
is set when the function that creates the consumer is called.
In the Python Kafka client, you might do something like this in a config.properties
file:
Then you could import those properties whenever you’re creating a consumer.
Now, if you’re using Java to create a Kafka Streams app, it gets a little trickier. That’s because the Group ID configuration on the consumer is not called group.id
or something similar that you might expect. Instead, the APPLICATION_ID_CONFIG
is what Kafka Streams ends up using as the consumer Group ID.
I’m hoping this blog post gave you something to think about when it comes to consumer group strategy, whether you’re new to configuring consumer groups or designing a large piece of architecture. If you’d like to learn more, here are some resources I recommend:
Kafka Consumer Groups Tool: Documentation on using Kafka’s consumer groups tool to check metadata on groups
Configuring Kafka Consumer Group IDs: Learn more about consumer Group IDs, specifically
The Kafka Consumer Group Protocol is about to get simpler—read KIP-848 for the details
Learn what a Kafka consumer group ID is and how assigning one to Kafka consumers during configuration helps with detecting new data, work sharing, and data recovery.
Dive into the inner workings of brokers as they serve data up to a consumer.