Get Apache Kafka and Flink news delivered to your inbox biweekly or read the latest editions on Confluent Developer!
Confluent has published official Docker containers for many years. They are the basis for deploying a cluster in Kubernetes using Confluent for Kubernetes (CFK), and one of the underpinning technologies behind Confluent Cloud.
For testing, containers are convenient for quickly spinning up a local cluster with all the components required, such as Confluent Schema Registry or Confluent Control Center. You configure each container via a set of environment variables, using the internal Docker network and hostname resolution to reference the various components such as the Apache Kafka® broker. This can be done using docker-compose, which will create a network for your cluster and allow you to keep all configurations in a single document.
This is easily done if you have a single ZooKeeper or KRaft controller and a single Kafka broker in your cluster.
The problem with a Docker Compose file begins when you try to configure a larger cluster, for example, for failover testing. If you want to convince yourself (or demonstrate) that the Kafka cluster gracefully handles the loss of a broker, you need to configure multiple Kafka brokers in your docker-compose.yml file.
You need to be careful not to reuse the same hostname or external port number, and when referencing, for instance, your bootstrap servers, you need to be sure to use the correct hostnames and ports. If you copy/paste and edit your entries, you may mistype or forget an entry. The cluster might still come up but will be degraded.
Of course, you can create a template for a three ZooKeeper/three broker cluster and reuse that, but what if you want to test four or six brokers instead?
This is the situation I found myself in a few years ago with a customer who wanted to test out various configurations. It became very tedious to reconfigure the various docker-compose files for the different scenarios. So, I wrote a Python script that generates the docker-compose file for me: kafka-docker-composer.
The tool requires a working Python 3 environment (I have tested with Python 3.8 - 3.11) with jinja2 installed, which comes preinstalled in many distributions. You can verify this by importing jinja2 from Python; if the import fails with a ModuleNotFoundError, install jinja2 with pip3.
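Assuming python3 and pip3 are on your PATH, the check and the fallback install look something like this:

```shell
python3 -c "import jinja2; print(jinja2.__version__)"   # fails with ModuleNotFoundError if jinja2 is missing
pip3 install jinja2                                     # only needed if the import above fails
```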
You also will need to have Docker and docker-compose installed, of course.
Then clone the repository and you are ready to start creating docker-compose files.
The templating engine Jinja2 allows me to create a template document with variables in it I can populate through my application.
This template simply loops through all configured services and populates the required fields in the final docker-compose.yml file for each container I want to run. Each container encapsulates a service such as a Confluent Server broker, a ZooKeeper instance, a Schema Registry, and so on.
For each entry, I define the container image to be run as well as the host and container name. Each instance also has a whole set of optional parameters that are defined in the application:
Health checks so I can define dependencies that will wait for the prerequisites to be healthy.
Dependencies, which define the order in which containers are created. If health checks are defined as a dependency, this will also be listed here.
The environment variables that are used for configuring the service within the container.
Ports that are mapped to the docker host.
Volumes that are mounted to inject additional files, such as Kafka Connect Connector plugins or metrics configurations for monitoring using Prometheus.
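To illustrate the looping idea (this is not the repository's actual template), here is a minimal sketch that uses the standard library's string.Template as a stand-in for Jinja2:

```shell
# Sketch only: render a docker-compose "services" section from a list of service names.
# string.Template stands in for Jinja2; the real template is far more elaborate.
python3 - <<'PY'
from string import Template

# One entry per container; the real tool derives this list from command-line arguments.
service = Template("""  $name:
    image: $image
    hostname: $name
    container_name: $name
""")

print("services:")
for name in ("kafka-1", "kafka-2"):
    print(service.substitute(name=name, image="confluentinc/cp-server:7.6.0"), end="")
PY
```

The real template additionally emits the optional health checks, dependencies, environment variables, ports, and volumes listed above.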
The application kafka_docker_composer.py takes a list of arguments and calculates how the template should be populated. It creates the dependencies between the different components and ensures that names and ports are unique, that the advertised listeners are correctly set up, and that dependent services like Schema Registry or Kafka Connect point to the corresponding Confluent Server brokers.
There are many different configurations, such as using ZooKeeper or KRaft, all controlled by a set of arguments or a configuration file. Here is an overview:
This is best explained through the following use cases.
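For example, generating and starting the simplest possible cluster might look like the following. The flag names here are hypothetical stand-ins; check python3 kafka_docker_composer.py -h for the actual options:

```shell
python3 kafka_docker_composer.py --zookeepers 1 --brokers 1   # hypothetical flag names
docker compose up -d
```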
We just created a docker-compose file for the simplest case: one ZooKeeper and one broker, using the latest Docker images for Confluent Platform; currently, this is 7.6.0.
Not convinced? Try it out.
You might have to use “docker-compose” instead of “docker compose” on your platform or upgrade your Docker environment to use Compose V2.
The broker ports start at 9091 for the first broker. Use kafka-topics to create and list topics:
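Assuming the Kafka command line tools are installed locally, this could look like:

```shell
kafka-topics --bootstrap-server localhost:9091 --create --topic test --partitions 1 --replication-factor 1
kafka-topics --bootstrap-server localhost:9091 --list
```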
After you are done with the cluster, shut it down again with the following:
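With Compose V2 (older installations use docker-compose):

```shell
docker compose down -v
```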
The -v option removes the volumes as well, avoiding the potential problem of reusing stale data.
ZooKeeper is deprecated; therefore, modern versions of Kafka prefer KRaft. Just change the option from zookeeper to controller:
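Again with hypothetical flag names (check the -h output for the real ones):

```shell
python3 kafka_docker_composer.py --controllers 1 --brokers 1   # hypothetical flag names
docker compose up -d
```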
Look inside the generated docker-compose.yml file to see which environment variables you must set to create a controller-broker pair successfully. Do you really want to set this by hand?
Creating single broker setups is nice, but more is needed to show the power of this tool. A minimum standard cluster consists of three controllers and three brokers:
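Such a cluster could be generated along these lines (flag names again hypothetical; consult the tool's help output):

```shell
python3 kafka_docker_composer.py --controllers 3 --brokers 3   # hypothetical flag names
docker compose up -d
```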
Note that each broker has its own externally visible port mapped to the host so that you can access each broker individually. What about the other ports? They are for the JMX agents if you want to configure Prometheus and Grafana for your cluster:
As you can see, Prometheus is exposed on port 9090 and Grafana on port 3000. Try out Grafana by pointing your browser to http://localhost:3000. The user and password are set to “admin/adminpass”, but you can adjust that in volumes/config.ini. There are separate dashboards for ZooKeeper and KRaft controllers as indicated by their names.
The exporter configuration files and dashboards are in the volumes directory, as is the exporter jar, so you do not have to download anything separately.
Remember that it takes a few minutes for JMX exporters to start up; check the Status/Targets page in Prometheus to see if your metrics scrapes succeeded.
In addition to the brokers and controllers, you can add Confluent Schema Registries, Kafka Connect worker nodes, ksqlDB nodes, and Confluent Control Center to the mix. You will probably need to increase the memory of your Docker environment, especially if you run this on your notebook with Docker Desktop.
My Docker Desktop is configured with 8 cores and 16 GB of memory, which gives me ample room to run a large cluster in Docker Compose.
The health checks are built for this purpose. If, for example, Schema Registry starts up before the broker finishes booting, it will fail since it cannot create its topic, and it will not try again. Verifying the brokers are up and running and ready to receive clients ensures that the dependent components do not fail on startup.
One specific component is the Connect cluster. This cluster comes with a bare-bones set of connector plugins installed. Still, it turned out that by mapping a volume containing unzipped connector plugin jar files before starting up the cluster, I could easily use the same setup to test various connectors cheaply. I have added a few connector plugins as examples, such as the Datagen Connector, which is useful as a producer for testing without external dependencies.
I have added an example plugin configuration here. Bring up a cluster with:
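A sketch of the invocation, with hypothetical flag names (check the tool's -h output):

```shell
python3 kafka_docker_composer.py --controllers 1 --brokers 1 --connect-instances 1   # hypothetical flags
docker compose up -d
```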
This will list all installed connector plugins. You might have to install jq, a handy JSON formatting and filtering tool.
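The listing goes through the Kafka Connect REST API; assuming the generated file maps the first worker to port 8083:

```shell
curl -s http://localhost:8083/connector-plugins | jq '.[].class'
```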
Create a target topic, then install the Datagen connector:
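The topic name and connector configuration below are illustrative; the REST call itself (a PUT to /connectors/&lt;name&gt;/config) is the standard Connect API:

```shell
kafka-topics --bootstrap-server localhost:9091 --create --topic datagen-orders --partitions 1 --replication-factor 1

curl -s -X PUT http://localhost:8083/connectors/datagen-orders/config \
  -H "Content-Type: application/json" \
  -d '{
        "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
        "kafka.topic":     "datagen-orders",
        "quickstart":      "orders",
        "max.interval":    "1000",
        "tasks.max":       "1"
      }'
```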
Other connector plugins can be installed by downloading the zip file from the connector hub, for example, the Elasticsearch Sink connector. Unzip the file into the volumes/connect-plugin-jars directory and restart your connect clusters:
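For example (the download location and service name are assumptions; adjust them to your setup):

```shell
unzip ~/Downloads/confluentinc-kafka-connect-elasticsearch-*.zip -d volumes/connect-plugin-jars/
docker compose restart connect-1   # repeat for every Connect worker in your generated file
```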
The Kafka Connect clusters take a while to start up, so you need to be a little patient. You can use the following to monitor progress:
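Either tail the worker logs or poll the REST endpoint until it answers (service name and port are assumptions):

```shell
docker compose logs -f connect-1
# or wait until the REST API responds:
until curl -s http://localhost:8083/connectors >/dev/null; do sleep 5; done
```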
The original purpose of the kafka-docker-composer was to test failover scenarios in multiple data centers. For this purpose, there are two additional options: racks and ZooKeeper groups.
Set the number of racks to the number of unique racks or data centers you want to test with. A typical example is three racks, a common setup for production clusters. The configured brokers are then assigned round-robin to the racks. This is particularly useful if you configure more brokers than racks, so you can test the distribution of partitions across the different racks.
Have a look at the generated docker-compose.yml file. You will notice that both the controllers and the brokers have a new environment variable, for example:
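With the Confluent images, the brokers' rack ends up in the KAFKA_BROKER_RACK environment variable (which maps to the broker.rack setting), so you can check the assignment with:

```shell
grep -n BROKER_RACK docker-compose.yml
```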
Create a new topic called products, but add the option --partitions 6, then run:
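Assuming six brokers with the first one mapped to port 9091, the describe output shows the replica placement per partition:

```shell
kafka-topics --bootstrap-server localhost:9091 --create --topic products --partitions 6 --replication-factor 3
kafka-topics --bootstrap-server localhost:9091 --describe --topic products
```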
If you go through every single partition and compare their placement on the individual replicas, you will notice that each replica is in a different rack. Even in the case of a loss of one rack, each partition will still have two replicas online. Since min.insync.replicas is set to 2, producers and consumers will still be able to work.
You can then test what happens if a data center goes down by using standard docker-compose methods to kill the containers. Attach a producer and consumer to the cluster to convince yourself that the cluster is still accessible and capable of processing requests. This is how I demonstrate to colleagues, partners, and customers the resilience features of Apache Kafka.
In our example, we have the following distribution:
This is because the tool assigned the rack ID round-robin.
We need to produce and consume some data. Start with the consumer readily waiting for some messages:
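For example, with the first three broker ports mapped to 9091-9093:

```shell
kafka-console-consumer --bootstrap-server localhost:9091,localhost:9092,localhost:9093 \
  --topic products --from-beginning
```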
Note that I am specifying three brokers across all three racks here. For this test, it is not strictly necessary because we will receive the full list of all brokers upon connection anyway, but it is good practice in case a broker or a whole rack (data center) is down while we are starting our application.
Then we start the producer:
The tool seq generates a sequence of numbers separated by a newline character. I added the formatting to avoid presenting the numbers in scientific notation. The result is a new message for each number, nicely sequenced for easy verification. Note that there is no guarantee that the consumer will return the numbers in the same order since we created the topic with six partitions.
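A producer invocation along these lines; the exact bounds are an assumption (0 through 9000000 yields the 9000001 messages counted later):

```shell
seq -f "%.0f" 0 9000000 | kafka-console-producer \
  --bootstrap-server localhost:9091,localhost:9092,localhost:9093 --topic products
```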
On my Mac, the producer will work through this sequence quite quickly, so we need to hurry with the next step. To simulate an outage of one data center, we need to kill all containers in one rack. Let’s pick rack-2:
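The container names below are assumptions based on the round-robin assignment; check your generated docker-compose.yml for the actual members of rack-2:

```shell
docker compose stop controller-2 kafka-2 kafka-5   # hypothetical container names
```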
Note that the consumer will stumble for a moment while the active controller sorts out the leadership for each partition, but it will then pick up again. When the consumer no longer finds any new messages, it will wait, and we can shut it down. It should print out the total number of messages: 9000001.
You can vary the sequence to test the loss of a rack for producers as well. You will notice a lot of error messages as the producer complains about brokers no longer being reachable, but it will sort itself out after a while. This is also a good example of the idempotent producer: no messages will be duplicated during this time, as you can verify with your consumer.
Bring the cluster back up again with a simple:
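For instance:

```shell
docker compose up -d --wait   # --wait blocks until the health checks pass
```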
Depending on how long the producer was working while the cluster was in a degraded state, this last command might take a while to finish, because the brokers have to catch up first before they respond to the health-check command.
The other option configures ZooKeeper groups, specifically for a two-data-center scenario with hierarchical groups, for which you will need a minimum of six ZooKeeper instances. The tool will calculate the distribution and set up the docker-compose file accordingly.
If you look at the generated docker-compose.yml file, you will find these lines:
These are the generated groups, with ZooKeeper instances distributed between them. The cluster will come up just fine with these settings, but when you shut one (simulated) data center down, you will see why a two-data center solution is always inferior to a three-data center setup.
The ZooKeeper instances do not automatically fail over to the degraded state of three ZooKeepers. Instead, they will refuse to accept all connections until the situation is resolved. This also means brokers will refuse to acknowledge producers, and even consumers will fail because they need to update ZooKeeper with their progress.
You can resolve the situation by starting the second data center up again, in which case the cluster will recover. If you want to continue in the degraded state, you need to manually remove the groups from the configuration files and restart the remaining ZooKeeper instances.
The required procedure goes beyond the scope of this post. In production environments, we advise administrators to keep a second configuration file handy that can be swapped in after the ZooKeeper instances have been shut down. This is not something you can or would want to fully automate, but you can script it and invoke the failover script manually should the need arise. The procedure should be well documented and tested in a non-production environment so that administrators know what to do during the panic of an outage. Note that this is not possible without downtime (RTO > 0).
Looking through the help list, you might notice a few other arguments that have not been discussed.
This is an experimental feature for upgrading KRaft controllers to full brokers. It is not officially supported for production environments, but if you want to play with it, you can use the --shared-mode argument. If nothing else, it will show you which additional environment variables you need for this setup.
When you start up this cluster, you will see that you have six active brokers rather than the original three since the controllers act as brokers as well. If you have kcat (kafkacat) installed, you can use kcat -L -b localhost:9094 to verify this.
A cluster running with KRaft needs a UUID to identify membership for controllers and brokers. I have created and hardcoded such a UUID, but if you want to change this value, use the --uuid option.
I have built the infrastructure for it but have not done much testing with it. This option enables a build for all components to create a new image that contains the tc tool that can be used to inject latency. There is a Medium article that explains a little more about this feature if you are interested, but most of the ideas for this tool come from the Confluent tutorial on multi-region clusters.
Building the Docker images with tc enabled involves multiple steps:
Adjust the .env file in the root directory to the release you want to base your test on. Apologies for the repetition: these are straight from the underlying source and have a lot of redundancies. I have just updated this to 7.6.0.
Run the build script script/build_docker_images.sh. This will download the configured base images and build new images stored in your local image cache.
Run kafka-docker-composer with the --with-tc option to create your docker-compose file. Ensure the version matches the images you have just built; use the --release option if necessary.
You now have a configuration with Docker images that have been enhanced with the tc utility.
To start this up and test it, do the following:
Bring the cluster up as before
Execute a ping within a container against another instance to verify the normal latency
Enable latency injection in each node
Run ping again to verify the effect
Perform your tests with a multi-region simulator
Here is a simple example:
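The container and interface names here are assumptions; substitute the ones from your generated file:

```shell
docker compose exec -u0 kafka-1 ping -c 3 kafka-2                             # baseline latency
docker compose exec -u0 kafka-1 tc qdisc add dev eth0 root netem delay 100ms  # inject 100ms
docker compose exec -u0 kafka-2 tc qdisc add dev eth0 root netem delay 100ms  # ...on both sides
docker compose exec -u0 kafka-1 ping -c 3 kafka-2                             # now ~200ms round trip
```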
The option -u0 gives you root access to execute commands with elevated permissions within the container.
You should now observe a difference in the ping round-trip time of around 200 ms, with 100 ms of latency injected from each side.
This is by no means a complete list of all features since I am still updating this tool to suit my requirements for testing and understanding. For example, I have exposed the JMX and HTTP ports to the host so that I can use VisualVM to dig into the metrics and explore the REST interface of Confluent Server.
Check the GitHub repository for the latest changes. You can set up a watcher if you want to receive updates via email. Why not add a star to the repository while you are there?
I have yet to add any authentication or authorization features to this tool, mostly because I have other tools to test and demonstrate security. Still, it might be a worthwhile project for SASL/PLAIN or even SASL/SCRAM. TLS certificates are a bit trickier because I’d need a new image to generate these. I want the whole project to start with a single docker compose up, not a script.
The same is true for Kerberos and LDAP for RBAC, which would require a Samba service in a separate container and some configuration. Let me know in the comments or file an issue on GitHub.
Apache Kafka 3.7 is now available and comes with an official Docker image as of KIP-975.
I have successfully tested the image separately from the kafka-docker-composer and I think it might be useful to enable the swapping out of the Confluent Server (cp-server) image to test out the new features in the next Kafka release—since these releases are published typically three to six months before the corresponding Confluent release. The main difficulty will be that open-source Apache Kafka has no built-in REST interface, making health checks more challenging.
I have been using and extending kafka-docker-composer for the last five years to demonstrate how to set up a cluster in Docker. The main purpose of this tool was to show how resilient a Confluent cluster is even during a large outage.
Lately, I have used the same tool to teach myself KRaft, experiment with it, and use the setup for connector testing and development.
I'd like to know what you will use this tool for. You can let me know by commenting on the GitHub repository.
Happy hacking!