Kora Engine, Data Quality Rules and more within our Q2'23 Launch | Register for demo
Is Windows your favorite development environment? Do you want to run Apache Kafka® on Windows? Thanks to the Windows Subsystem for Linux 2 (WSL 2), now you can, and with fewer tears than in the past. Windows still isn’t the recommended platform for running Kafka with production workloads, but for trying out Kafka, it works just fine. Let’s take a look at how it’s done.
The Windows Subsystem for Linux 2 makes it all possible. Microsoft describes WSL 2 as “a GNU/Linux environment—including most command line tools, utilities, and applications—directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual boot setup.”
Make sure you’re running Windows 10, version 1903.18362 or higher. Click Start and navigate to Settings > System > About. In the “Windows specifications” section, find the “OS build.”
If you’re on the Windows Update train, you probably have the latest version and are good to go. If not, you need to update Windows 10.
When you’re sure that Windows is up to date, follow these instructions below to install WSL 2.
Turn on the Windows Subsystem for Linux feature before installing a Linux distribution. Open PowerShell as an administrator, and run the following command:
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
This may take a minute or two. Your output should resemble the following:
Deployment Image Servicing and Management tool
Image Version: 10.0.18363.1139
The operation completed successfully.
WSL 2 requires the Virtual Machine Platform feature. In PowerShell, run the following command:
dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart
Download the Linux kernel update package, which is a regular Windows Installer (.msi) file.
Double-click the .msi file to install the WSL 2 update. If you’re prompted for elevated permissions, select Yes to approve the installation.
In PowerShell, run the following command to set WSL 2 as the default version for your Linux distributions:
wsl --set-default-version 2
Your output should resemble the following:
For information on key differences with WSL 2 please visit https://aka.ms/wsl2
WSL 2 is ready to use. For more information on installing WSL 2, including troubleshooting, see Windows Subsystem for Linux Installation Guide for Windows 10.
Install Linux from the Microsoft Store, the same way you install other applications on Windows.
Open the Microsoft Store app and search for “Linux.”
This blog post uses Ubuntu 20.04. Select Ubuntu 20.04 LTS and click Install.
When the installation is complete, click Launch. The shell opens and displays the following message:
Installing, this may take a few minutes... Please create a default UNIX user account. The username does not need to match your Windows username. For more information visit: https://aka.ms/wslusers Enter new UNIX username:
Enter a username and password to complete the installation.
Run the package manager to get the latest updates. In the Ubuntu shell window that opened above, run the following commands:
sudo apt-get update && sudo apt-get upgrade -y
|ℹ️||Tip: Right-click pastes text into the terminal window.|
Kafka requires the Java runtime to be version 8 or higher. Check the Java version in your Linux installation:
Your output should resemble this:
openjdk version "1.8.0_265" OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~20.04-b01) OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
If Java isn’t installed (likely) or it’s not the right version, install it by using your distribution’s package manager. There are a lot of ways to install Java. On Ubuntu, this is one of the simplest:
sudo apt install openjdk-8-jdk -y
You can install Kafka by using a package manager, or you can download the tarball and extract it to your local machine directly.
Select a mirror from the Kafka download site and download the tarball. The following command downloads Apache Kafka version 2.6:
Run the following commands to untar the Kafka archive, and
cd to the
tar -xzf kafka_2.13-2.6.0.tgz cd kafka_2.13-2.6.0
ls -al command to list the contents of the
total 64 drwxr-xr-x 7 jim jim 4096 Oct 14 12:27 ./ drwxr-xr-x 25 jim jim 4096 Nov 20 12:52 ../ -rw-r--r-- 1 jim jim 29975 Jul 28 11:16 LICENSE -rw-r--r-- 1 jim jim 337 Jul 28 11:16 NOTICE drwxr-xr-x 3 jim jim 4096 Jul 28 11:23 bin/ drwxr-xr-x 2 jim jim 4096 Jul 28 11:23 config/ drwxr-xr-x 2 jim jim 4096 Oct 14 12:26 libs/ drwxr-xr-x 2 jim jim 4096 Oct 14 12:28 logs/ drwxr-xr-x 2 jim jim 4096 Jul 28 11:23 site-docs/
Run the following command to start ZooKeeper:
There will be a lot of output, and ZooKeeper will be ready in a short time, typically around a second or two.
Open another terminal session. Change the directory to the
kafka directory, and start the Kafka broker:
cd kafka_2.13-2.6.0 bin/kafka-server-start.sh config/server.properties
If you arrange the windows to be side by side, your output should resemble the following screenshot:
Open another terminal session and run the
kafka-topics command to create a Kafka topic named
cd kafka_2.13-2.6.0 bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092
Your output should resemble this:
first event second event third event
Arrange the producer and consumer terminal windows to be side by side. In the producer terminal, type a few more messages, and watch as they appear in the consumer terminal.
When you’re done experimenting with Kafka, follow these steps to exit the Kafka environment:
rm -rf /tmp/kafka-logs /tmp/zookeeper
There are lots of Kafka-on-Windows tutorials, but most make the mistake of running Kafka directly on the JVM on Windows. Superficially, this appears to work, but there are limitations: Kafka uses specific features of POSIX to achieve high performance, so emulations—which happen on WSL 1—are insufficient. For example, the broker will crash when it rolls a segment file. Always run Kafka on Windows in a Linux environment backed by WSL 2.
Another approach that works well is to run Kafka in Docker containers. Docker Desktop for Windows has been updated to use the WSL 2 back end, so Docker works exactly as it does on native Linux, without needing to spin up an entire VM. If you want to give this approach a go, try it out using the Confluent Platform demo.
Although Kafka provides an event streaming platform to build your applications on, you’ll want to take advantage of the broader ecosystem of components—like ksqlDB, Confluent Schema Registry, and Confluent Control Center—all provided as part of Confluent Platform.
Now that you have Kafka installed, you’ll want to learn more about it, try out the numerous tutorials, and join in the community! Don’t forget that Apache Kafka has many APIs—including the producer and consumer but also Kafka Streams and Kafka Connect.
You may recall a time when Linux was anathema to Microsoft. Back in 2001, Microsoft CEO Steve Ballmer famously called Linux a “malignant cancer,” but he has since come around to “loving” it. Microsoft’s current CEO Satya Nadella seems intent on making it a first-class citizen in the Microsoft ecosystem, which means that a new era has arrived for software developers on the Windows platform.
When the Windows Subsystem for Linux (WSL 1) was released in 2016, it became possible to run a real Linux dev environment in a Linux shell, while retaining the familiar Windows UX around the shell. Even File Explorer was integrated nicely with the Linux file system.
The big drawbacks are that WSL 1 emulates a Linux kernel, and it runs in a full VM. The first means processes that require a native kernel, like Docker, can’t run. The second means that WSL 1 consumes a lot of resources. WSL 1 was not sufficient to run Kafka reliably.
But Microsoft delivered WSL 2 in 2019, and it’s a whole new world. They fixed the two biggest limitations, so WSL 2 runs a real Linux kernel, and the kernel runs on a subset of Hyper-V features, not in a full VM. For details, see Comparing WSL 1 and WSL 2. Now the path is clear for devs to build Kafka and ksqlDB apps on Windows.
Companies are looking to optimize cloud and tech spend, and being incredibly thoughtful about which priorities get assigned precious engineering and operations resources. “Build vs. Buy” is being taken seriously again. And if we’re honest, this probably makes sense. There is a lot to optimize.
Operating Kafka at scale can consume your cloud spend and engineering time. And operating everyday tasks like scaling or deploying new clusters can be complex and require dedicated engineers. This post focuses on how Confluent Cloud is 1) Resource Efficient, 2) Fully Managed, and 3) Complete.