
Getting Started with Apache Kafka and Kubernetes


Enabling everyone to run Apache Kafka® on Kubernetes is an important part of our mission to put a streaming platform at the heart of every company. This is why we look forward to releasing an implementation of the Kubernetes Operator API for automated provisioning, management, and operations of Kafka on Kubernetes. (Read about that here and let us know if you’d like to take it for a spin when it’s available!)

Today, we are taking a step in this direction by sharing two resources to help you run Confluent Platform on Kubernetes:

  1. Helm Charts that provide a starting point for deployment of Confluent Platform components
  2. A white paper that shares the best practices we’ve learned by running Confluent Platform on Kubernetes

Getting started with deployment templates

Now available on GitHub in developer preview are open source Helm Chart deployment templates for Confluent Platform components. These templates enable developers to quickly provision Apache Kafka, Apache ZooKeeper, Confluent Schema Registry, Confluent REST Proxy, and Kafka Connect on Kubernetes, using official Confluent Platform Docker images.

Helm is an open-source packaging tool that helps you install applications and services on Kubernetes. Helm uses a packaging format called charts. A chart is a collection of YAML templates that describe a related set of Kubernetes resources.

For stateful components like Kafka and ZooKeeper, the Helm Charts use StatefulSets, which give each pod a stable identity in the form of an ordinal index, together with Persistent Volumes that are always mounted for the pod. For stateless components like REST Proxy, the Helm Charts use Deployments instead, because those pods are interchangeable and do not need a stable identity. Each component's chart uses Services to provide access to its pods.
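
To make the StatefulSet pattern concrete, here is a minimal sketch of the kind of resources involved. It is not taken from the Confluent charts: the names `kafka`, `kafka-headless`, and `datadir`, the replica count, the mount path, and the storage size are all assumptions chosen for illustration, and the broker configuration a real deployment needs is left out.

```yaml
# Illustrative sketch only; every name and size below is hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  clusterIP: None            # headless Service: each pod gets its own DNS record
  selector:
    app: kafka
  ports:
    - name: broker
      port: 9092
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless  # ties pod DNS names to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka-broker
          image: confluentinc/cp-kafka   # broker settings omitted in this sketch
          ports:
            - containerPort: 9092
          volumeMounts:
            - name: datadir
              mountPath: /opt/kafka/data
  volumeClaimTemplates:          # one PersistentVolumeClaim is created per pod
    - metadata:
        name: datadir
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

With a manifest like this, Kubernetes names the pods `kafka-0`, `kafka-1`, and `kafka-2`, and creates one PersistentVolumeClaim per pod from the claim template (`datadir-kafka-0` and so on), so a rescheduled pod comes back with the same name and the same volume.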

It’s really easy to get started. Just install Helm on your local or deployed Kubernetes cluster, and use the parent chart to configure and deploy Confluent Platform components:

helm install cp-helm-charts


Helm allows you to customize configurations through YAML values files. For example, to deploy five Kafka brokers and set a custom `min.insync.replicas`, create a copy of the default values file, update the values, and pass it to the install command:

helm install -f custom-values.yaml cp-helm-charts

custom-values.yaml

## Number of Kafka brokers
brokers: 5

## Kafka Server properties
configurationOverrides:
  "log.dirs": /opt/kafka/data/logs
  "offsets.topic.replication.factor": 3
  "min.insync.replicas": 3

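As a rough illustration of how a chart could consume a values file like this one, the excerpt below sketches a template that turns `brokers` into a replica count and each `configurationOverrides` entry into a `KAFKA_`-prefixed environment variable, the convention the Confluent Platform Docker images use for broker properties. The file path and container name are hypothetical, and the excerpt is not copied from the Confluent charts.

```yaml
# templates/statefulset.yaml (hypothetical excerpt, for illustration only)
spec:
  replicas: {{ .Values.brokers }}
  template:
    spec:
      containers:
        - name: kafka-broker
          env:
            # Emit one environment variable per override,
            # e.g. "log.dirs" becomes KAFKA_LOG_DIRS.
            {{- range $key, $value := .Values.configurationOverrides }}
            - name: {{ printf "KAFKA_%s" $key | replace "." "_" | upper | quote }}
              value: {{ $value | quote }}
            {{- end }}
```

Rendered against the values above, `"log.dirs"` would become the environment variable `KAFKA_LOG_DIRS` with the value `/opt/kafka/data/logs`.
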
Special thanks to Qi Shao, Nikhil Chandrappa, Li Wang, Amey Banarse and Prasad Radhakrishnan from Pivotal’s (NYSE: PVTL) Platform Architecture team for their contributions in developing and testing these Helm Charts.

Sharing best practices

What’s both exciting and challenging about working with the vibrant Kubernetes ecosystem is that everything progresses so fast. To help you understand and plan for running Confluent Platform on Kubernetes, we’ve provided a white paper that shares the best practices we’ve learned over time.

The Recommendations for Deploying Apache Kafka on Kubernetes white paper helps you see how Confluent Platform components fit into the Kubernetes ecosystem. It also covers our approach to networking, storage, traffic, log aggregation, metrics and more.

We will update this white paper as the Kubernetes ecosystem continues to evolve.

Conclusion

Our end goal is to make streaming data ubiquitous. Kubernetes lets you run your applications and services anywhere. Kafka enables you to make your data accessible instantaneously, anywhere.

Get started with Kafka on Kubernetes today by checking out the white paper and Helm Charts on our website.

With Confluent Operator, we are productizing years of Kafka experience, together with Kubernetes expertise, to offer you the best way of using Apache Kafka on Kubernetes. Let us know if you are interested and we’ll notify you when it’s available!

  • Rohit Bakhshi is the Director of Product for Confluent's hybrid product lines. He’s spent the last 12 years building data platform software businesses, leading product for Kafka, Spark-based ETL, GraphQL, and Hadoop.
