Build Predictive Machine Learning with Flink | Workshop on Dec 18 | Register Now
On behalf of the Apache Kafka® community, it is my pleasure to announce the release of Apache Kafka 2.5.0. The community has created another exciting release.
We are making progress on KIP-500 and have added new metrics and security features, among other improvements. This blog post goes into detail on some of the added functionality, but to get a full list of what’s new in this release, please see the release notes.
In Apache Kafka 2.5, some preparatory work has been done towards the removal of Apache ZooKeeper™ (ZK).
This KIP simplifies the API for applications that read from and write to Kafka transactionally. Previously, this use case typically required separate producer instances for each input partition, but now there is no special requirement. This makes it much easier to build EOS applications that consume large numbers of partitions. This is foundational for a similar improvement in Kafka Streams in the next release.
See KIP-447 for more details.
This KIP addresses a problem with producer state retention on the broker, which is what makes the idempotence guarantee possible. Previously, when the log was truncated to enforce retention or truncated from a call to delete records, the broker dropped producer state, which led to UnknownProducerId errors. With this improvement, the broker instead retains producer state until expiration. This KIP also gives the producer a powerful way to recover from unexpected errors.
See KIP-360 for more details.
Apache Kafka 2.5 now ships ZooKeeper 3.5.7. One feature of note is the newly added ZooKeeper TLS support in ZooKeeper 3.5. When deploying a secure Kafka cluster, it’s critical to use TLS to encrypt communication in transit. Apache Kafka 2.4 already ships with ZooKeeper 3.5, which adds TLS support between the broker and ZooKeeper. However, configuration information has to be passed via system properties as -D command line options on the Java invocation of the broker or CLI tool (e.g., zookeeper-security-migration), which is not secure. KIP-515 introduces the necessary changes to enable the use of secure configuration values for using TLS with ZooKeeper.
ZooKeeper 3.5.7 supports both mutual TLS authentication via its ssl.clientAuth=required configuration value and TLS encryption without client certificate authentication via ssl.clientAuth=none.
See KIP-515 for more details.
Previously, operators of Apache Kafka could only identify incoming clients using the clientId field set on the consumer and producer. As this field is typically used to identify different applications, it leaves a gap in operational insight regarding client software libraries and versions. KIP-511 introduces two new fields to the ApiVersionsRequest RPC: ClientSoftwareName and ClientSoftwareVersion.
These fields are captured by the broker and reported through a new set of metrics. The metric MBean pattern is:
kafka.server:clientSoftwareName=(client-software-name),clientSoftwareVersion=(client-software-version),listener=(listener),networkProcessor=(processor-index),type=(type)
For example, the Apache Kafka 2.4 Java client produces the following MBean on the broker:
kafka.server:clientSoftwareName=apache-kafka-java,clientSoftwareVersion=2.4.0,listener=PLAINTEXT,networkProcessor=1,type=socket-server-metrics
See KIP-511 for more details.
This KIP identifies and improves several parts of our protocol, which were not fully self-describing. Some of our APIs have generic bytes fields, which have implicit encoding. Additional context is needed to properly decode these fields. This KIP addresses this problem by adding the necessary context to the API so L7 proxies can fully decode our protocols.
See KIP-559 for more details.
Kafka consumers can choose the maximum number of bytes to fetch by setting the client-side configuration fetch.max.bytes. Too high of a value may degrade performance on the broker for other consumers. If the value is extremely high, the client request may time out. KIP-541 centralizes this configuration with a broker setting that puts an upper limit on the maximum number of bytes that the client can choose to fetch.
See KIP-541 for more details.
During runtime, it’s not easy to know the topics a sink connector reads records from when a regex is used for topic selection. It’s also not possible to know which topics a source connector writes to. KIP-558 enables developers, operators, and applications to easily identify topics used by source and sink connectors.
$ curl -s 'http://localhost:8083/connector/a-source-connector/topics' {"a-source-connector":{"topics":["foo","bar","baz"]}}
The topic tracking is enabled by default but can also be disabled with topic.tracking.enable=false.
See KIP-558 for more details.
In the past, aggregating multiple streams into one could be complicated and error prone. It generally requires you to group and aggregate all of the streams into tables, then make multiple outer join calls. The new co-group operator cleans up the syntax of your programs, reduces the number of state store invocations, and overall increases performance.
KTable<K, CG> cogrouped = grouped1 .cogroup(aggregator1) .cogroup(grouped2, aggregator2) .cogroup(grouped3, aggregator3) .aggregate(initializer1, materialized1);
See KIP-150 for more details.
A powerful way to interpret a stream of events is as a changelog and to materialize a table over it. KIP-523 as a toTable() function can be applied to a stream and materializes the latest value per key. It’s important to note that any null values will be interpreted as deletes for a given key (tombstones).
See KIP-523 for more details.
Previously, interactive queries (IQs) against state stores would fail during the time period when there is a rebalance in progress. This degraded the uptime of applications that depend on the ability to query Kafka Streams’ tables of state. KIP-535 gives applications the ability to query any replica of a state store and observe how far each replica is lagging behind the primary.
See KIP-535 and this blog post for more details.
We have dropped support for Scala 2.11 in Apache Kafka 2.5. Scala 2.12 and 2.13 are now the only supported versions.
TLS 1.2 is now the default SSL protocol. TLS 1.0 and 1.1 are still supported.
To learn more about what’s new in Apache Kafka 2.5 and to see all the KIPs included in this release, be sure to check out the release notes and highlights in the video below.
This post was originally published by David Arthur on The Apache Software Foundation blog.
Dive into the inner workings of brokers as they serve data up to a consumer.
We are proud to announce the release of Apache Kafka 3.9.0. This is a major release, the final one in the 3.x line. This will also be the final major release to feature the deprecated Apache ZooKeeper® mode. Starting in 4.0 and later, Kafka will always run without ZooKeeper.