
by

Gabor Kozma

Published on 03/11/2018

Last updated on 02/03/2025


Monitoring Apache Kafka With Prometheus


Monitoring series:
Monitoring Apache Spark with Prometheus
Monitoring multiple federated clusters with Prometheus - the secure way
Application monitoring with Prometheus and Pipeline
Building a cloud cost management system on top of Prometheus
Monitoring Spark with Prometheus, reloaded
Kafka on Kubernetes the easy way

At Banzai Cloud we provision large Kubernetes clusters across multiple cloud and hybrid environments, and we monitor them with Prometheus. These clusters, along with the applications and frameworks that run on them, are all managed by our next-generation PaaS, Pipeline. One of the most popular frameworks we deploy to Kubernetes at scale, and one that we love, is Apache Kafka. We have centralized the monitoring of multiple large Kafka clusters with federated Prometheus on Kubernetes. This post gets into the nitty-gritty of the available options and explores some example monitoring solutions.

Note that we have removed Zookeeper from Kafka and we use Etcd instead. For further details please read this post.

It’s quite common to monitor Java applications at the level of the JVM itself, since many applications do not have a built-in monitoring component. The simplest way to collect information is through JMX, where metrics about the state of the JVM (CPU, memory, and GC) are already available. Based on these metrics, we can tune settings such as memory usage and thread counts, in some cases at runtime through setters exposed by MBeans.

Using existing tools

Apache Kafka deployments on Kubernetes can expose a JMX interface for monitoring. The snippet from our deployment descriptor, which installs the Kafka Helm chart with JMX enabled, looks like this:

{
  "name": "banzaicloud-stable/kafka",
  "release_name": "demo2",
  "values": {
    "jmx": {
      "enabled": true
    }
  }
}

Once this is enabled, one suboptimal alternative is to go to the Pod and use JConsole or VisualVM for debugging and monitoring. For example, you can forward the JMX port and connect to it on localhost:

kubectl port-forward kafka-0 5555
jconsole 127.0.0.1:5555

This opens up some options for monitoring, but at the same time raises some questions (for example, about security). Also, you'll have to run a new JVM to collect the available information through JMX and secure the channel somehow.

Prometheus and the JVM

The folks at Prometheus have a nice solution to all of the above. They've written a collector that can configurably scrape and expose the MBeans of a JMX target. It runs as a Java agent, exposing an HTTP server and serving metrics of the local JVM. It can also be run as an independent HTTP server that scrapes remote JMX targets, but this has several disadvantages: it is harder to configure, and it cannot expose process metrics (e.g., memory and CPU usage). Running the exporter as a Java agent is thus strongly encouraged. We have forked this exporter and enhanced it a bit with a Dockerfile that supports both of the following modes:

Connect to an exposed JMX port of the JVM (not recommended)
Java agent version (recommended)

If you use the agent version, you'll have to modify three configuration options:

the Jar file location
the port for the HTTP(S) interface, where the metrics will be available for scraping, already in a Prometheus-friendly format
additional configuration options

An example looks like this:

-javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent-0.3.1-SNAPSHOT.jar=9020:/etc/jmx-exporter/config.yaml
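
To make the three parts of that argument explicit, here is a minimal sketch in Python that assembles the option string. The helper name `javaagent_option` is hypothetical and not part of the exporter; the JAR path, port, and config path are simply the ones from the example above.

```python
def javaagent_option(jar_path: str, port: int, config_path: str) -> str:
    """Build a -javaagent option of the form <jar>=<port>:<config>.

    Hypothetical helper for illustration only; it is not shipped
    with the JMX exporter.
    """
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"-javaagent:{jar_path}={port}:{config_path}"


option = javaagent_option(
    "/opt/jmx-exporter/jmx_prometheus_javaagent-0.3.1-SNAPSHOT.jar",
    9020,
    "/etc/jmx-exporter/config.yaml",
)
print(option)
```

The resulting string is what would be passed to the JVM, for instance through an environment variable such as KAFKA_OPTS.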

Advanced Kafka monitoring scenarios with Pipeline

If Kafka is deployed with Pipeline, all the additional configuration parameters are available in our GitHub repository. As you can see there, our Kafka Helm chart is set up to use an init container, which copies the previously mentioned JAR file to a specified mount; the Kafka container then uses that mount in read-only mode. In the Banzai Cloud Kafka Helm chart, we use a StatefulSet annotated with the values below, so that the Prometheus server knows which Pod port to use when it scrapes the jmx-exporter over its HTTP(S) interface. An overview of the deployment looks like this (figure: kafka-jmx-exporter):

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/probe: kafka
  prometheus.io/port: "9020"
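
As a rough illustration of how such annotations are typically turned into a scrape target, here is a small Python sketch. It rests on several assumptions: the `prometheus.io/*` annotations are a convention that only takes effect through relabeling rules in the Prometheus scrape configuration, the Pod IP below is made up, `prometheus.io/path` is a commonly used companion annotation that does not appear in our chart, and the `prometheus.io/probe` annotation is ignored here because its handling depends on the scrape configuration in use.

```python
from typing import Dict, Optional


def scrape_url(pod_ip: str, annotations: Dict[str, str]) -> Optional[str]:
    """Return the URL a scraper would hit, or None if the Pod opts out.

    Simplified emulation of annotation-based discovery; a real setup
    expresses this logic as relabel_configs in Prometheus.
    """
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations.get("prometheus.io/port")
    if port is None:
        return None
    # Assumed default path; not taken from the original chart.
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


kafka_annotations = {
    "prometheus.io/scrape": "true",
    "prometheus.io/probe": "kafka",
    "prometheus.io/port": "9020",
}
url = scrape_url("10.0.0.12", kafka_annotations)  # hypothetical Pod IP
print(url)
```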

In the end, each of these Pods has a KAFKA_OPTS environment variable set to:

-javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent-0.3.1-SNAPSHOT.jar=9020:/etc/jmx-exporter/config.yaml

The configuration of the JMX exporter is stored in a Kubernetes ConfigMap, so we can change it dynamically if needed, and changes are automatically reflected at the next scrape.

Please note that our configuration is a bit more complex than usual, because running a PaaS requires several advanced features. We configure blacklists and whitelists of objects, and we can generate arbitrary metric names for those objects as well. See the example below:

lowercaseOutputName: true
rules:
- pattern: kafka.cluster<type=(.+), name=(.+), topic=(.+), partition=(.+)><>Value
  name: kafka_cluster_$1_$2
  labels:
    topic: "$3"
    partition: "$4"
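
To show what this rule does to a single MBean, here is a simplified Python emulation. It is a sketch and not the exporter's own code: the function `apply_rule` is hypothetical, the sample bean string and its topic name `demo` are made up, and the exact string the exporter matches against is assumed to have the shape used below.

```python
import re

# Same expression as the rule's pattern; capture groups 1-4 correspond
# to $1..$4 in the rule.
PATTERN = re.compile(
    r"kafka.cluster<type=(.+), name=(.+), topic=(.+), partition=(.+)><>Value"
)


def apply_rule(bean: str):
    """Return (metric_name, labels) for a matching bean, else None."""
    match = PATTERN.search(bean)
    if match is None:
        return None
    bean_type, name, topic, partition = match.groups()
    # name: kafka_cluster_$1_$2, lowercased because of lowercaseOutputName
    metric = f"kafka_cluster_{bean_type}_{name}".lower()
    return metric, {"topic": topic, "partition": partition}


# Hypothetical sample input, for illustration only.
sample = (
    "kafka.cluster<type=Partition, name=UnderReplicated, "
    "topic=demo, partition=0><>Value"
)
result = apply_rule(sample)
print(result)
```

With this sample, the type and name end up in the metric name, while the topic and partition become labels, which keeps the number of distinct metric names small.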

Besides the configuration given above, we have several Grafana dashboards that give us a detailed overview of the JVM and of Kafka itself (figures: jmx-overview, kafka-metrics).

In the next post in this series, we'll discuss how to set up alerts in the event of a problem, and how to interact with the Kafka cluster.



