Published on 03/10/2019
Last updated on 03/21/2024
Kafka on Kubernetes, the easy way
One of the key features of the Pipeline platform is its ability to automatically provision, manage, and operate different application frameworks through what we call spotguides. Among the many spotguides we support on Kubernetes (Spark, Zeppelin, NodeJS, Golang, and even custom frameworks, to name a few), Apache Kafka is one of the most popular.
We are heavily invested in making it as easy and straightforward as possible to operate Apache Kafka automatically on Kubernetes, and we believe that our current Apache Kafka Spotguide does just that. We're not stopping there, and we highly encourage you to read the roadmap section at the end of this blog.
Some of our older posts about Apache Kafka on Kubernetes:
- Kafka on Kubernetes - using etcd
- Monitoring Apache Kafka with Prometheus
- Kafka on Kubernetes with Local Persistent Volumes
- Kafka on Kubernetes the easy way
Kafka on Kubernetes - the way it should be
There are a few solutions out there for people who want to use Kafka on Kubernetes, but I'd argue that none of them provide an end-to-end method of creating, operating, and deploying Kafka to Kubernetes without specialized skillsets. Most of these solutions (if not all of them) require, as prerequisites, a preexisting Kubernetes cluster, a Helm or K8s deployment, knowledge of yaml, logging and monitoring systems that are pre-deployed and pre-configured, possibly a CI/CD system, a marriage of Kafka and K8s security measures, and, ultimately, Kafka experience. These skillsets don't usually overlap in a single team, so our aim was to automate them and to fast track the Kafka on Kubernetes experience by:
- Automating the creation of Kubernetes clusters on six cloud providers as well as on-premise
- Deploying Kafka's components and creating brokers and a Zookeeper cluster
- Pre-configuring Prometheus to monitor all Kafka components, with useful default Grafana dashboards
- Centralizing log collection (to object storage, Elasticsearch, etc.) using the fluentd/fluent-bit ecosystem
- Externalizing access to Kafka using a dynamically (re)configured Envoy proxy
- Reproducing environments using the built-in Pipeline CI/CD subsystem and storing state in Git
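All of these pieces land on the cluster as ordinary Kubernetes resources, so a quick way to sanity-check a fresh deployment is plain kubectl. Here's a minimal sketch, assuming the spotguide deploys into a kafka namespace (the namespace and pod names below are illustrative and depend on your configuration):

```
# List the broker and Zookeeper pods created by the spotguide
# (the "kafka" namespace is an assumption for this example)
kubectl get pods -n kafka

# Check the services, including the Envoy proxy that exposes
# the brokers outside the cluster
kubectl get svc -n kafka

# Tail a broker's logs to confirm it registered with Zookeeper
# ("kafka-0" follows the usual statefulset pod naming convention)
kubectl logs kafka-0 -n kafka
```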
Kafka in action
Once you're logged in to the Pipeline platform, you can proceed directly to the Spotguides section (also, please check our documentation for details on how to add your cloud credentials). The first screen of the wizard/questionnaire asks for general information and for a handful of broker properties, which we apply for you automatically. Broker properties can contain anything covered in the original Kafka broker documentation. You can modify this snippet however you want, but it's recommended that you take note of the following properties, which we highly recommend you keep as-is (an illustrative snippet follows the list below):
- zookeeper.connect: keep this or add your own pre-existing Zookeeper cluster endpoint
- broker.id: this is populated by the Spotguide based on the statefulset's generated pod number
- advertised.listeners: this is populated by the Spotguide and cannot be changed in this release (we'll add this option later)
- listeners: this is also populated by the Spotguide and cannot be changed in this release (option to be added later)
- log.dirs: this is likewise populated by the Spotguide and cannot be changed in this release (option to be added later)
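For reference, a generated broker configuration might look roughly like the sketch below. The concrete endpoints, ports, and paths are assumptions for illustration; the Spotguide fills in the real values for your cluster:

```
# zookeeper.connect: keep the generated value, or point at your own ZK cluster
zookeeper.connect=zookeeper-client:2181

# broker.id: derived from the statefulset's pod ordinal (kafka-0 -> 0, kafka-1 -> 1, ...)
broker.id=0

# listeners / advertised.listeners: populated by the Spotguide; the Envoy proxy
# advertises the external address on behalf of each broker
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://kafka-0.kafka-headless:9092

# log.dirs: points at the persistent volume mounted into the pod
log.dirs=/kafka-logs
```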
Once your Kafka cluster is up and running, you can test it with kafkacat.
- To produce messages:
kafkacat -P -b <bootstrap-server:port> -t test_topic
- To consume those messages:
kafkacat -C -b <bootstrap-server:port> -t test_topic
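A couple of practical notes, as a sketch (the bootstrap address is a placeholder, and the pod name, namespace, and Zookeeper endpoint below are assumptions for this example): kafkacat reads messages from stdin in producer mode, and you can verify connectivity by asking the brokers for their metadata before producing anything. If topic auto-creation is disabled, you can create the topic first with the kafka-topics.sh tool shipped with Kafka:

```
# Verify connectivity by listing brokers, topics, and partitions
kafkacat -L -b <bootstrap-server:port>

# Produce a single message from stdin
echo "hello from kafkacat" | kafkacat -P -b <bootstrap-server:port> -t test_topic

# If auto-creation is disabled, create the topic from inside a broker pod
# (pod name, namespace, and ZK endpoint are illustrative)
kubectl exec -n kafka kafka-0 -- kafka-topics.sh --create \
  --zookeeper zookeeper-client:2181 \
  --topic test_topic --partitions 3 --replication-factor 1
```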
Roadmap
So what's next? What we've just demonstrated is already faster and more convenient than most other options, but it's still far from perfect. It's not perfect because, when the proverbial shit hits the fan, it requires manual intervention, and it lacks a few features that some Kafka developers may require. It also involves some constraints that, when running on Kubernetes, we believe should be handled differently. The following are all works in progress that will soon be open sourced as part of our Kubernetes operator for Apache Kafka:
- Ability to enable or disable a Schema Registry when creating a Spotguide
- Support for multiple open-source Kafka connectors
- Fine Grained Broker Config support
- Fine Grained Broker Volume support
- Fine Grained upscale and downscale support (this will require a radically different approach on Kubernetes than existing Kafka solutions take, including the current state of our operator)
- Intelligent failure management including nodes, brokers and disks
- Istio backed external access using the Banzai Cloud Istio operator
- Open-source Kafka UI support, like this
About Banzai Cloud Pipeline
Banzai Cloud’s Pipeline provides a platform for enterprises to develop, deploy, and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures — multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, and so on — are default features of the Pipeline platform.