Published on 05/17/2022
Last updated on 04/17/2024

Building and deploying AI applications on K3s


AI is hot. Edge is hot. Kubernetes is hot. It's easy to arrive at the question: What does it take to combine those three and run an AI model on an Edge-focused Kubernetes setup? Today we're going to create an example together and find out.

But first, let's set the scene. We know that AI at the Edge is a growing trend — inference computation of AI models works at the edge of the network, near users and data. This can come with several advantages over running the AI models centrally in the cloud, such as reducing bandwidth costs since model input data doesn't need to be transferred, enabling real-time decision capabilities at the edge, and complying with data privacy regulations.

At the same time, developers focusing on the edge want to continue using technologies that have grown with the popularity of cloud computing, such as using Kubernetes for automatically deploying, scaling, and managing their containerized applications. The challenge is that running Kubernetes on edge devices is not always suitable, as they often have constrained CPU and memory resources.

Running AI applications on edge devices using Kubernetes K3s

Thankfully, K3s has emerged to fill this niche, and offers a simple and lightweight distribution of Kubernetes that is also optimized to run on ARM devices. This means that we can now consider deploying, scaling, and managing containerized AI inference applications on edge devices, and this Tech Blog aims to help you get started on this path.

We're going to cover several steps today to help you create and run a simple AI application in Python. We will then generate a Docker image for that application and upload the image to Harbor. Finally, we will create a Helm Chart so that the app can run on your local K3s cluster, and finish up with some next steps that could continue this tutorial.

To get the most out of this tutorial, you should already have:

  • A local K3s cluster up and running
  • Docker (with Buildx) and Helm installed
  • Python 3 and pip installed
  • The URL of a public RTSP stream to use as input

Creating a simple AI app

With the above requirements set up, we'll start by creating a simple AI application that connects to an RTSP stream, runs an ONNX model for classifying the frames, and writes the output to a topic of an MQTT Broker. The model that we've chosen to use for this tutorial is the GoogleNet ONNX model, one of many models trained to classify images based on the 1000 classes of ImageNet.

To get started, we first need to set up a new directory where we will create and store our tutorial files. So let's create the directory locally and name it 'minimal_ai'.

The main file of our minimal AI application will do the heavy lifting of connecting to an RTSP stream, downloading and running an ONNX model for classifying the frames, matching the inferences to a downloaded set of classes, and writing the class names to an MQTT Broker topic. It will take in three command line arguments: the URL of the RTSP stream, the host of the MQTT Broker, and the MQTT topic.

We've created the following minimal example of this application, so now create the file 'minimal_ai_app.py' in the 'minimal_ai' directory, copy the code below and save the file:

# file: minimal_ai/minimal_ai_app.py

import sys
import rtsp
import onnxruntime as ort
import numpy as np
import paho.mqtt.client as mqtt
import requests
from preprocess import preprocess

if __name__ == '__main__':

    # usage: python3 minimal_ai_app.py <url of RTSP stream> <host of MQTT Broker> <MQTT topic>

    if len(sys.argv) != 4:
        raise ValueError("This demo app expects 3 arguments but received %d" % (len(sys.argv) - 1))

    # Load in the command line arguments
    rtsp_stream, mqtt_broker, mqtt_topic = sys.argv[1], sys.argv[2], sys.argv[3]

    # Download the model
    model = requests.get('https://github.com/onnx/models/raw/main/vision/classification/inception_and_googlenet/googlenet/model/googlenet-12.onnx')
    with open("model.onnx", 'wb') as f:
        f.write(model.content)
    session = ort.InferenceSession("model.onnx")
    inname = [input.name for input in session.get_inputs()]

    # Download the class names
    labels = requests.get('https://raw.githubusercontent.com/onnx/models/main/vision/classification/synset.txt')
    with open("synset.txt", 'wb') as f:
        f.write(labels.content)
    with open("synset.txt", 'r') as f:
        labels = [l.rstrip() for l in f]

    # Connect to the MQTT Broker
    mqtt_client = mqtt.Client()
    mqtt_client.connect(mqtt_broker)

    # Connect to the RTSP Stream
    rtsp_client = rtsp.Client(rtsp_server_uri = rtsp_stream)
    while rtsp_client.isOpened():

        # read a frame from the RTSP stream
        img = rtsp_client.read()
        if img is not None:

            # preprocess the image
            img = preprocess(img)

            # run the model inference, extract most likely class
            preds = session.run(None, {inname[0]: img})
            pred = np.squeeze(preds)
            a = np.argsort(pred)[::-1]

            # print output and publish to MQTT broker
            print(labels[a[0]])
            mqtt_client.publish(mqtt_topic, labels[a[0]])


Like many AI applications, this can almost run by itself, except we need to make sure that the input frames are preprocessed in the way that the model expects. In the case of the GoogleNet ONNX model, we can use the 'preprocess' function provided in the ONNX models repository. Therefore, we create the file 'preprocess.py' in the 'minimal_ai' directory, copy the preprocess function, and import numpy at the top of the file:

# file: minimal_ai/preprocess.py
# from https://github.com/onnx/models/tree/main/vision/classification/inception_and_googlenet/googlenet#obtain-and-pre-process-image

import numpy as np

# Pre-processing function for ImageNet models using numpy
def preprocess(img):
    """
    Preprocessing required on the images for inference with mxnet gluon
    The function takes loaded image and returns processed tensor
    """
    img = np.array(img.resize((224, 224))).astype(np.float32)
    img[:, :, 0] -= 123.68
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 103.939
    img[:,:,[0,1,2]] = img[:,:,[2,1,0]]
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis=0)

    return img
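
As a quick sanity check, you can confirm that these preprocessing steps produce the NCHW tensor shape that GoogleNet expects. This is a minimal sketch, assuming only numpy is installed, that applies the same operations to a dummy 224x224 frame:

```python
import numpy as np

# Simulate a 224x224 RGB frame as preprocess() sees it after resizing.
img = np.zeros((224, 224, 3), dtype=np.float32)

# Subtract the per-channel ImageNet means (RGB order).
img[:, :, 0] -= 123.68
img[:, :, 1] -= 116.779
img[:, :, 2] -= 103.939

# Swap RGB -> BGR, move channels first, and add a batch dimension.
img = img[:, :, [2, 1, 0]]
img = img.transpose((2, 0, 1))
img = np.expand_dims(img, axis=0)

print(img.shape)  # (1, 3, 224, 224)
```

The resulting (1, 3, 224, 224) shape matches the model's input: a batch of one image, three colour channels, and 224x224 pixels.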

With the preprocessing file and function added, you can test this Python app with the following commands (replacing [your-rtsp-stream-url] with your public RTSP stream URL and [your-mqtt-topic] with a suitable topic):

For this tutorial, we have chosen to use the public MQTT broker provided by broker.hivemq.com. Therefore, please choose a unique topic name for your demo setup, since the public MQTT broker can have other users also connecting to it and publishing messages.
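
One simple way to pick a topic that is unlikely to collide with other users is to add a random suffix. This is just a sketch using Python's uuid module (the "minimal-ai-demo" prefix is an example name, not something the broker requires):

```python
import uuid

# Build a topic with a random 8-character suffix, e.g. minimal-ai-demo/3fa1b2c4
topic = f"minimal-ai-demo/{uuid.uuid4().hex[:8]}"
print(topic)
```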

# install the required packages
pip install onnxruntime rtsp numpy paho-mqtt requests

# run the demo application
python minimal_ai_app.py [your-rtsp-stream-url] broker.hivemq.com [your-mqtt-topic]

Running this command should download the model and class names, connect to the RTSP stream, and run an AI model on the frames after applying preprocessing. Based on the model predictions, the most likely image class from ImageNet should be printed in the terminal, and published to the MQTT Broker and topic.

Since we are using the public MQTT broker from HiveMQ you can view the results of your deployed model’s predictions with the online MQTT Client from HiveMQ. To do so, navigate to the MQTT Client in your browser, select 'Connect,' then 'Add New Topic Subscription.' Enter the same value for [your-mqtt-topic] that you used when running minimal_ai_app.py, then click 'Subscribe.'

Alternatively, install the Mosquitto package on your local machine, then open a terminal. By running the command below, and replacing [your-mqtt-topic] with the topic used when running minimal_ai_app.py, you will subscribe to the public broker and should see prediction messages appearing in the terminal window.

mosquitto_sub -h broker.hivemq.com -p 1883 -t [your-mqtt-topic]

Creating the Docker image

Now that our simple Python application is running locally, we can containerize it using a Dockerfile. The Dockerfile example below will define an image based on python:3.9, pip install the required packages, and define the command to run when the container starts. In this case, the command is similar to the one used above to test locally, except that the command line arguments are provided by environment variables that will be set when running the container. Create a file called 'Dockerfile' in the 'minimal_ai' directory, copy the following content into the file, and save:

# file: minimal_ai/Dockerfile

FROM python:3.9

WORKDIR /usr/src/app

COPY *.py .

RUN apt-get update && \
    apt-get install ffmpeg libsm6 libxext6 -y && \
    pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir onnxruntime rtsp numpy paho-mqtt requests

CMD python -u minimal_ai_app.py $RTSP_STREAM $MQTT_BROKER $MQTT_TOPIC

With the Dockerfile created, we can build a local version of the image and name it 'minimal-ai-app' by running the following command in the 'minimal_ai' directory:

docker build -t minimal-ai-app .

Once the image has been created, it can be run locally using Docker to check that everything was defined correctly. The command below can be used to run the minimal_ai image, replacing [your-rtsp-stream-url] with your public RTSP stream URL and [your-mqtt-topic] with a suitable topic.

Please choose a unique topic name for your demo setup, since the public MQTT broker can have other users also connecting to it and publishing messages.

docker run -d \
    -e RTSP_STREAM=[your-rtsp-stream-url] \
    -e MQTT_BROKER=broker.hivemq.com \
    -e MQTT_TOPIC=[your-mqtt-topic] \
    --name minimal-ai-container minimal-ai-app
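
Since the container runs detached, you can check that it started correctly by tailing its logs with the standard Docker CLI (the container name matches the one given in the run command above):

```shell
# follow the logs of the running container
docker logs -f minimal-ai-container

# stop and remove the container when finished
docker stop minimal-ai-container && docker rm minimal-ai-container
```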

Uploading a multi-arch image to Harbor

To run our image on our K3s cluster we need to make it available to download. For this tutorial, we are going to store the image in Harbor, an open source registry. For this purpose, we are going to use a demo instance that has been made available by Harbor to experiment and test features. 

First, go to the Test Harbor with the Demo Server page, and follow the instructions under 'Access the Demo Server' to sign up and create an account. Create a new project and make sure to tick the 'Public' box for the Access Level. Some of the commands and files in the remainder of the tutorial will refer to the created project as [your-project-name].

Once the project has been created, open a terminal and log in to Harbor with the command below, providing the credentials that you used when creating your account:

docker login demo.goharbor.io

Since we are focusing on running an application on K3s, which is optimized to also run on ARM devices, we can consider building images of our simple AI application for both Intel 64-bit and Arm 64-bit architectures. To do so, we can make use of the Docker Buildx features. The first step is to create and start a new Buildx builder with the following commands:

# create a new buildx builder for multi-arch images
docker buildx create --name demobuilder

# switch to using the new buildx builder
docker buildx use demobuilder

# inspect and start the new buildx builder
docker buildx inspect --bootstrap

The final step here is to build the multi-arch image and upload it so that it appears in your Harbor project. Using the command below, replacing [your-project-name] with the project name you chose, build and push the Intel 64-bit and Arm 64-bit images:

docker buildx build . --platform linux/amd64,linux/arm64 -t demo.goharbor.io/[your-project-name]/minimal-ai-app --push
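
To confirm that both architectures were built and pushed, you can inspect the manifest of the image now stored in the registry (again replacing [your-project-name] with your project name):

```shell
# list the platforms included in the pushed multi-arch image
docker buildx imagetools inspect demo.goharbor.io/[your-project-name]/minimal-ai-app
```

The output should list manifests for both linux/amd64 and linux/arm64.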

Creating the Helm Chart

The final things that we need to build are the elements of the Helm Chart for deploying the container image to our K3s cluster. Helm Charts help us describe Kubernetes applications and their components: rather than creating YAML files for every application, you can provide a Helm Chart and use Helm to deploy the application for you. We will create a very basic Helm Chart that will contain a template for the Kubernetes resource that will form our application, and a values file to populate the template placeholder values.

The first step is to create a directory called 'chart' inside the 'minimal_ai' directory; this is where we will create our Helm Chart. A Chart.yaml file is required for any Helm Chart and contains high-level information about the application; you can find out more in the Helm Documentation. Inside the 'chart' directory create a file called 'Chart.yaml', copy the following content and save:

# file: minimal_ai/chart/Chart.yaml

name: minimal-ai-app
description: A Helm Chart for a minimal AI application running on K3s
version: 0.0.1
apiVersion: v1

The next step is to create a directory called 'templates' inside the 'chart' directory; this is where we will create the template file for our application. When using Helm to install a chart to Kubernetes, the template rendering engine will be used to populate the files in the templates directory with the desired values for the deployment. Create the file 'minimal-ai-app-deployment.yaml' inside the 'templates' directory, copy the following content into the file, and save:

# file: minimal_ai/chart/templates/minimal-ai-app-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.name }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      run: {{ .Values.name }}
  template:
    metadata:
      labels:
        run: {{ .Values.name }}
    spec:
      containers:
        - env:
            - name: RTSP_STREAM
              value: {{ .Values.args.rtsp_stream }}
            - name: MQTT_BROKER
              value: {{ .Values.args.mqtt_broker }}
            - name: MQTT_TOPIC
              value: {{ .Values.args.mqtt_topic }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          name: {{ .Values.name }}
      restartPolicy: Always

The parts of the deployment file above that are enclosed in {{ and }} blocks, such as {{ .Values.name }}, are called template directives. The template directives will be populated by the template rendering engine, and in this case they look for information in the values.yaml file, which contains the default values for a chart.

Therefore, the final component that we have to create is the 'values.yaml' file, which you should create in the 'chart' directory. Inside the values.yaml file we need to define the default values for the template directives in the deployment file. Replacing [your-project-name] with the project name you used in Harbor, [your-rtsp-stream-url] with your public RTSP stream URL, and [your-mqtt-topic] with a suitable topic, copy the following content into the file, and save.

Please choose a unique topic name for your demo setup, since the public MQTT broker can have other users also connecting to it and publishing messages.

# file: minimal_ai/chart/values.yaml

replicaCount: 1

name: "minimal-ai-app"

image:
    repository: demo.goharbor.io/[your-project-name]/minimal-ai-app
    tag: latest

args:
    rtsp_stream: [your-rtsp-stream-url]
    mqtt_broker: broker.hivemq.com
    mqtt_topic: [your-mqtt-topic]
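
Before installing, it can be useful to render the chart locally and check that the template directives are populated with the values you defined:

```shell
# render the chart templates locally without installing anything
helm template minimal-ai chart
```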

With the Helm Chart now complete, we can use Helm to install the chart to our local K3s cluster and deploy the application. The following command will install the minimal_ai application on your cluster:

helm install minimal-ai chart

If everything has been configured and set up correctly, the chart will be installed by Helm. This will create the deployment needed to run our simple AI application in a Kubernetes pod. Connecting to the logs of the running pod should show the same inferences we saw earlier in the tutorial being printed out, and connecting to the MQTT topic as we did before should show the same output. 
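
To check on the deployment, kubectl can be used to confirm that the pod is running and to follow its logs (the deployment name comes from .Values.name in the chart):

```shell
# check that the pod is running
kubectl get pods -l run=minimal-ai-app

# follow the logs of the deployed application
kubectl logs -f deployment/minimal-ai-app
```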

When we tested this simple AI application on a Raspberry Pi 4 8GB as part of an Edge K3s cluster, connecting it to a 1080x720 RTSP stream at 30.00 FPS, we were able to see the inferences being published to the public MQTT Broker at around 4 FPS.

Building AI applications: Next steps

AI business solutions extend far beyond the use cases above, and from here you might see the many possibilities for building AI applications on K3s. I hope you've enjoyed building and deploying an AI application on K3s together. The steps here should help you start creating and running a simple AI application in Python, generating a Docker image and uploading it to Harbor, and creating a Helm Chart to run the app on a local K3s cluster.

To help make this guide as streamlined as possible we took a few shortcuts, and so there are several next steps you might consider taking to continue building off this tutorial. For example:

  • The packages in the pip install command of the Dockerfile could be in a requirements.txt file
  • You could host your own instance of an MQTT Broker, and replace the public MQTT Broker used in this guide
  • The model could be more complex, such as performing Object Detection and outputting bounding boxes as part of the inference and post-processing
  • You could host your own instance of Harbor, or make the Harbor project private, and pull the Image from your Private Registry
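
As a sketch of the first suggestion, the pip packages could move into a requirements.txt file, with the Dockerfile copying it and running `pip install --no-cache-dir -r requirements.txt` instead of listing the packages inline (pin versions as appropriate for your setup):

```
# file: minimal_ai/requirements.txt
onnxruntime
rtsp
numpy
paho-mqtt
requests
```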

For even more access to the possibilities of AI business solutions, visit our Architect’s Guide to the AIoT series.
