Published on 06/02/2023
Last updated on 04/11/2024

KubeClarity: Multi SBOM integration


Lean Into Software Supply Chain Security with KubeClarity Series

https://github.com/openclarity/kubeclarity

In our previous post in the KubeClarity series, we dove into the intricacies of KubeClarity architecture to uncover its inner workings. It's time to shift our focus to the standout feature that sets it apart: multi-SBOM (Software Bill of Materials) integration. Join me in this exciting blog post as we explore the power of multi-SBOM integration and discover how it can bolster your security strategy.




Figure-1: Understanding Multi SBOM Integration with KubeClarity

Software Bill of Materials (SBOM) recap

Look at the SBOM fundamentals covered in our previous post as a refresher before diving into advanced SBOM Integration. While KubeClarity offers a standard SBOM integration approach like other solutions, it goes a step further with its advanced multi-SBOM integration capability. This post starts with the single SBOM integration and progressively delves into the multi-SBOM strategy.

Just to remind you, KubeClarity is not an SBOM generator but a solution that integrates and operationalizes popular SBOM generators. Let’s look at how it sets SBOMs in motion. So, get ready to take your security game to the next level with KubeClarity!


SBOM generator integration

KubeClarity exposes SBOM generator integration settings via "values.yaml". Following the instructions in README or a more detailed installation blog, you can explore and adjust the default helm chart values. The “values.yaml” is the starting point for looking at the options for SBOM integrations. 

KubeClarity, by default, enables the Syft and CycloneDX gomod analyzers. Adding new analyzers to the list of generators in the values config file enables additional integrations. For example, Trivy is now available as an additional analyzer, providing enhanced analysis capabilities. The architecture is extensible, so future releases can incorporate additional analyzers. Here is the corresponding config snippet from the “values.yaml” file:

    analyzer:
      ## Space separated list of analyzers. (syft gomod)
      analyzerList: "syft gomod"

      analyzerScope: "squashed"

      trivy:
        ## Enable trivy scanner, if true make sure to add it to list above  
        enabled: false
        timeout: "300"
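
As a rough illustration of how these settings relate to each other, here is a small Python sketch. It is not KubeClarity code: the function name `resolve_analyzers` is hypothetical, and the dictionary simply mirrors the "analyzer" block above. It splits the space-separated list and applies the one rule stated in the chart comment, namely that an enabled trivy must also appear in the list.

```python
from typing import Dict, List


def resolve_analyzers(analyzer_cfg: Dict) -> List[str]:
    """Return the analyzer names from the space-separated analyzerList value."""
    names = analyzer_cfg.get("analyzerList", "").split()
    trivy_enabled = analyzer_cfg.get("trivy", {}).get("enabled", False)
    # The chart comment asks for trivy to be added to the list when it is enabled.
    if trivy_enabled and "trivy" not in names:
        raise ValueError("trivy.enabled is true but 'trivy' is not in analyzerList")
    return names


# Mirrors the default values shown above.
default_cfg = {
    "analyzerList": "syft gomod",
    "analyzerScope": "squashed",
    "trivy": {"enabled": False, "timeout": "300"},
}

print(resolve_analyzers(default_cfg))  # ['syft', 'gomod']
```

A check like this could run in a pipeline before "helm install" to catch a half-applied trivy configuration early.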

KubeClarity SBOM DB

The generated SBOMs are cached in the SBOM DB. KubeClarity installation auto-deploys an SBOM DB pod. As a recap, SBOM DB is a lightweight SQLite DB that avoids persistent volume storage overheads. Its primary purpose is to store and retrieve SBOM documents in string format and to serve as a cache for rendering SBOM data. The DB does not store SBOMs as JSON objects, nor does it parse or query their contents. However, it supports gzip compression and base64-encoded storage to reduce the memory footprint.

Here is the corresponding config snippet from the “values.yaml” file:

## KubeClarity SBOM DB Values

kubeclarity-sbom-db:
  ## Docker Image values.
  docker:
    ## Use to overwrite the global docker params
    ##
    imageName: ""

  ## Logging level (debug, info, warning, error, fatal, panic).
  logLevel: warning

  servicePort: 8080

  resources:
    requests:
      memory: "20Mi"
      cpu: "10m"
    limits:
      memory: "100Mi"
      cpu: "100m"

## End of KubeClarity SBOM DB Values
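
To make the "gzip compression and base64 encoded storage" idea more concrete, here is a minimal Python sketch of storing an SBOM document as a compact string and reading it back. The function names and the order of operations (compress first, then encode) are assumptions for illustration; the actual SBOM DB implementation may differ.

```python
import base64
import gzip
import json


def encode_sbom(sbom: dict) -> str:
    """Serialize an SBOM to JSON, gzip it, and base64-encode it as a string."""
    raw = json.dumps(sbom).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_sbom(stored: str) -> dict:
    """Reverse encode_sbom: base64-decode, gunzip, and parse the JSON."""
    raw = gzip.decompress(base64.b64decode(stored))
    return json.loads(raw.decode("utf-8"))


# A tiny stand-in document; a real SBOM has many more fields.
sbom = {"bomFormat": "CycloneDX", "specVersion": "1.4", "components": []}
stored = encode_sbom(sbom)
print(type(stored).__name__, decode_sbom(stored) == sbom)
```

Because the stored value is an opaque string, the database only needs to save and return it, which matches the caching role described above.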

KubeClarity can accept and process SPDX and CycloneDX input formats and convert them to KubeClarity's native format, which aligns with the CycloneDX format. Check out the details on supported output formats in the README.

Generating the SBOM and converting it to the native format is the heavy lifting. Once the SBOM is generated and stored in the cache, mapping the content analysis to vulnerabilities is relatively quick and takes only a few seconds. We will investigate scanning configurations and vulnerabilities in the next post.


Single SBOM shortcomings

As discussed in the SBOM fundamentals recap, not all SBOMs are created equal. For instance, let’s look at some of the differences between two popular analyzers.

Trivy and Syft are open-source analyzers that can generate SBOMs (Software Bill of Materials) for containerized applications. However, both tools have unique strengths and weaknesses when detecting libraries in a container.

Trivy's strength lies in its extensive vulnerability database, which includes CVEs from various sources such as NVD, Red Hat, and Debian. In addition, Trivy's database is continuously updated, and it can detect vulnerabilities in multiple programming languages, including Java, Python, and Ruby.

In contrast, Syft's vulnerability database is smaller than Trivy's, primarily focusing on detecting vulnerabilities in Python libraries.

Therefore, Trivy's SBOM is likely better than Syft's SBOM when detecting vulnerabilities in containerized applications that use programming languages other than Python or if the container image contains many libraries with known vulnerabilities.

However, Syft may be a better choice if the application relies heavily on Python libraries since Syft has a more extensive database of Python vulnerabilities and can provide more accurate vulnerability information for those libraries. Ultimately, the choice of which tool to use depends on the specific requirements and characteristics of the analyzed containerized application.

Wouldn't it be great if you didn't have to choose between options? What if you could have both or as many as you need? Achieving that requires effort to ensure that these analyzers coexist and that their input and output formats are normalized. In the next section, we will explore a unique feature of KubeClarity that addresses this very challenge. Keep going to learn more about how KubeClarity sets itself apart.


Multi SBOM solution

The solution integrates multiple SBOMs to generate an accurate pedigree of software packages and libraries. Integrating multi-SBOM helps increase the coverage and accuracy of detection.

Managing multiple SBOMs may seem quite challenging, given the complexities of handling a single one. However, KubeClarity can efficiently organize and process multiple SBOMs, transforming what may seem like chaos into a valuable source of insights. Furthermore, with seamless integration of multiple SBOMs, including the ability to import external SBOMs provided by the user, KubeClarity simplifies the process.

Unifying SBOM formats

Figure 2 illustrates that different analyzers may support diverse formats for SBOMs. In this context, KubeClarity plays a crucial role by ingesting these varied formats and converting them into the native format required by vulnerability scanners. Since each vulnerability scanner expects SBOMs in specific formats, merging SBOMs involves the bulk of the work, requiring careful balancing and standardization of inputs to ensure compatibility.


Figure 2: Multi-SBOM Integration Process

When multiple analyzers identify the same resources, KubeClarity handles them as a union and labels both analyzers as the source. Instead of attempting to merge the raw data produced by each generator, KubeClarity adds additional metadata to the generated SBOMs while keeping the raw data untouched, as reported by the analyzers.
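
The union-with-attribution idea can be sketched in a few lines of Python. This is a simplified illustration, not KubeClarity's merge code: components are reduced to hypothetical (name, version) pairs, and the package names are made up. Each analyzer's findings are combined, and every component remembers which analyzers reported it.

```python
from typing import Dict, Iterable, List, Tuple

Component = Tuple[str, str]  # (name, version)


def merge_with_attribution(
    results: Dict[str, Iterable[Component]]
) -> Dict[Component, List[str]]:
    """Union the components from all analyzers and record who reported each."""
    merged: Dict[Component, List[str]] = {}
    for analyzer, components in results.items():
        for component in components:
            sources = merged.setdefault(component, [])
            if analyzer not in sources:
                sources.append(analyzer)
    return merged


# Hypothetical findings from two analyzers for the same image.
results = {
    "syft": [("openssl", "3.0.9"), ("zlib", "1.2.13")],
    "trivy": [("openssl", "3.0.9"), ("libc6", "2.36")],
}

merged = merge_with_attribution(results)
for (name, version), analyzers in sorted(merged.items()):
    print(name, version, analyzers)
```

The component seen by both analyzers appears once and carries both names, while the others keep a single source, so the per-analyzer data is labeled rather than rewritten.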

Merging SBOM

In addition to merging multiple SBOMs, KubeClarity also offers to merge SBOMs from various stages of a CI/CD pipeline into a single SBOM.

As illustrated in Figure-3 below, it is possible to augment SBOMs by layering and merging. Here, the application dependency SBOM analysis from application build time is augmented with the image dependency analysis from the image build phase. The merged SBOMs serve as inputs to vulnerability scanners after due formatting. We will cover vulnerability scanning in depth in the next post. Let’s stay focused on SBOM for now.


Figure 3: SBOM Integrations at Various CI/CD Stages

Source code

If you're interested in exploring the code related to SBOM formatting, conversion, and merging, you can find most of it within the shared package. Feel free to look around. Figure-4 provides a brief overview of the codebase.


Figure 4: SBOM Integration Source Code Layout

Exploring SBOM integration: Hands-on exercise

There are two ways to configure KubeClarity with your preferred analyzers. The first is the "kubeclarity-cli" tool, which is useful for integrating into your CI/CD pipelines. The second is the "values.yaml" configuration file. Let's explore both options in detail below:

kubeclarity-cli

If you'd like to follow along with the instructions, you can find some of these guidelines in the README as well.

kubeclarity-cli analyze <image/directory name> --input-type <dir|file|image(default)> -o <output file or stdout>

Example:

kubeclarity-cli analyze --input-type image nginx:latest -o nginx.sbom

Example Output:

INFO[0000] Called syft analyzer on source registry:nginx:latest  analyzer=syft app=kubeclarity
INFO[0004] Skipping analyze unsupported source type: image  analyzer=gomod app=kubeclarity
INFO[0004] Sending successful results                    analyzer=syft app=kubeclarity
INFO[0004] Got result for job "syft"                     app=kubeclarity
INFO[0004] Got result for job "gomod"                    app=kubeclarity
INFO[0004] Skip generating hash in the case of image    

Verify that the “nginx.sbom” file is generated and explore its contents as shown below:

head nginx.sbom

Example Output:

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "serialNumber": "urn:uuid:8cca2aa3-1aaa-4e8c-9d44-08e88b1df50d",
  "version": 1,
  "metadata": {
    "timestamp": "2023-05-19T16:27:27-07:00",
    "tools": [
      {
        "vendor": "kubeclarity",

If you're interested in viewing the entire file, you can utilize the "cat" command. However, please be aware that it is a large file.

cat nginx.sbom

This should give you a fair idea of how to run some analysis with KubeClarity in a standalone mode to explore the internals of SBOM.
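
Since the generated file is large, a short script can be easier to read than raw "cat" output. The Python sketch below summarizes a CycloneDX JSON document using the standard top-level fields "bomFormat", "specVersion", and "components". The helper names are hypothetical, and the sample data is made up for illustration rather than taken from a real scan.

```python
import json
from collections import Counter


def summarize_sbom(sbom: dict) -> dict:
    """Return the format, spec version, and component counts of an SBOM."""
    components = sbom.get("components", [])
    return {
        "format": sbom.get("bomFormat"),
        "spec_version": sbom.get("specVersion"),
        "component_count": len(components),
        "by_type": dict(Counter(c.get("type", "unknown") for c in components)),
    }


def summarize_file(path: str) -> dict:
    """Load an SBOM file such as nginx.sbom and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_sbom(json.load(handle))


# Made-up sample in the same shape as the head output above.
sample = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "components": [
        {"type": "library", "name": "example-lib", "version": "1.0.0"},
        {"type": "library", "name": "another-lib", "version": "2.1.0"},
        {"type": "operating-system", "name": "debian", "version": "12"},
    ],
}

print(summarize_sbom(sample))
```

To run it against the file generated earlier, call summarize_file("nginx.sbom") from the directory where the CLI wrote its output.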

Here is a sample if you want to try adding trivy to your list and using multiple analyzers to generate a single merged SBOM.

ANALYZER_LIST="syft gomod trivy" kubeclarity-cli analyze --input-type image nginx:latest -o nginx.sbom

Example Output:

INFO[0000] Called syft analyzer on source registry:nginx:latest  analyzer=syft app=kubeclarity
INFO[0004] Called trivy analyzer on source image nginx:latest  analyzer=trivy app=kubeclarity
INFO[0004] Skipping analyze unsupported source type: image  analyzer=gomod app=kubeclarity
INFO[0005] Sending successful results                    analyzer=syft app=kubeclarity
INFO[0005] Sending successful results                    analyzer=trivy app=kubeclarity
INFO[0005] Got result for job "trivy"                    app=kubeclarity
INFO[0005] Got result for job "syft"                     app=kubeclarity
INFO[0005] Got result for job "gomod"                    app=kubeclarity
INFO[0005] Skip generating hash in the case of image   

Optionally, you can export the results to a running KubeClarity backend pod with the “-e” flag. To do so, grab the "ID" from the KubeClarity UI and use it as the <application ID> value in the command below.

Figure-5 illustrates how to obtain the ID of an application resource, such as an image, pod, or file, for use with the CLI. In this example, copy the ID of the "nginx" image, as indicated.


Figure 5: Get Application ID from UI

Once you grab the “ID” from the UI for an existing resource that you wish to scan, use the following command:

BACKEND_HOST=<KubeClarity backend address> BACKEND_DISABLE_TLS=true kubeclarity-cli analyze <image> --application-id <application ID> -e -o <SBOM output file>

Here is an example run of this command after plugging in “ID” from the UI:

BACKEND_HOST=localhost:8080 BACKEND_DISABLE_TLS=true kubeclarity-cli analyze nginx:latest --application-id 23452f9c-6e31-5845-bf53-6566b81a2906 -e -o nginx.sbom

Example Output:

INFO[0000] Called syft analyzer on source registry:nginx:latest  analyzer=syft app=kubeclarity
INFO[0004] Called trivy analyzer on source image nginx:latest  analyzer=trivy app=kubeclarity
INFO[0004] Skipping analyze unsupported source type: image  analyzer=gomod app=kubeclarity
INFO[0004] Sending successful results                    analyzer=syft app=kubeclarity
INFO[0004] Got result for job "syft"                     app=kubeclarity
INFO[0004] Got result for job "gomod"                    app=kubeclarity
INFO[0004] Sending successful results                    analyzer=trivy app=kubeclarity
INFO[0004] Got result for job "trivy"                    app=kubeclarity
INFO[0004] Skip generating hash in the case of image    
INFO[0004] Exporting analysis results to the backend: localhost:8080  app=kubeclarity

That’s it! Now you can see exported results on the UI:

Observe the change in the "nginx" image in Figure-6 below, where the "SBOM Analyzers" column now displays "trivy" to credit the trivy analysis.


Figure-6: Demonstrating Successful Multi-SBOM Analyzer Integration
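
If you want to script the CLI invocations from this section in a pipeline, a thin wrapper can assemble the environment variables and arguments. The Python sketch below is one possible approach under stated assumptions: the function name `build_analyze_command` is hypothetical, and it only uses the flags and variables shown in this post (ANALYZER_LIST, BACKEND_HOST, BACKEND_DISABLE_TLS, --input-type, --application-id, -e, -o). It builds the command without running it.

```python
from typing import Dict, List, Optional, Tuple


def build_analyze_command(
    image: str,
    output: str,
    analyzers: Optional[List[str]] = None,
    backend_host: Optional[str] = None,
    application_id: Optional[str] = None,
    disable_tls: bool = False,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (extra environment variables, argv) for kubeclarity-cli analyze."""
    env: Dict[str, str] = {}
    if analyzers:
        # The CLI reads the analyzer list as a space-separated string.
        env["ANALYZER_LIST"] = " ".join(analyzers)
    argv = ["kubeclarity-cli", "analyze", image, "--input-type", "image", "-o", output]
    if backend_host and application_id:
        # Export to a running backend only when both values are supplied.
        env["BACKEND_HOST"] = backend_host
        if disable_tls:
            env["BACKEND_DISABLE_TLS"] = "true"
        argv += ["--application-id", application_id, "-e"]
    return env, argv


env, argv = build_analyze_command(
    "nginx:latest", "nginx.sbom", analyzers=["syft", "gomod", "trivy"]
)
print(env)
print(" ".join(argv))

# To execute it (requires kubeclarity-cli on PATH):
# import os, subprocess
# subprocess.run(argv, env={**os.environ, **env}, check=True)
```

Keeping command construction separate from execution makes the wrapper easy to test and to log before anything touches the backend.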

Helm Install values.yaml

If you want to try to enable multi-analyzer integration during install time, here is how to do it.

Follow the detailed instructions from one of the previous blogs, Docker Installation or AWS EKS Installation, depending on your environment. You can also follow the instructions in the README.

Once you have the install instructions handy, run the following command to write the Helm chart's default values to a file:

helm show values kubeclarity/kubeclarity > values.yaml

Edit values.yaml to set the analyzer config, save the file, and complete the KubeClarity installation.


Exploring SBOM DB

Because SBOM DB serves as a cache, its data is not kept in a persistent store. This ephemeral store lives under /tmp inside the SBOM DB container. You can locate it with the following commands:

kubectl get pods -n kubeclarity

Example Output:

kubeclarity-kubeclarity-7dbd967d4d-6b8bd              1/1     Running   0          51d
kubeclarity-kubeclarity-grype-server-d64c85c4-nbxn6   1/1     Running   0          56d
kubeclarity-kubeclarity-postgresql-0                  1/1     Running   0          23d
kubeclarity-kubeclarity-sbom-db-7cfdc5bb55-58zl2      1/1     Running   0          51d

Grab the SBOM DB pod name and open a shell in its container:

kubectl exec -n kubeclarity -it kubeclarity-kubeclarity-sbom-db-7cfdc5bb55-58zl2  -c sbom-db -- /bin/sh

Opening the shell drops you in the /app folder. Navigate to the /tmp folder, as shown below, to find the SBOM DB file for your deployment.

/app $ cd /tmp
/tmp $ ls
db.db

Feel free to utilize the "cat" command to peek into the contents of the "db.db" file. Upon inspection, you will notice that the data is stored in JSON format as shown in Figure-7 below, specifically as strings. Take this opportunity to explore and observe the content firsthand.

cat db.db

Example Output:


Figure-7: SBOM DB Data Stored in Json Format

Exploring multi-SBOM UI dashboards

Figure-8 below shows an example UI dashboard showcasing the expected outcome when multiple analyzers report identical resources. Both analyzers will be labeled as the source analyzers for those resources in such instances. This attribution and pedigree tracking mechanism aids in accurately tracking the origins of the resources. There are many more nuances and intricacies that are hard to describe exhaustively here and are better explored than described. So, get your hands on it and explore it for yourself.


Figure-8: Multi-SBOM Analyzer Attribution in UI

Running software bills of materials analyzers

  • KubeClarity can run multiple SBOM analyzers and tag the SBOM data with analyzer metadata for easy identification of the source of analysis.
  • It merges data from multiple analyzers into a unified view, ensuring comprehensive coverage.
  • SBOM generation and vulnerability scanning are separated into distinct steps, offering flexibility and a plug-and-play approach to the analysis process.
  • KubeClarity supports multi-stage SBOM merging across various CI/CD pipeline stages.
  • It performs multi-stage CI/CD SBOM analysis, overlaying analysis from different build stages for comprehensive insights.
  • KubeClarity allows the import of external SBOMs, providing the ability to augment analysis with additional data from external sources.
  • The merged SBOM seamlessly integrates with vulnerability scanners for effectively scanning and identifying vulnerabilities.

This is impressive, wouldn't you agree?

Next up in the KubeClarity series

Let's explore the integration of KubeClarity with vulnerability scanners to maximize the benefits of SBOMs for efficient vulnerability management and CVE tracking!



Pallavi Kalapatapu is a Principal Engineer and open-source advocate at Outshift, formerly known as Cisco’s Emerging Technology & Incubation organization.
