Published on 07/20/2023
Last updated on 04/16/2024

Observability with eBPF and OpenTelemetry

In today's complex and ever-changing IT environments, observability is essential for ensuring the health and performance of your systems. By understanding what is happening in your systems, you can identify and troubleshoot problems quickly, improve performance, prevent issues before they occur, and make better decisions.

Here are some of the key components of observability:

  • Metrics: Numerical measurements of the performance of your system over time.
  • Logs: A record of discrete events that happen in your system.
  • Traces: A way of tracking the flow of individual requests through your system.

By collecting and analyzing this data, users can gain a deep understanding of their system and its performance, making it easier to identify and troubleshoot problems, improve performance, and prevent incidents.

How to achieve observability?

Achieving observability requires a significant amount of work, from choosing the right tools and instrumenting the code to collecting and storing the data for analysis and generating insights from it. A technology that simplifies most of these tasks and also integrates with well-known monitoring providers makes it much easier to implement observability for your infrastructure and application performance monitoring. In this article, we demonstrate how to use eBPF and OpenTelemetry to achieve observability without any code instrumentation.

What is eBPF?

eBPF is a technology that can run sandboxed programs in a privileged context such as the operating system kernel. It extends the capabilities of the kernel safely and efficiently, without requiring changes to kernel source code or loading kernel modules.

eBPF observability is a new approach that allows for greater visibility and control over systems and networks. It provides a way to collect and analyze data from the kernel without having to modify the kernel itself and without any instrumentation of the application code. This makes it a powerful tool for troubleshooting performance problems, identifying security vulnerabilities, and optimizing system resources.

eBPF
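
To make this concrete, here is a minimal sketch using BCC, the Python front end for eBPF (the probe target and message are illustrative, following the classic BCC tutorial): it attaches a kprobe to the clone syscall and prints a trace line each time a process is created, without touching kernel source or application code.

```python
# Minimal BCC sketch: trace process creation from kernel space with zero
# application instrumentation. Requires root and the bcc toolkit.
from bcc import BPF

program = r"""
int trace_clone(void *ctx) {
    // Runs in kernel context each time the clone syscall is entered.
    bpf_trace_printk("new process cloned\n");
    return 0;
}
"""

b = BPF(text=program)  # compile and load the sandboxed program into the kernel
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="trace_clone")
b.trace_print()        # stream trace output until interrupted
```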

What is OpenTelemetry?

OpenTelemetry is an observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs. Crucially, OpenTelemetry is vendor- and tool-agnostic, meaning that it can be used with a broad variety of observability backends, including open source tools like Jaeger and Prometheus, as well as commercial offerings. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project.
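
To illustrate that vendor-agnostic design, here is a minimal sketch using the OpenTelemetry Python SDK; the meter and counter names are arbitrary, and the console exporter can be swapped for an OTLP or Prometheus exporter without changing the instrumentation code.

```python
# Minimal sketch with the OpenTelemetry Python SDK: record a counter metric
# and export it periodically. Swapping ConsoleMetricExporter for an OTLP or
# Prometheus exporter changes the backend, not the instrumentation.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)

reader = PeriodicExportingMetricReader(ConsoleMetricExporter(),
                                       export_interval_millis=10_000)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("demo")                 # arbitrary meter name
requests = meter.create_counter("http.requests")  # arbitrary metric name
requests.add(1, {"route": "/health"})             # record one data point
```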

eBPF and OpenTelemetry integration

eBPF provides a powerful mechanism for dynamic tracing and analysis within the Linux kernel, while OpenTelemetry is a set of open standards and tools for collecting, exporting, and visualizing telemetry data. By combining eBPF and OpenTelemetry, you can gain deep visibility into your application's internals, as well as the underlying system, with minimal overhead. This combination enables you to collect fine-grained telemetry data from the kernel and various application layers, providing a comprehensive understanding of your system's behavior and performance.

Why Pixie?

Pixie is an open source observability tool for Kubernetes applications. Pixie uses eBPF to automatically capture telemetry data without the need for manual instrumentation.

Developers can use Pixie to view the high-level state of their cluster (service maps, cluster resources, application traffic) and also drill down into more detailed views (pod state, flame graphs, individual full-body application requests). It offers features like:

  1. Auto-telemetry: Pixie uses eBPF to automatically collect telemetry data such as full-body requests, resource and network metrics, application profiles, and more.
  2. In-Cluster Edge Compute: Pixie collects, stores, and queries all telemetry data locally in the cluster. Pixie uses less than 5% of cluster CPU, and in most cases less than 2%.
  3. Scriptability: PxL, Pixie's flexible Pythonic query language, can be used across Pixie's UI, CLI, and client APIs (see the sketch after this list).
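
As a flavor of PxL, here is a minimal sketch that counts recent HTTP requests per pod; it follows the DataFrame API from Pixie's documentation, though treat the exact table and column names as assumptions.

```python
# Minimal PxL sketch: HTTP request counts per pod over the last 5 minutes.
# Runs from Pixie's UI, CLI, or client APIs.
import px

df = px.DataFrame(table='http_events', start_time='-5m')
df.pod = df.ctx['pod']  # enrich rows with Kubernetes pod metadata
df = df.groupby('pod').agg(requests=('latency', px.count))
px.display(df, 'http_requests_by_pod')
```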

Pixie automatically collects the following data:

  1. Protocol traces: Full-body messages between the pods of your applications. Tracing currently supports protocols including Cassandra, SQL, Redis, Kafka, AMQP, HTTP/2 (gRPC), and HTTP.
  2. Resource metrics: CPU, memory, and I/O metrics for your pods.
  3. Network metrics: Network-layer and connection-level RX/TX statistics.
  4. JVM metrics: JVM memory management metrics for Java applications.
  5. Application CPU profiles: Sampled stack traces from your application. Pixie's continuous profiler is always running to help identify application performance bottlenecks when you need it. It currently supports compiled languages (Go, Rust, C/C++).

Case study

For the eBPF capabilities demonstration, we picked the Cisco Nexus Dashboard (a platform that transforms data center and cloud network operations with simplicity, automation, and analytics) and deployed eBPF/Pixie in its Kubernetes environment to demonstrate these capabilities with zero code instrumentation. We were able to collect infrastructure, application, and protocol metrics, CPU/memory/network statistics, and JVM stats.

Auto-telemetry

eBPF automatically collects telemetry data such as full-body requests, resource and network metrics, application profiles, and more. The following Grafana dashboard shows the CPU usage, network RX bytes, and connection stats generated by eBPF, collected using Pixie, and streamed to Prometheus using the OpenTelemetry plugin.

Auto-telemetry
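
The streaming step itself can be written in PxL using Pixie's OpenTelemetry export module. The sketch below is illustrative rather than authoritative (the table, column, and metric names are assumptions): it aggregates received bytes per pod and exports them as a gauge that an OpenTelemetry collector can forward to Prometheus.

```python
# Hypothetical PxL export sketch: stream eBPF-collected network stats to
# an OpenTelemetry endpoint (table/column/metric names are illustrative).
import px

df = px.DataFrame(table='conn_stats', start_time='-30s')
df.pod = df.ctx['pod']
df = df.groupby(['pod', 'time_']).agg(rx_bytes=('bytes_recv', px.sum))

px.export(df, px.otel.Data(
    resource={'service.name': df.pod},
    data=[px.otel.metric.Gauge(name='pixie.net.rx_bytes', value=df.rx_bytes)],
))
```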

Non-Instrumented Performance Flame Graph

eBPF collects stack traces of individual processes (Java, Golang, etc.) to display kernel- and user-level performance metrics and summarize overall CPU consumption. If we take a look at the flame graph of this mond pod, it helps us understand the reason for a spike in CPU utilization. Every ~10ms, the Pixie profiler samples the current stack trace on each CPU. The stack trace includes the function that was executing at the time of the sample, along with the ancestor functions that were called to reach this point in the code.

The collected samples are aggregated across a larger 30-second window that includes thousands of stack traces. These stack traces are then grouped by their common ancestors. At any level, the wider the bar, the more often that function appeared in the stack traces. Wider bars are typically of more interest, as they indicate that a significant amount of the application's time is being spent in that function.

Colors

The background color of each box in the flame graph adds an extra dimension of data:

  • Dark blue bars indicate K8s metadata info (node, namespace, pod, container).
  • Light blue bars represent user space application code.
  • Green bars represent kernel code.

Flame Graph

On-demand Traceability

eBPF allows dynamic trace injection to effectively troubleshoot system-level issues such as TCP drops, log-level changes, etc. The following diagram shows the map of TCP drops and retransmits across the Nexus Dashboard cluster. To see the number of drops between a pod pair, hover over an edge in the map view. The color and thickness of an edge reflect the number of TCP drops it carries.

TCP Drops
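
As a flavor of what such a dynamically injected probe looks like, below is a hedged BCC sketch that counts TCP drops per process by attaching a kprobe to the kernel's tcp_drop function. Treat the probe point as an assumption: tcp_drop exists on roughly 4.7-5.x kernels, while newer kernels expose drop reasons through the skb:kfree_skb tracepoint instead.

```python
# Hedged BCC sketch: count TCP drops per PID via a dynamically injected
# kprobe. Requires root, the bcc toolkit, and a kernel exposing tcp_drop.
from bcc import BPF
import time

program = r"""
BPF_HASH(drops, u32, u64);

int kprobe__tcp_drop(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    drops.increment(pid);   // bump the per-PID drop counter
    return 0;
}
"""

b = BPF(text=program)
time.sleep(10)  # let the probe collect for a short window
for pid, count in b["drops"].items():
    print(f"pid={pid.value} tcp_drops={count.value}")
```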

The following diagram shows the traffic-flow view of the microservices, along with incoming and outgoing traffic, request/byte throughput, and error-rate metrics.

Traffic Flow/Service Graph

End To End Observability

eBPF along with Pixie can help navigate end-to-end metrics, from the node level all the way down to the function level. Let's take a look at how to navigate this flow.

Here, we are looking at the cluster view, which shows the list of nodes in the cluster and their CPU/memory usage, along with the number of K8s pods running on each node.

Cluster view

We can select a node from the cluster view and drill down to a detailed view of that node. This view shows the individual node's CPU/memory/network resource consumption.

Node view

The namespace view shows all the K8s namespaces, along with metrics like pod/service counts, disk throughput, etc.

Namespace view

The pod lifetime resources view shows the total resource usage of a pod over its lifetime.

Pod lifetime resource usage

We can also look at the JVM stats view for JVM memory management metrics for Java applications.

JVM stats

Conclusion

In this case study, we demonstrated infrastructure and application observability with eBPF by collecting metrics from the Cisco Nexus Dashboard platform without any code instrumentation. We then used the OpenTelemetry plugins available from Pixie to export these metrics to the Prometheus monitoring system and analyzed the data using Grafana dashboards.

As we have shown, eBPF is a powerful tool that can be used to collect data from a wide variety of systems, and it is a promising technology for achieving observability in modern cloud-native environments.

Acknowledgements

We would like to express our thanks to Kalyan Ghosh for providing guidance throughout the project and James Chen for his valuable suggestions.
