Published on 12/15/2021
Last updated on 07/17/2024
Making the Most Out of OpenTelemetry
OpenTelemetry is the observability solution supported by the CNCF. It was formed by merging OpenCensus and OpenTracing and carries their standards forward, so you can more easily avoid vendor lock-in, since it decouples your application instrumentation from data export. With OpenTelemetry, you can achieve full application-level observability through its SDKs, agents, libraries, and standards. For more information about OpenTelemetry, please refer to our posts “Introduction to OpenTelemetry” and “OpenTelemetry Best Practices.” In this post, we’ll guide you through setting it up, implementing traces with OpenTelemetry, and using Grafana, Prometheus, and Jaeger; we’ll then introduce you to AWS Distro for OpenTelemetry.
How to Set Up OpenTelemetry
To get started with OpenTelemetry, you need to meet the following prerequisites:
Have a Kubernetes cluster up and running, or use the following link to set up your Kubernetes cluster using kubeadm. A sample setup should have one master node (control plane) and two worker nodes. To list all the nodes in your Kubernetes cluster run the command below:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
istio-cluster-control-plane Ready master 16m v1.19.1
istio-cluster-worker Ready <none> 16m v1.19.1
istio-cluster-worker2 Ready <none> 16m v1.19.1
Istio should also be up and running; if not, click here for instructions. Once the Istio cluster is set up, use the following command to check the status of all resources:
kubectl get all -n istio-system
NAME READY STATUS RESTARTS AGE
pod/istio-egressgateway-c9c55457b-zzf55 1/1 Running 0 15m
pod/istio-ingressgateway-865d46c7f5-ddpnk 1/1 Running 0 15m
pod/istiod-7f785478df-2c6rx 1/1 Running 0 16m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/istio-egressgateway ClusterIP 10.96.178.69 <none> 80/TCP,443/TCP,15443/TCP 15m
service/istio-ingressgateway LoadBalancer 10.96.60.62 172.19.0.200 15021:32028/TCP,80:31341/TCP,443:31306/TCP,31400:30297/TCP,15443:32577/TCP 15m
service/istiod ClusterIP 10.96.9.127 <none> 15010/TCP,15012/TCP,443/TCP,15014/TCP 16m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/istio-egressgateway 1/1 1 1 15m
deployment.apps/istio-ingressgateway 1/1 1 1 15m
deployment.apps/istiod 1/1 1 1 16m
NAME DESIRED CURRENT READY AGE
replicaset.apps/istio-egressgateway-c9c55457b 1 1 1 15m
replicaset.apps/istio-ingressgateway-865d46c7f5 1 1 1 15m
replicaset.apps/istiod-7f785478df 1 1 1 16m
The sample Bookinfo application should already be deployed. To verify the status of the application, please run the command below; it will list all the resources deployed for the Bookinfo application:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/details-v1-79f774bdb9-7zzwm 2/2 Running 0 14m
pod/productpage-v1-6b746f74dc-z7z8m 2/2 Running 0 14m
pod/ratings-v1-b6994bb9-bt9tb 2/2 Running 0 14m
pod/reviews-v1-545db77b95-kbjbg 2/2 Running 0 14m
pod/reviews-v2-7bf8c9648f-ddw5d 2/2 Running 0 14m
pod/reviews-v3-84779c7bbc-27vz6 2/2 Running 0 14m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/details ClusterIP 10.96.48.2 <none> 9080/TCP 14m
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 23m
service/productpage ClusterIP 10.96.62.75 <none> 9080/TCP 14m
service/ratings ClusterIP 10.96.195.114 <none> 9080/TCP 14m
service/reviews ClusterIP 10.96.4.60 <none> 9080/TCP 14m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/details-v1 1/1 1 1 14m
deployment.apps/productpage-v1 1/1 1 1 14m
deployment.apps/ratings-v1 1/1 1 1 14m
deployment.apps/reviews-v1 1/1 1 1 14m
deployment.apps/reviews-v2 1/1 1 1 14m
deployment.apps/reviews-v3 1/1 1 1 14m
NAME DESIRED CURRENT READY AGE
replicaset.apps/details-v1-79f774bdb9 1 1 1 14m
replicaset.apps/productpage-v1-6b746f74dc 1 1 1 14m
replicaset.apps/ratings-v1-b6994bb9 1 1 1 14m
replicaset.apps/reviews-v1-545db77b95 1 1 1 14m
replicaset.apps/reviews-v2-7bf8c9648f 1 1 1 14m
replicaset.apps/reviews-v3-84779c7bbc 1 1 1 14m
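If you would rather check the pod listing programmatically than read it by eye, the following is a minimal sketch that parses the text printed by kubectl get all and reports pods that are not fully ready. The function name unready_pods and the second sample row (edited to show 1/2) are made up for illustration. In the listing above, each Bookinfo pod shows 2/2 because it runs the application container plus the Istio sidecar.

```python
def unready_pods(kubectl_output: str) -> list:
    """Return the names of pods that are not fully ready or not Running.

    Expects the plain-text table printed by `kubectl get all`.
    """
    problems = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Pod rows start with "pod/"; skip headers, services, deployments, etc.
        if len(fields) < 3 or not fields[0].startswith("pod/"):
            continue
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, _, total = ready.partition("/")
        if ready_count != total or status != "Running":
            problems.append(name)
    return problems


# Sample input: the first pod row is taken from the listing above; the second
# row is altered to 1/2 purely to illustrate a failing case.
sample = """\
NAME READY STATUS RESTARTS AGE
pod/details-v1-79f774bdb9-7zzwm 2/2 Running 0 14m
pod/reviews-v1-545db77b95-kbjbg 1/2 Running 0 14m
"""
print(unready_pods(sample))  # ['pod/reviews-v1-545db77b95-kbjbg']
```

An empty list means every pod row reported all of its containers ready.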
Now, verify the application from the browser by going to http://<ingress gateway external ip>/productpage.
Figure 1: Sample BookInfo application
Your setup is now ready. In the next section, we’ll explore how to collect metrics with Prometheus, visualize them in Grafana, and send traces to Jaeger.
Sending Traces with OpenTelemetry
In the last section, you got Istio up and running, but Istio can integrate with a bunch of other telemetry applications to provide additional functionality. Prometheus, Grafana, and Jaeger are three such applications. Let’s explore each of these, one by one:
Prometheus
Prometheus is an open-source monitoring system that provides a time-series database for metrics. Using Prometheus, you can record metrics, track the health of your application within a service mesh, then use Grafana to visualize those metrics. Istio provides a sample add-on to deploy Prometheus, so proceed to the directory where you have Istio downloaded:
cd istio-1.9.0
Deploy Prometheus by using the following command; the output follows:
kubectl apply -f samples/addons/prometheus.yaml
serviceaccount/prometheus created
configmap/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
service/prometheus created
deployment.apps/prometheus created
Now, verify your Prometheus setup:
kubectl get all -n istio-system -l app=prometheus
NAME READY STATUS RESTARTS AGE
pod/prometheus-7bfddb8dbf-xqvdk 2/2 Running 0 2m28s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus ClusterIP 10.96.176.67 <none> 9090/TCP 2m28s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/prometheus 1/1 1 1 2m28s
NAME DESIRED CURRENT READY AGE
replicaset.apps/prometheus-7bfddb8dbf 1 1 1 2m28s
Next, access the Prometheus dashboard:
istioctl dashboard prometheus&
[1] 1034524
http://localhost:9090
Figure 2: Prometheus dashboard
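Besides the dashboard, Prometheus exposes an HTTP API at /api/v1/query that returns query results as JSON. The sketch below builds such a query URL and parses an instant-vector response. It assumes the dashboard is reachable at http://localhost:9090 as above and that Istio's standard istio_requests_total metric with its destination_service label is being scraped. The helper names and the sample response body are made up for illustration, not taken from a real cluster.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict:
    """Map each series' destination_service label to its current value."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error')}")
    results = {}
    for series in payload["data"]["result"]:
        label = series["metric"].get("destination_service", "unknown")
        _timestamp, value = series["value"]  # value arrives as a string
        results[label] = float(value)
    return results


# Requests per second per destination service over the last minute.
QUERY = "sum(rate(istio_requests_total[1m])) by (destination_service)"
print(build_query_url("http://localhost:9090", QUERY))

# Hypothetical response body, shaped like an instant-vector result.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"destination_service": "productpage.default.svc.cluster.local"},
                "value": [1639526400.0, "1.5"],
            }
        ],
    },
})
print(parse_instant_vector(SAMPLE_RESPONSE))
```

To run it against a live cluster, fetch the built URL with urllib.request.urlopen and pass the decoded body to parse_instant_vector.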
Now that you have Prometheus installed, you can visualize the metrics it collects by installing Grafana.
Grafana
Grafana is an open-source monitoring solution that, when integrated with a time-series database like Prometheus, lets you build custom dashboards and gain meaningful insights into your metrics. Using Grafana, you can monitor the health of your application within a service mesh. Similar to Prometheus, Istio provides a sample add-on you can use to deploy Grafana. Simply go to the directory where you have Istio downloaded:
cd istio-1.9.0
Deploy Grafana via the following command; again, the output follows:
kubectl apply -f samples/addons/grafana.yaml
serviceaccount/grafana created
configmap/grafana created
service/grafana created
deployment.apps/grafana created
configmap/istio-grafana-dashboards created
configmap/istio-services-grafana-dashboards created
Go ahead and verify your Grafana setup:
kubectl get all -n istio-system -l app.kubernetes.io/instance=grafana
NAME READY STATUS RESTARTS AGE
pod/grafana-784c89f4cf-mxssg 1/1 Running 0 2m36s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/grafana ClusterIP 10.96.206.141 <none> 3000/TCP 2m36s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/grafana 1/1 1 1 2m36s
NAME DESIRED CURRENT READY AGE
replicaset.apps/grafana-784c89f4cf 1 1 1 2m36s
And access the Grafana dashboard:
istioctl dashboard grafana&
http://localhost:3000
Figure 3: Grafana dashboard
Once you have Grafana up and running, you will see that Grafana bundled up some pre-configured Istio dashboards.
Figure 4: Preconfigured Grafana dashboard
If you click on “Istio Service Dashboard,” you will see a number of metrics. As there is no activity on this server, all the metrics show either a 0 or N/A.
Figure 5: Grafana service dashboard
Let’s generate some load on your cluster by running this script, which requests the sample application page every second in an infinite while loop:
while :; do curl -s -o /dev/null 172.19.0.200/productpage; sleep 1; done
If you go back to your Grafana dashboard, you’ll start seeing the loads you’ve generated and different metrics.
Figure 6: Grafana service dashboard
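If you prefer a bounded load test over an infinite shell loop, here is a minimal Python sketch of the same idea using only the standard library. The function names and the count and delay_seconds parameters are assumptions made for this example. The fetch and sleep arguments are injectable so the loop can be exercised without a network.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str):
    """Request the URL once and return the HTTP status code, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except OSError:
        return None  # connection refused, timeout, DNS failure, etc.


def generate_load(url, count, delay_seconds=1.0, fetch=fetch_status, sleep=time.sleep):
    """Send `count` requests, pausing between them, and tally the status codes."""
    tally = {}
    for i in range(count):
        status = fetch(url)
        tally[status] = tally.get(status, 0) + 1
        if i < count - 1:
            sleep(delay_seconds)
    return tally


# Example (uses the ingress gateway IP from the listing above):
# print(generate_load("http://172.19.0.200/productpage", count=60))
```

A tally made up mostly of 200 responses suggests the requests are reaching the product page.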
With Grafana up and running, let’s move on to tracing using Jaeger.
Jaeger
Jaeger is an open-source distributed tracing system that uses the OpenTracing specification. It allows users to troubleshoot and monitor transactions in complex distributed systems. Istio also provides a sample add-on to deploy Jaeger, just like with Prometheus and Grafana. So, go to the directory where you have Istio downloaded:
cd istio-1.9.0
And deploy Jaeger by using the following command; output follows:
kubectl apply -f samples/addons/jaeger.yaml
deployment.apps/jaeger created
service/tracing created
service/zipkin created
service/jaeger-collector created
Run the following command to verify your Jaeger setup:
kubectl get all -n istio-system -l app=jaeger
NAME READY STATUS RESTARTS AGE
pod/jaeger-7f78b6fb65-4n6dd 1/1 Running 0 2m10s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/jaeger-collector ClusterIP 10.96.255.86 <none> 14268/TCP,14250/TCP 2m9s
service/tracing ClusterIP 10.96.30.136 <none> 80/TCP 2m10s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/jaeger 1/1 1 1 2m10s
NAME DESIRED CURRENT READY AGE
replicaset.apps/jaeger-7f78b6fb65 1 1 1 2m10s
Now, access the Jaeger dashboard:
istioctl dashboard jaeger&
1096336
http://localhost:16686
Figure 7: Jaeger dashboard
Jaeger is now running in the background and collecting data, so select “productpage.default” from the service drop-down menu and click on “Find Traces” at the bottom of the search panel.
Figure 8: Jaeger dashboard with productpage
The top visualization shows you the average response time of an end-to-end response for different periods.
Figure 9: Jaeger Dashboard visualization for average response time
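To make the “average end-to-end response time” concrete, the sketch below computes it from trace data. It assumes traces are shaped like Jaeger's JSON export, where each trace has a spans list and each span carries startTime and duration in microseconds. Treat that shape as an assumption and check it against your own export. The sample traces are invented for illustration.

```python
def trace_duration_ms(trace: dict) -> float:
    """End-to-end duration of one trace: earliest span start to latest span end."""
    spans = trace["spans"]
    start = min(span["startTime"] for span in spans)
    end = max(span["startTime"] + span["duration"] for span in spans)
    return (end - start) / 1000.0  # microseconds -> milliseconds


def average_duration_ms(traces: list) -> float:
    """Average end-to-end duration across traces; 0.0 if there are none."""
    durations = [trace_duration_ms(t) for t in traces if t.get("spans")]
    return sum(durations) / len(durations) if durations else 0.0


# Invented sample data: two traces lasting 20 ms and 40 ms.
sample_traces = [
    {"spans": [
        {"operationName": "productpage", "startTime": 1000, "duration": 20000},
        {"operationName": "reviews", "startTime": 5000, "duration": 10000},
    ]},
    {"spans": [
        {"operationName": "productpage", "startTime": 0, "duration": 40000},
    ]},
]
print(average_duration_ms(sample_traces))  # 30.0
```

The end-to-end time uses the span boundaries rather than the sum of span durations, because child spans overlap their parents.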
Now that you understand how to collect metrics with Prometheus and Grafana and traces with Jaeger, let’s shift gears and look at AWS Distro for OpenTelemetry (ADOT).
AWS and OpenTelemetry
AWS now offers AWS Distro for OpenTelemetry (still in preview phase). AWS is one of the upstream contributors to the OpenTelemetry project and tests, secures, optimizes, and supports various components of the project like SDKs, agents, collectors, and auto-instrumentations. The initial release supports languages including Python, Go, Java, and JavaScript; other languages will be included in upcoming releases. On top of that, you don’t have to pay to use AWS Distro for OpenTelemetry; you pay only for the traces, logs, and metrics sent to AWS.
Using AWS Distro, you need to instrument your application only once and can then send correlated metrics and traces to multiple monitoring solutions, such as CloudWatch, X-Ray, Elasticsearch, and partner solutions. With the help of auto-instrumentation agents, you can collect traces without needing to change your code. The solution also gathers metadata from your AWS resources, which helps you correlate application performance data with the underlying infrastructure data and resolve problems faster. Currently, AWS Distro for OpenTelemetry supports instrumenting your application on-premises as well as on the following AWS services: Amazon Elastic Kubernetes Service (EKS) on EC2, Elastic Compute Cloud (EC2), and AWS Fargate.
Various Components
AWS Distro for OpenTelemetry consists of the following components:
The OpenTelemetry SDK allows for the collection of metadata for AWS-specific resources, such as Task and Pod ID, Container ID, and Lambda function version. It can also correlate trace and metrics data from both CloudWatch and AWS X-Ray.
The OpenTelemetry Collector is responsible for sending data to AWS services like AWS CloudWatch, Amazon Managed Service for Prometheus, and AWS X-Ray.
AWS also supports an OpenTelemetry Java auto-instrumentation agent for tracing data from AWS SDKs and X-Ray. For all these components, AWS also contributes back to the upstream project.
Serverless and OpenTelemetry
For Lambda, AWS Distro for OpenTelemetry currently supports only Python and is delivered through Lambda Extensions. First, you need to build your Lambda layer containing the OpenTelemetry SDK and Collector, which you can then add to your Lambda function. Once this is done, AWS takes care of auto-instrumentation and initializes the instrumentation of dependencies, HTTP clients, and AWS SDKs. It also captures resource-specific information, such as Lambda function name, Amazon Resource Name (ARN), version, and request ID.
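To see where that resource-specific information lives, the following minimal handler reads the same values from the standard Lambda context object. It only illustrates the fields involved and is not how the ADOT layer captures them internally. The handler body and the fake context used to try it locally are made up for this example.

```python
import json
from types import SimpleNamespace


def lambda_handler(event, context):
    """Return the resource details that Lambda exposes on the context object."""
    resource_info = {
        "function_name": context.function_name,
        "function_arn": context.invoked_function_arn,
        "function_version": context.function_version,
        "request_id": context.aws_request_id,
    }
    return {"statusCode": 200, "body": json.dumps(resource_info)}


# Local try-out with a stand-in context; all values here are placeholders.
fake_context = SimpleNamespace(
    function_name="my-function",
    invoked_function_arn="arn:aws:lambda:us-west-2:XXXXXXX:function:my-function",
    function_version="$LATEST",
    aws_request_id="00000000-0000-0000-0000-000000000000",
)
print(lambda_handler({}, fake_context)["body"])
```

With auto-instrumentation enabled, you don't need code like this in your function; it is shown only to clarify what “resource-specific information” refers to.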
Requirements
There are a couple of installations required before building the Lambda layer:
- AWS SAM CLI: Refer to the following doc to install it for your platform.
- AWS CLI: Refer to the following doc to install it for your platform; this is needed to configure AWS credentials and requires administrator access.
Note: Currently, the Lambda layer supports only Python 3.8 Lambda runtimes.
Building the Lambda Layer
Once you meet all the prerequisites, the next step is to build the Lambda layer. The layer contains the AWS Distro for OpenTelemetry Collector (ADOT Collector), which runs as a Lambda extension, and your Python function will use this layer as well. For this example, you’ll use the aws-otel-lambda repository. First, clone the repo:
git clone https://github.com/aws-observability/aws-otel-lambda.git
Then go to the Python sample app directory inside the cloned repo:
cd aws-otel-lambda/sample-apps/python-lambda
To publish the layer, run the command below:
./run.sh
running...
Invoked with:
sam building...
SAM CLI now collects telemetry to better understand customer needs.
You can OPT OUT and disable telemetry collection by setting the
environment variable SAM_CLI_TELEMETRY=0 in your shell.
Thanks for your help!
--------------------------Output Cut -------------------------------
Successfully created/updated stack - adot-py38-sample in us-west-2
ADOT Python3.8 Lambda layer ARN:
arn:aws:lambda:us-west-2:XXXXXXX:layer:aws-distro-for-opentelemetry-python-38-preview:1
If you want to publish the layer in a different region, e.g., to us-east-2, run the run.sh command with the -r parameter:
./run.sh -r us-east-2
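The layer ARN printed by run.sh encodes the region, account, layer name, and version that you'll pick in the console later. As a small sketch, the helper below splits a layer version ARN into those parts. The function name parse_layer_arn is made up for this example, and the account ID stays masked as in the output above.

```python
def parse_layer_arn(arn: str) -> dict:
    """Split a Lambda layer version ARN into its components.

    Expected form: arn:<partition>:lambda:<region>:<account>:layer:<name>:<version>
    """
    parts = arn.split(":")
    if len(parts) != 8 or parts[0] != "arn" or parts[2] != "lambda" or parts[5] != "layer":
        raise ValueError(f"not a Lambda layer version ARN: {arn!r}")
    return {
        "region": parts[3],
        "account_id": parts[4],
        "layer_name": parts[6],
        "version": int(parts[7]),
    }


layer = parse_layer_arn(
    "arn:aws:lambda:us-west-2:XXXXXXX:layer:aws-distro-for-opentelemetry-python-38-preview:1"
)
print(layer["region"], layer["layer_name"], layer["version"])
```

A function can only use a layer published in its own region, which is why the -r option matters when your function lives outside us-west-2.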
Auto-Instrumentation for Your Lambda Function
Once you push the Lambda layer, you need to follow a series of steps to enable auto-instrumentation. First, go to the Lambda console and select the function you want to instrument. Scroll down and click on “Add a layer.”
Figure 10: Lambda console for adding a layer
Select “Custom layers,” and from the drop-down, choose the layer you created earlier and Version 1. Click on “Add.”
Figure 11: Lambda console for adding a custom layer
Now, go back to your Lambda function and click on “Configuration,” then “Environment variables.” Select “Edit” and “Add environment variable.”
Figure 12: Lambda console for adding an environment variable
Add AWS_LAMBDA_EXEC_WRAPPER with value /opt/python/adot-instrument. This will enable auto-instrumentation. Click on “Save.”
Figure 13: Lambda console for adding environment variable AWS_LAMBDA_EXEC_WRAPPER
Also, make sure that “Active tracing” is enabled under “Monitoring and operations tools.”
Figure 14: Lambda console for enabling active tracing
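The console steps above (attach the layer, set AWS_LAMBDA_EXEC_WRAPPER, enable active tracing) can also be expressed as a single configuration update. The sketch below only builds the request parameters, using the parameter names of the Lambda UpdateFunctionConfiguration API. Sending them with an SDK such as boto3 is an assumption and not part of the original walkthrough, and the function name in the example is a placeholder.

```python
def build_instrumentation_config(function_name, layer_arn, existing_env=None):
    """Build UpdateFunctionConfiguration parameters that mirror the console steps."""
    # Start from the existing variables: the update replaces the whole set,
    # so anything left out here would be dropped from the function.
    variables = dict(existing_env or {})
    variables["AWS_LAMBDA_EXEC_WRAPPER"] = "/opt/python/adot-instrument"
    return {
        "FunctionName": function_name,
        "Layers": [layer_arn],  # also replaces the function's current layer list
        "Environment": {"Variables": variables},
        "TracingConfig": {"Mode": "Active"},  # same as enabling "Active tracing"
    }


config = build_instrumentation_config(
    "my-function",  # placeholder function name
    "arn:aws:lambda:us-west-2:XXXXXXX:layer:aws-distro-for-opentelemetry-python-38-preview:1",
    existing_env={"LOG_LEVEL": "INFO"},
)
print(config)

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("lambda").update_function_configuration(**config)
```

Because the environment variables and layers are replaced as a whole, pass in the function's current values when you want to keep them.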
By default, AWS Distro for OpenTelemetry exports telemetry data to AWS X-Ray and CloudWatch. For the latter, go to the CloudWatch console and click on “Traces.”
Figure 15: CloudWatch dashboard with traces
To retrieve information about specific traces, click any of the Lambda functions and then the trace ID.
Figure 16: CloudWatch dashboard with specific trace
And to drill down even further, go to the X-Ray console and click on “Analytics.”
Figure 17: AWS X-Ray with specific trace
Wrapping Up
OpenTelemetry is still an evolving project, and with the launch of products like AWS Distro for OpenTelemetry, fully backed by AWS, it’s heading toward stability. Currently, AWS Distro for OpenTelemetry only supports Python for Lambda, but other languages (Node.js, Java, Go, .NET) will be coming soon. Also, you need to create your own Lambda layer manually in the current state, but in the future, AWS will automate and manage this process.
Epsagon is tightly integrated with AWS and provides full visibility into how your serverless application is performing. Onboarding your new or existing application is straightforward and doesn’t require any complex configuration. It also provides a visualization dashboard that helps detect bottlenecks, shows overall system health, predicts the overall cost, and offers other helpful insights based on collected data and metrics. Another advantage is that Epsagon correlates all the aggregated data, which is vital in distributed architectures using AWS Lambda and other serverless services. Plus, Epsagon includes auto-instrumentation for languages like Python, Go, Java, Ruby, Node.js, PHP, and .NET, reducing the time it takes to instrument tracing. Check out our demo environment or try it for free for up to 10 million traces per month!