Exploring the eBPF-based OpenTelemetry Auto-Instrumentation Library for Go
I came across opentelemetry-go-instrumentation, a library that enables automatic OpenTelemetry instrumentation for Go by leveraging eBPF. In this post, I'll run it and take a brief look at how it works.
opentelemetry-go-instrumentation is a work in progress (v0.8.0-alpha at the time of writing), so note that it may have changed significantly since this article was written.

What is Auto-Instrumentation?
In distributed tracing centered on OpenTelemetry, you need to instrument your application to propagate Context and emit Metrics, Logs, and Traces. This work requires modifying application code and is tedious. To automate this instrumentation, that is, to achieve it without modifying application code, solutions have been implemented for languages like Java [1].
However, unlike Java or Python, Go compiles natively to machine code, so code cannot be injected at runtime. I had assumed that auto-instrumentation for Go would therefore be difficult [2].
The opentelemetry-go-instrumentation library introduced here attempts to achieve auto-instrumentation for Go using eBPF.
What is eBPF?
eBPF is a Linux technology for safely running user-defined programs in a sandboxed environment in kernel space. Research and development are active in fields such as Observability and Tracing, with Networking at the center. I've briefly introduced some papers I've read on the topic elsewhere, so feel free to check them out.
Personally, I associate eBPF primarily with container networking, especially around Cilium.
In opentelemetry-go-instrumentation, eBPF is used to attach to the running process’s code and variables.
Running It
Before diving deeper, let's run it by following the getting-started guide [3].
Preparation
Create a kind k8s cluster & load the image:
$ kind create cluster --name=otel-go-inst
$ make docker-build
$ kind load docker-image otel-go-instrumentation --name=otel-go-inst
Deploy the application:
$ kubectl apply -k docs/getting-started/emojivoto/
namespace/emojivoto created
serviceaccount/emoji created
serviceaccount/voting created
serviceaccount/web created
service/emoji-svc created
service/voting-svc created
service/web-svc created
deployment.apps/emoji created
deployment.apps/vote-bot created
deployment.apps/voting created
deployment.apps/web created
Deploy Jaeger:
$ kubectl apply -f docs/getting-started/jaeger.yaml -n emojivoto
deployment.apps/jaeger created
service/jaeger created
$ kubectl port-forward svc/jaeger 16686:16686 -n emojivoto
Before Instrumentation
Let’s take a look at the application before instrumentation:
$ kubectl port-forward svc/web-svc 8080:80 -n emojivoto

It appears to be an emoji voting application. Even after sending some requests, no Traces are visible in Jaeger.

After Instrumentation
Deploy the instrumented version of the application:
$ kubectl apply -f docs/getting-started/emojivoto-instrumented.yaml -n emojivoto
deployment.apps/emoji configured
deployment.apps/voting configured
deployment.apps/web configured
The instrumentation container definition is in the repository. It works by running with elevated privileges in the same Pod as the target container, sharing the process namespace [4]. Only a container definition is added; no changes are made to the application code or image.
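Conceptually, the instrumented Deployment adds an agent sidecar along the following lines. This is a sketch of the idea, not the actual manifest from the repository: the image name, environment variable name, and binary path below are illustrative assumptions.

```yaml
# Sketch only: an agent sidecar with elevated privileges shares the
# process namespace with the unmodified application container.
spec:
  template:
    spec:
      shareProcessNamespace: true          # agent can see the app's PIDs
      containers:
        - name: web
          image: example/emojivoto-web     # unchanged application image
        - name: go-auto-instrumentation
          image: otel-go-instrumentation   # image built earlier with `make docker-build`
          securityContext:
            privileged: true               # needed to load eBPF programs
            runAsUser: 0
          env:
            - name: OTEL_TARGET_EXE        # assumed name: path of the binary to instrument
              value: /app/web
```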
After interacting with the application again, let's check Jaeger [5]. At this point, I noticed the application was getting connection refused errors from Jaeger:
2023/11/30 15:41:05 traces export: Post "http://jaeger:4318/v1/traces": dial tcp 10.96.187.248:4318: connect: connection refused
Changing the image from jaegertracing/opentelemetry-all-in-one to jaegertracing/all-in-one fixed the issue, so I submitted a PR upstream.


With just this, we can see quite detailed Traces. The application did feel a bit sluggish, though that may just be my imagination; I'd need to measure properly to know for sure.
How It Works
Let's read through the documentation [6].
The operations that require instrumentation can be broadly divided into three:
- Read and write SpanContext [7] from HTTP/gRPC requests and responses
- Create Spans
- Store SpanContext in eBPF Maps
The eBPF program analyzes the stack and CPU registers to access user code and variables. To read and write SpanContext from structures like http.Request, it needs to know the offset of that field within the structure. However, offsets change whenever the structure definition is modified. offsets-tracker analyzes these offsets and saves the information in JSON files organized by version and structure.
In step 1, the offsets recorded by offsets-tracker are used to read and write SpanContext from structures like http.Request and grpc.ClientConn.
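To make this concrete, the version-indexed offsets data can be imagined as JSON like the following. The schema, field names, and offset values here are simplified assumptions for illustration, not the library's actual file format:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified schema: struct name -> field name -> Go/library
// version -> byte offset of the field. The real offsets-tracker output differs;
// this only illustrates the lookup idea.
var offsetsJSON = []byte(`{
  "net/http.Request": {
    "Header": { "1.20.0": 56, "1.21.0": 56 }
  }
}`)

type offsets map[string]map[string]map[string]uint64

// lookupOffset returns the recorded byte offset of strct.field for the given
// version, and whether it was found.
func lookupOffset(data []byte, strct, field, version string) (uint64, bool) {
	var o offsets
	if err := json.Unmarshal(data, &o); err != nil {
		return 0, false
	}
	off, ok := o[strct][field][version]
	return off, ok
}

func main() {
	off, ok := lookupOffset(offsetsJSON, "net/http.Request", "Header", "1.21.0")
	fmt.Println(off, ok)
}
```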
In step 2, Spans are automatically created at appropriate points. For example, a Span is created when sending a gRPC request within an HTTP Server handler. This library also supports manually created spans. In that case, it updates the SpanContext.
In step 3, SpanContext is stored in eBPF Maps so that it can be used at other points within the same goroutine. For example, the SpanContext from a received HTTP request is stored in an eBPF Map and retrieved when sending the HTTP response. The stored SpanContext needs to be updated when a new SpanContext is read in step 1 or when the current Span changes in step 2.
In the current implementation, the eBPF Map uses the goroutine ID as the key and the SpanContext as the value. Therefore, sharing SpanContext across multiple goroutines is difficult. In the future, they are considering tracking the tree-structured dependencies between goroutines.
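As a plain-Go illustration of this design, here is a sketch of a store keyed by goroutine ID. All names here are mine, and the real map lives in kernel space and is accessed from eBPF programs, not from Go:

```go
package main

import (
	"fmt"
	"sync"
)

// SpanContext carries the fields mentioned in the article (TraceID, SpanID, etc.).
type SpanContext struct {
	TraceID string
	SpanID  string
}

// contextStore mimics the eBPF map described above in userspace Go:
// goroutine ID -> current SpanContext for that goroutine.
type contextStore struct {
	mu sync.Mutex
	m  map[uint64]SpanContext
}

func newContextStore() *contextStore {
	return &contextStore{m: make(map[uint64]SpanContext)}
}

// set records the current SpanContext for a goroutine, e.g. when a request
// with tracing headers is received.
func (s *contextStore) set(goid uint64, sc SpanContext) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[goid] = sc
}

// get retrieves the SpanContext later in the same goroutine, e.g. when the
// response is written or an outgoing call is made.
func (s *contextStore) get(goid uint64) (SpanContext, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	sc, ok := s.m[goid]
	return sc, ok
}

func main() {
	store := newContextStore()
	store.set(42, SpanContext{TraceID: "abc", SpanID: "def"})
	sc, _ := store.get(42)
	fmt.Println(sc.TraceID)
}
```

Because the key is a single goroutine ID, a SpanContext stored by one goroutine is invisible to goroutines it spawns, which is exactly the cross-goroutine limitation described above.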
Additionally, timestamps need to be captured at Span start and end. uretprobes, which invoke eBPF code when a function returns, reportedly don't work well with Go; the Go runtime can grow and move goroutine stacks, which conflicts with the return-address rewriting that uretprobes rely on. Instead, return instructions are detected and uprobes are placed just before them to call the eBPF code that collects the end timestamp.
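The "find return sites, place uprobes just before them" idea can be sketched as follows. This is a grossly simplified toy: on x86-64 a RET instruction is encoded as 0xc3, but a real implementation must properly disassemble the function body, since 0xc3 can also appear inside multi-byte instructions:

```go
package main

import "fmt"

// returnOffsets naively scans a function's machine code for the x86-64 RET
// opcode (0xc3) and returns the byte offsets of each occurrence. A uprobe
// would then be attached just before each return site. Toy illustration only:
// real code needs a disassembler to avoid false matches inside instructions.
func returnOffsets(code []byte) []int {
	var offs []int
	for i, b := range code {
		if b == 0xc3 {
			offs = append(offs, i)
		}
	}
	return offs
}

func main() {
	// Fake "function body" containing two return instructions.
	code := []byte{0x48, 0x89, 0xe5, 0xc3, 0x90, 0xc3}
	fmt.Println(returnOffsets(code))
}
```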
Conclusion
The documentation was well-organized and very easy to research. This time I didn’t read through the implementation in detail, so I’d like to investigate further next time.
I think it would be extremely useful once development progresses and it can be used in more situations. On the other hand, I’m concerned about the brute-force nature of the approach, potential performance overhead, and the security implications of granting container privileges. I’ll continue following this project.
[1] https://github.com/open-telemetry/opentelemetry-java-instrumentation
[2] Auto-instrumentation was also a challenge for PiCoP, which I was developing at the time.
[3] https://github.com/open-telemetry/opentelemetry-go-instrumentation/tree/v0.8.0-alpha/docs/getting-started
[4] https://kubernetes.io/ja/docs/tasks/configure-pod-container/share-process-namespace/
[5] A bot that votes randomly was also deployed, so I didn't actually need to interact with the app manually.
[6] https://github.com/open-telemetry/opentelemetry-go-instrumentation/tree/v0.8.0-alpha/docs
[7] Contains TraceID, SpanID, etc.
