I am trying to install OpenTelemetry in our staging environment, with the collector running in DaemonSet mode.
My goal is to create a Java Instrumentation custom resource, which defines the configuration for the OpenTelemetry SDK and instrumentation. Instrumentation is enabled when an Instrumentation CR is present in the cluster and a namespace or workload is annotated (I chose to annotate the namespace, since I wanted to avoid annotating each deployment template). The instrumented workloads export metrics/traces to the collector, which in turn exports the data to the New Relic backend.
Issue: once everything is set up and I specify multiple containers to be injected with java-instrumentation, the injected containers export metrics/traces to New Relic as expected, but any new resource creation in the namespace is blocked, with the error below showing up in events:
Error creating: Pod "<random pod in namespace>" is invalid: spec.containers[0].volumeMounts[13].mountPath: Invalid value: "/otel-auto-instrumentation": must be unique
The steps to reproduce are below.
- Install the operator using the standard process described in the documentation, via Helm charts (https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-operator):
helm install --namespace staging \
opentelemetry-operator open-telemetry/opentelemetry-operator
which creates the opentelemetry-operator deployment, running 1 pod with 3 containers (istio-proxy, manager, kube-rbac-proxy).
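(For completeness: before the install above, the chart repo has to be added, if it isn't already.)
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update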
- In the second step, I install the collector via the config below:
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otlp-collector-daemonset
  namespace: staging
spec:
  mode: daemonset
  hostNetwork: true
  serviceAccount: otel-serviceaccount
  env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      k8sattributes:
        filter:
          node_from_env_var: KUBE_NODE_NAME
          namespace: staging
    exporters:
      otlp:
        endpoint: https://<NR_endpoint>:4317
        headers:
          "api-key": <NR_API_KEY>
      logging:
        verbosity: detailed
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [logging, otlp]
which successfully creates our otlp-collector-daemonset-collector DaemonSet, running 1 pod on each node (each pod running a single container named otc-container).
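To sanity-check this step, I verify the DaemonSet and peek at a collector pod's logs (resource names taken from the manifest above):
kubectl -n staging get opentelemetrycollector otlp-collector-daemonset
kubectl -n staging get daemonset otlp-collector-daemonset-collector
kubectl -n staging logs daemonset/otlp-collector-daemonset-collector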
- Install auto-instrumentation. The instrumentation is enabled when an Instrumentation CR is present in the cluster and a namespace or workload is annotated, so I first install the Java Instrumentation CR as below:
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: java-instrumentation
  namespace: staging
spec:
  exporter:
    endpoint: http://otlp-collector-daemonset-collector:4317
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: always_on
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
kubectl -n staging apply -f manifests/instrumentations/java_instrumentation.yaml
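A quick check that the CR was created:
kubectl -n staging get instrumentation java-instrumentation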
- Now, annotate the staging namespace:
kubectl -n staging patch namespace staging -p '{"metadata":{"annotations":{"instrumentation.opentelemetry.io/inject-java":"true"}}}'
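To confirm the annotation landed (the escaped dots are required by kubectl's jsonpath syntax):
kubectl get namespace staging -o jsonpath='{.metadata.annotations.instrumentation\.opentelemetry\.io/inject-java}'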
When I try to rollout restart the collector DaemonSet (kubectl -n staging rollout restart daemonset otlp-collector-daemonset-collector), it only restarts one pod, not all of them (and the new pod now has two containers: otc-container, plus the init container opentelemetry-auto-instrumentation).
When we rollout restart any deployment in the namespace, it comes up with the new init container opentelemetry-auto-instrumentation (as expected).
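A quick way to list which pods received the injected init container (just a sanity check):
kubectl -n staging get pods -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.initContainers[*].name}{"\n"}{end}'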
Since the auto-instrumentation is injected into the first container of the pod by default, we need to specify the container names to inject the auto-instrumentation into. We do this by annotating the namespace:
kubectl -n staging patch namespace staging -p '{"metadata":{"annotations":{"instrumentation.opentelemetry.io/container-names":"container_2_from_deployment_x,container_2_from_deployment_y"}}}'
Then I rollout restart the two deployments containing these containers; the rollout goes fine, and they also start exporting data to New Relic (as expected).
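To confirm the agent actually landed in the intended container, one can inspect the injected JAVA_TOOL_OPTIONS environment variable (the pod and container names here are the placeholders from the annotation above):
kubectl -n staging get pod <pod_from_deployment_x> -o jsonpath='{.spec.containers[?(@.name=="container_2_from_deployment_x")].env[?(@.name=="JAVA_TOOL_OPTIONS")].value}'
It should contain something like -javaagent:/otel-auto-instrumentation/javaagent.jar.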
However, this is where the problem arises: it blocks any new resource creation in the namespace, with the error
Error creating: Pod "<some random pod in namespace>" is invalid: spec.containers[0].volumeMounts[13].mountPath: Invalid value: "/otel-auto-instrumentation": must be unique
This is fixed by putting only 1 container in the instrumentation.opentelemetry.io/container-names annotation.
I think this might be because the listed containers have to belong to the same pod? If so, what is the workaround? (My goal is to annotate the namespace to auto-instrument all Java applications at once, instead of adding annotations individually to deployment templates.)
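One workaround I am considering (a sketch only, assuming workload-level annotations take precedence over namespace-level ones, as the operator docs suggest): keep inject-java on the namespace, but set container-names on each deployment's pod template, so every pod only lists its own containers.
# deployment_x / deployment_y are placeholder names from the example above
kubectl -n staging patch deployment deployment_x --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"instrumentation.opentelemetry.io/container-names":"container_2_from_deployment_x"}}}}}'
kubectl -n staging patch deployment deployment_y --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"instrumentation.opentelemetry.io/container-names":"container_2_from_deployment_y"}}}}}'
This avoids annotating each deployment with inject-java itself, though it still touches each deployment once for the container names.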