2

So we have a setup of nginx ingress controller as reverse proxy for a django based backend app in production (GKE k8s cluster). We have used opentelemetry to trace this entire stack(Signoz being the actual tool). One of our most critical api is validate-cart. And we have observed that this api sometimes take a lotta time, like 10-20 seconds and even more. But if we look at the trace of one of such request in Signoz, the actual backend takes very less time like 100ms but the total trace starting from nginx shows 29+ seconds. As you can see from the screen shot attached.

enter image description here

And looking at the p99 latency, the nginx service has way bigger spikes than the order. This graph is populated for the same validate-cart api.

enter image description here

Have been banging my head around this for quite some time and i am still stuck. I am assuimg there might be some case of request being queued at either nginx or django layer. But i am trusting otel libraries that are used to trace django, to start the trace the moment it hit django layer and since there isn't a big latency at django layer, issue might be at nginx layer. I have traced nginx using the third party module open-tracing since ingress-nginx doesn't yet support opentelemetry where more tracing information about nginx layer is provided.

Nginx opentracing

Dipendra bhatt
  • 729
  • 5
  • 14

0 Answers0