Questions tagged [distributed-tracing]

Distributed Tracing aims to provide better observability into distributed systems and microservices for purposes of performance monitoring and troubleshooting issues.

Distributed Tracing

Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities. Tools that aid in understanding system behavior and reasoning about performance issues are invaluable in such an environment.

Source: Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

How it works in a nutshell

Distributed Tracing works by recording the entry and exit points a request passes through, together with useful intermediate data and metrics, until the final response is served to the requester. Some Distributed Tracing systems collect this information fully automatically, while others require manual instrumentation of the code.

When entering a system, the request is usually assigned a unique Trace ID. This ID is then propagated to any participating systems. Information gathered this way is sent to some sort of backend collecting the data. The collector then aggregates the data via the Trace ID, thus showing the full request as it passed through the distributed system.
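The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular tracing system is implemented: the header name `x-trace-id`, the in-memory `collector`, and the helper functions are all hypothetical names chosen for this example.

```python
import uuid
from collections import defaultdict

# Hypothetical header name, used only in this sketch.
TRACE_HEADER = "x-trace-id"

# Stand-in for a tracing backend: reported spans grouped by trace ID.
collector = defaultdict(list)


def report_span(trace_id, service, operation, duration_ms):
    """Send one span to the (in-memory) collector."""
    collector[trace_id].append(
        {"service": service, "operation": operation, "duration_ms": duration_ms}
    )


def handle_request(service, operation, headers, duration_ms, downstream=()):
    """Reuse the incoming trace ID or assign a new one, record a span,
    and propagate the ID to every downstream call."""
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    report_span(trace_id, service, operation, duration_ms)
    outgoing = {**headers, TRACE_HEADER: trace_id}
    for next_service, next_operation, next_duration in downstream:
        handle_request(next_service, next_operation, outgoing, next_duration)
    return trace_id


def spans_for(trace_id):
    """Return every span the collector has aggregated under one trace ID."""
    return list(collector.get(trace_id, []))


if __name__ == "__main__":
    # A request enters at the gateway and fans out to two services.
    trace_id = handle_request(
        "gateway", "GET /orders", {}, 12.0,
        downstream=[("orders", "list_orders", 7.5), ("billing", "get_invoice", 3.1)],
    )
    for span in spans_for(trace_id):
        print(trace_id, span["service"], span["operation"], span["duration_ms"])
```

Real systems also record a span ID and a parent span ID for each unit of work so that the backend can rebuild the call hierarchy, not just the set of spans that share a trace.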

Metrics commonly included are request time, latency, errors, and status codes, among others.

Open Source implementations

Several Open Source implementations for Distributed Tracing exist:

  • http://opencensus.io

    A single distribution of libraries for metrics and distributed tracing with minimal overhead that allows you to export data to multiple backends.

  • http://opentracing.io

    Vendor-neutral APIs and instrumentation for distributed tracing

  • http://zipkin.io

    Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in microservice architectures. It manages both the collection and lookup of this data.

  • http://www.jaegertracing.io

    Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing system released as open source by Uber Technologies. It is used for monitoring and troubleshooting microservices-based distributed systems.

There is also a W3C working group that aims to standardize context propagation across Distributed Tracing systems.
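As an illustration of what standardized context propagation looks like, the W3C Trace Context format carries the trace in a `traceparent` header of the form `version-traceid-parentid-flags`. The sketch below generates, parses, and forwards such a header. It assumes the version `00` layout only, and the function names are hypothetical helpers for this example, not part of any library.

```python
import re
import secrets

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def new_traceparent(sampled=True):
    """Start a new trace: random trace ID and parent (span) ID."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def parse_traceparent(value):
    """Return the header fields as a dict, or None if the value is invalid."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are defined as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(value):
    """Build the header for an outgoing call: same trace ID, new parent ID.
    Falls back to a fresh trace when the incoming header is invalid."""
    fields = parse_traceparent(value)
    if fields is None:
        return new_traceparent()
    flags = "01" if fields["sampled"] else "00"
    return f"00-{fields['trace_id']}-{secrets.token_hex(8)}-{flags}"


if __name__ == "__main__":
    incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(parse_traceparent(incoming))
    print(child_traceparent(incoming))
```

Restarting the trace when the incoming header is invalid is one possible policy; the companion `tracestate` header, which carries vendor-specific data, is left out of this sketch.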

Because Distributed Tracing is crucial for application performance monitoring (APM), most APM vendors have adopted it in one way or another. Notable APM vendors offering Distributed Tracing include AppDynamics, Dynatrace, Instana, Lightstep, and New Relic.

219 questions
18 votes, 5 answers

How to configure Jaeger with elasticsearch?

I have tried executing this docker command to setup Jaeger Agent and jaeger collector with elasticsearch. sudo docker run \ -p 5775:5775/udp \ -p 6831:6831/udp \ -p 6832:6832/udp \ -p 5778:5778 \ -p 16686:16686 \ -p 14268:14268 \ -e…
11 votes, 0 answers

Logs are not received in Hawkular APM from Zipkin Client

I have client application instrumented with Zipkin library with configuration in spring application.properties . camel.zipkin.host-name=hawkular-apm-server.com camel.zipkin.port=443 camel.zipkin.include-message-body-streams=true Maven dependency …
jack
9 votes, 2 answers

Disable distributed tracing for development

We are setting up microservice framework. We use following stack for distributed tracing. Spring boot Kafka Zipkin Following is how the configuration is done In gradle.build (or pom.xml) following starter dependencies added compile…
8 votes, 1 answer

Difference between Opentracing and W3C Trace Context (with respect to headers)

The W3C trace context defines the traceparent and tracestate headers for enabling distributed tracing. My question(s) is then How is it different from OpenTracing. If W3C has already defined usage of the headers, then is opentracing using some…
Tiju John
6 votes, 0 answers

AWS XRay service map components are disconnected

I'm using open telemetry to export trace information of the following application: A nodejs kafka producer sends messages to input-topic. It uses kafkajs instrumented with opentelemetry-instrumentation-kafkajs library. I'm using the example from…
Majid Azimi
6 votes, 1 answer

Difference between Zipkin and Elastic Stack(ELK)?

Spring Cloud Sleuth is used for creating traceIds (Unique to request across services) and spanId (Same for one unit for work). My idea is that Zipkin server is used to get collective visualization of these logs across service. But I know and have…
6 votes, 1 answer

Advantage of opentracing/jaeger over APM tracing capabilities

I was looking at APM tools. Essentially Dynatrace and I could see that it also provides tracing capabilities that seem to be language agnostic and also without code modifications. Where would jaeger/open tracing be a better option than a tool like…
Vipin Menon
6 votes, 2 answers

Logback MDC on Netty or any other non-blocking IO server

Logback MDC (Mapped Diagnostic Context) is leveraging threadLocal (As far as I know) so that it will be accessible on all the log statements executed by the same thread. My question is, will logback MDC work in the non blocking IO server-side…
so-random-dude
5 votes, 1 answer

Distributed tracing using Jaeger with correct hierarchy

I am new to Jaeger and I would like to use it in order to record traces for my microservices. I create traces from my μservices, providing the traceId publish them as messages and consume them in another service in order to export the trace to…
5 votes, 1 answer

How to tracing a request through a chain of microservices end-to-end?

I am using OpenCensus in Go to push tracing data to Stackdriver for calls involving a chain of 2 or more micro services and I noticed that I get many traces which contain spans only for certain services but not the entire end to end call. At the…
4 votes, 1 answer

in Opentelemetry, not able to get parent span

I am new to OpenTelemetry word. I have created spans for my services separately, but when i am try to combine spans of two different services, using context propogation, I am not able to do it successfully. I have used following code: // at client…
MayurMore
4 votes, 2 answers

Can envoy in istio trace external https api?

We use istio to use distributed tracing. Our microservices sometimes need to hit external APIs, which usually communicate over https. To measure the exact performance of the whole system, we want to trace the communication when hitting an external…
yu saito
4 votes, 1 answer

How to secure the Jaeger UI from a keycloak security proxy (login)

After login to the Keycloak Jaeger(realm) client, the keycloak server doesn't navigate to the Jaeger UI path -> localhost:16686. Request URL:…
3 votes, 1 answer

different trace-id for each api request

My angular SPA application is calling a back end api which in turn can call multiple apis. To see the end-to-end trace, we are using application insights sdk "@microsoft/applicationinsights-web": "^2.5.4" enabling W3C tracing mode. The issue is that…
3 votes, 0 answers

Distributed tracing doesn't work Jaeger+OpenTelemetry

I am trying to implement distributed tracing with basic GO client-server app. Using default Jaeger docker-compose all-in-one. What was done to fix and doesn't help: Changed collector to agent and agent to collector. Checked logs, nothing about…
IvanSpk