Java Profilers That Display Per Request Statistics and Program Flow

Question

I'm looking for profilers that support per-request profiling statistics, ideally along the program's flow (not the usual thread call stack). So basically a profiler call stack plus a sequential calls view for each single request, something like this:

doGet                 100ms
+ doFilter             95ms
  + doFilter2          90ms
    + validateValues   20ms
    + calculateX       40ms
      + calc1          10ms
      + calc2          30ms
    + renderResponse   30ms

Which classes and methods are profiled would be configured in some way; a tracing profiler that processes every method call is, of course, not usable for this.

I know of and have used dynaTrace; its "PurePath" feature (http://www.dynatrace.com/en/architecture-tame-complexity-with-purepath.aspx) supports this, but I'm looking for tools that are usable in smaller projects and require less initial investment and setup.

Does any "classical" profiler (YourKit, etc.) support this, and have I just overlooked the feature?

Addendum: To supply some background: the main goal is to have statistics for the monitoring and analysis of a system in production. First and foremost, the idea is to get live statistics of how long requests take and, if response times go up, to have data for certain (types of) requests (think JETM + x).

Per-request profiling statistics allow for a detailed analysis of why just some requests are slow, e.g. if 10% of the requests take ten times as long as your average. With aggregated statistics this is AFAIK very hard to solve.

The same goes for profiling statistics that render the calls along the program flow: it is easy to identify where in the request the problem lies. For example, if a method performs ten DB queries, you see each call individually and not just ten aggregated calls.

Ideally, the measurement points can be configured and enabled or disabled at run time.

This approach is rarely supported because it is prone to measurement errors and has quite a high overhead. Furthermore, it simply does not tell you the truth: a very fast call might thrash caches, prevent garbage collection of vast amounts of objects, or force slow-path synchronization in other calls. What is it that you really want to achieve? – jmg, May 31 '11 at 09:22
I would break the problem into two parts, overall bandwidth (number of transactions per time unit), and stack sampling to see which routines and/or lines of code have high inclusive time, on a percent basis, not absolute time. Performance is not affected much, because you don't need high-frequency sampling. [Here's an explanation.](http://stackoverflow.com/questions/5525758/function-profiling-woes-visual-studio-2010-ultimate/5560023#5560023) – Mike Dunlavey, May 31 '11 at 21:26

score 1 · Answer 1 · answered May 31 '11 at 08:09

If your application is timing milliseconds, you could just keep a map of time to stage (a TreeMap), which you can summarize and write to a file. This is the most flexible approach and is fine for millisecond timings.
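A minimal sketch of the map-based idea, under a few assumptions: the class name `RequestTimeline`, its methods, and the stage names are hypothetical, and the summary is returned as a string rather than written to a file.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not from the answer): records when each named stage was reached.
class RequestTimeline {
    // Key: timestamp in milliseconds, value: stage name. A TreeMap keeps entries sorted by time.
    private final TreeMap<Long, String> stages = new TreeMap<>();

    // Records a stage at an explicit timestamp (handy for testing).
    void mark(long timeMillis, String stage) {
        // Two stages can fall into the same millisecond; keep both names in that case.
        stages.merge(timeMillis, stage, (a, b) -> a + "," + b);
    }

    // Records a stage at the current wall-clock time.
    void mark(String stage) {
        mark(System.currentTimeMillis(), stage);
    }

    // One line per stage: its name and the milliseconds elapsed since the previous stage.
    String summarize() {
        StringBuilder sb = new StringBuilder();
        Long previous = null;
        for (Map.Entry<Long, String> e : stages.entrySet()) {
            long delta = (previous == null) ? 0L : e.getKey() - previous;
            sb.append(e.getValue()).append(' ').append(delta).append("ms\n");
            previous = e.getKey();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        RequestTimeline timeline = new RequestTimeline();
        timeline.mark("start");
        timeline.mark("validateValues");
        timeline.mark("calculateX");
        System.out.print(timeline.summarize());
    }
}
```

One timeline object per request would be needed; how it is passed along the request (parameter, request attribute, thread-local) is left open here.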

For microsecond timings, I have an enum value for each stage and record the current time (System.nanoTime()) in a ThreadLocal array when that stage is reached (no object allocation). When the request is finished, write the timing deltas to a file, e.g. in CSV format.
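The enum-plus-ThreadLocal variant might look roughly like the following. This is only a sketch: the class `StageTimer`, the stage names (borrowed from the question's example), and the CSV layout are assumptions, and it presumes every stage is reached, in declaration order, on a single thread per request.

```java
import java.util.Arrays;

// Hypothetical sketch (not the answer's actual code).
class StageTimer {
    // Illustrative stage names; a real application would define its own.
    enum Stage { START, VALIDATE_VALUES, CALCULATE_X, RENDER_RESPONSE }

    private static final int STAGE_COUNT = Stage.values().length;

    // One preallocated long[] per thread, so recording a stage allocates nothing.
    private static final ThreadLocal<long[]> TIMES =
            ThreadLocal.withInitial(() -> new long[STAGE_COUNT]);

    // Stores the current nanoTime in the slot belonging to the given stage.
    static void reached(Stage stage) {
        TIMES.get()[stage.ordinal()] = System.nanoTime();
    }

    // Returns one CSV line of microsecond deltas between consecutive stages
    // and resets the slots for the next request on this thread.
    static String finishAsCsv() {
        long[] t = TIMES.get();
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i < t.length; i++) {
            if (i > 1) {
                sb.append(',');
            }
            sb.append((t[i] - t[i - 1]) / 1_000);
        }
        Arrays.fill(t, 0L);
        return sb.toString();
    }

    public static void main(String[] args) {
        for (Stage stage : Stage.values()) {
            reached(stage); // in real code these calls sit at the stage boundaries
        }
        System.out.println(finishAsCsv());
    }
}
```

The hot path (`reached`) only writes a long into an existing array; the string building happens once per request in `finishAsCsv`, whose result could then be appended to a file.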

Peter Lawrey

I forgot to mention this in my initial question: ideally, some kind of online configuration of the measured methods should be possible, to dig deeper on demand. – Elmar Weber May 31 '11 at 16:32
The only way to do that is via instrumentation. Once you have benchmarked your application a few times you will know where your key stages are. It's not something you should need to change on a regular basis. – Peter Lawrey May 31 '11 at 17:02

score 1 · Answer 2 · answered May 31 '11 at 08:16

My approach is similar to Peter's, but instead of using thread-locals and computing online, I write to the log file when the execution reaches interesting stages. I also used AspectJ to generate the log lines, which I found very convenient for adding and removing log lines at whim without having to change the rest of the source code.
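AspectJ itself needs its weaver and runtime library, so it is not shown here. As a stand-in that runs on the plain JDK, the sketch below uses a `java.lang.reflect.Proxy` to wrap an interface and emit one timing line per call, which is roughly what an around advice would do. The names `TimingProxy`, `Calculator`, and `LOG` are hypothetical, and unlike AspectJ this only works for calls made through an interface that you wrap yourself.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AspectJ timing aspect, built on JDK dynamic proxies.
class TimingProxy {
    // Illustrative service interface; not part of the original answer.
    interface Calculator {
        int calculateX(int input);
    }

    // Collected log lines; a real setup would pass these to a logging framework.
    // Not thread-safe, which is acceptable only for this single-threaded sketch.
    static final List<String> LOG = new ArrayList<>();

    // Wraps target so that every call through the interface is timed and logged.
    static <T> T timed(Class<T> type, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the exception thrown by the target itself
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                LOG.add(method.getName() + " took " + micros + "us");
            }
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler));
    }

    public static void main(String[] args) {
        Calculator calc = timed(Calculator.class, x -> x * 2);
        System.out.println(calc.calculateX(21));
        LOG.forEach(System.out::println);
    }
}
```

The wrapped code stays unaware of the timing, which is the property the answer values; turning measurement on or off then comes down to whether the object is wrapped.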

Miserable Variable

In general, the idea of a custom solution has a nice ring to it, maybe even extending something existing like JETM. The point I'm currently curious about, though, is why there seems to be no existing tool except one. This can mean a) there is no need for it and I don't know of another solution or b) it is not easy to implement. – Elmar Weber May 31 '11 at 16:41

score 1 · Accepted Answer · answered Jun 01 '11 at 07:50

You could try BTrace to do selective measurements. It is somewhat similar to DTrace, which you could also use if you are on a supported platform (Solaris, BSD, OS X).

jmg
