Performance profiling of Windows Apps. Better alternatives for Visual Studio Profiler?

Question

I am impressed with Visual Studio Profiler for performance analysis. Fast for my purposes and easy to use.

I am curious about the caveats of the Visual Studio Profiler. Are there profilers for Windows applications that handle these caveats better?

Caveat #1 for me: The price tag on Visual Studio editions that come with a profiler! :) — Adrian Grigore, Aug 26 '10 at 06:24
See my answer on http://stackoverflow.com/questions/2302425/profiling-c-code-on-windows-when-using-eclipse for a list of Windows profilers. — Patrick, Aug 26 '10 at 06:56
@Adrian Grigore: I agree. :-) @Patrick: I have used Very Sleepy. Yes, it is quite handy, but it only does sampling. — rpattabi, Aug 26 '10 at 07:01
Ditto Patrick's comment, and when you say "only does sampling", sampling (of the stack, on wall-clock time, not CPU time, under control of the user, and reporting per line, not function, percent of time, not actual time or counts) is the **right** thing to do, in spite of all the noise out there, much of it coming from Redmond. — Mike Dunlavey, Aug 26 '10 at 15:23

Mike Dunlavey · Answer 1 · 2010-08-26T16:47:17.570

On the positive side, nobody makes great apps like Microsoft. Visual Studio is a fine product, and its profiler shares those attributes.

On the other hand, there are caveats (shared by other profilers as well).

In sampling mode, it doesn't sample when the thread is blocked. Therefore it is blind to extraneous I/O, socket calls, etc. This attribute dates from the early days of prof and gprof, which started out as program counter (PC) samplers; since the PC is meaningless while a thread is blocked, sampling was turned off then. The PC may be meaningless, but the stack tells exactly why the thread is blocked and, when a lot of time is going into that, you need to know it.
In instrumentation mode, it can include I/O, but it only gives you function-level percent of time, not line level. That may be OK if functions happen to be small, or if they only call each other in a small number of places, so finding call sites is not too hard. I work with good programmers, but our code is not all like that. In fact, often the call sites are invisible, because they are compiler-inserted. On the other hand, stack samples pinpoint those calls no matter who wrote them.

The profiler does a nice job of showing you the split between activity of different threads. Then what you need to know is, if a thread is suspended or showing a low processor activity, is that because it is blocking for something that it doesn't really have to? Stack samples could tell you that if they could be taken during blocking. On the other hand, if a thread is cranking heavily, do you know if what it is doing is actually necessary or could be reduced? Stack samples will tell you that also.
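To make the idea of wall-clock stack sampling concrete, here is a minimal sketch in Python rather than anything tied to Visual Studio. It assumes CPython (it relies on `sys._current_frames()`), and every function name in it (`sample_stacks`, `percent_of_samples_by_line`, `blocked_wait`, `busy_work`, `workload`, `profile`) is made up for illustration. A background thread records the target thread's whole stack at a fixed wall-clock interval, whether or not that thread is blocked, and reports the percent of samples in which each line appears.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(target_thread_id, interval_s, stop_event, samples):
    """Record the target thread's full stack every interval_s of wall-clock time."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            # Keep (function name, line number) for every frame on the stack,
            # so call sites are visible, not just the innermost function.
            stack = tuple((f.name, f.lineno)
                          for f in traceback.extract_stack(frame))
            samples.append(stack)
        time.sleep(interval_s)


def percent_of_samples_by_line(samples):
    """Percent of samples in which each (function, line) is on the stack."""
    counts = collections.Counter()
    for stack in samples:
        for site in set(stack):  # count a line once per sample
            counts[site] += 1
    total = len(samples) or 1
    return {site: 100.0 * n / total for site, n in counts.items()}


def blocked_wait():
    time.sleep(0.3)  # stands in for blocking I/O; uses almost no CPU


def busy_work():
    deadline = time.perf_counter() + 0.1
    while time.perf_counter() < deadline:
        pass  # spins the CPU for about 0.1 s


def workload():
    blocked_wait()
    busy_work()


def profile(fn, interval_s=0.005):
    """Run fn on the calling thread while a sampler thread records its stack."""
    samples = []
    stop = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval_s, stop, samples),
    )
    sampler.start()
    try:
        fn()
    finally:
        stop.set()
        sampler.join()
    return samples


if __name__ == "__main__":
    report = percent_of_samples_by_line(profile(workload))
    for (name, line), pct in sorted(report.items(), key=lambda kv: -kv[1]):
        print(f"{pct:5.1f}%  {name}:{line}")
```

In this sketch the call to `blocked_wait()` inside `workload` should show up on most samples even though it burns almost no CPU, which is the information a sampler that skips blocked threads would not give you.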

Many people think the primary job of a profiler is to measure. Personally, I want something that pinpoints code that costs a lot of time and can be done more efficiently. Most of the time these are function call sites, not "hot spots". I don't need to know "a lot of time" with any precision. If I know it is, say, 60% +/- 20%, that's perfectly fine with me because I'm looking for the problem, not the measurement. If, because of this imprecision, I fix a problem which is not the largest, that's OK, because when I repeat the process, the largest problem will be even bigger, as a percent, so I won't miss it.
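The arithmetic behind both points can be sketched in a few lines. This is an illustration under my own assumptions, not part of the answer: it treats each stack sample as an independent yes/no observation (a simple binomial model), and the problem names and costs are invented.

```python
import math


def share_and_error(hits, samples):
    """Estimated time share of a line and its binomial standard error."""
    p = hits / samples
    return p, math.sqrt(p * (1 - p) / samples)


def shares_after_fix(costs, fixed):
    """Time share of each remaining item once the 'fixed' item is removed."""
    remaining = {k: v for k, v in costs.items() if k != fixed}
    total = sum(remaining.values())
    return {k: v / total for k, v in remaining.items()}


if __name__ == "__main__":
    # A line seen on 6 of only 10 samples: a rough estimate, but it still
    # identifies the line as something worth looking at.
    p, err = share_and_error(6, 10)
    print(f"estimated share {p:.0%} +/- {err:.0%}")

    # Hypothetical costs in seconds: A is the largest problem, B the second.
    costs = {"A": 50, "B": 25, "everything else": 25}
    # Suppose B gets fixed first by mistake; A's share of what is left grows.
    print(shares_after_fix(costs, "B"))
```

With these invented numbers, removing B shrinks the run from 100 to 75 seconds, so A goes from 50% to about 67% of the total and is harder to miss on the next round of samples.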

Oh, there I see the seed for a flame. "nobody makes great apps like Microsoft"?? — rpattabi, Aug 29 '10 at 02:15
@ragu.pattabi: I don't consider it a flame. (I'm flame-capable.) I do think Microsoft makes great apps. At the same time they are not immune to widely-held misconceptions, at least on the subject of performance. Here is a list of those misconceptions, and a very clear positive statement of how to overcome them: http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343 — Mike Dunlavey, Aug 29 '10 at 16:35
@ragu.pattabi: If I can put it in a nutshell, the common conception is that performance problems to be optimized consist of code where the program counter spends a lot of time, and clever detective work is needed to find them. Wrong on both counts. The larger software is, the more likely the code to prune is branches of the call tree, not leaves, and wall-time stack sampling is such an easy way to find them it requires no cleverness at all. — Mike Dunlavey, Aug 29 '10 at 16:48
