I am looking for a low-impact, OS-independent profiler for C++ code.

When I say low impact, I mean something less intrusive than Valgrind. I plan to use it in a MIPS-based embedded environment (hence the OS-independence). I tried a ported version of Valgrind and it completely changed the performance characteristics (way too much Heisenberg principle at work), so I can't go that route. We know the memory bus speed is a bottleneck, which most likely explains why Valgrind was so intrusive.

I have created a home-grown profiler based on checkpoints that lets me measure certain parts of the code. Basically, I modify the code (and recompile) to set checkpoints in strategic places. When executed, it stores the number of times each checkpoint is hit and the time elapsed since the last checkpoint was hit. After a run, I can dump the checkpoints, and for each one it reports: num-hits, max-time, min-time, avg-time, etc.
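
Roughly, a stripped-down version of the idea looks like this (an illustrative sketch only, not my actual code; the checkpoint IDs are made up, and `std::chrono` stands in for whatever low-overhead timer the target platform provides):

```cpp
#include <chrono>
#include <cstdio>

// Checkpoint IDs are compile-time constants so that recording a hit is
// just an array index, which keeps the per-hit overhead low.
enum CheckpointId { CP_PARSE, CP_ENCODE, CP_SEND, CP_COUNT };
static const char* const kNames[CP_COUNT] = { "parse", "encode", "send" };

struct CheckpointStats {
    unsigned long hits;
    double minUs, maxUs, totalUs;
};

static CheckpointStats g_stats[CP_COUNT];  // zero-initialized (static storage)
static std::chrono::steady_clock::time_point g_last =
    std::chrono::steady_clock::now();

// Called at each strategic point in the code; records the time elapsed
// since the previous checkpoint was hit, wherever that was.
inline void checkpoint(CheckpointId id) {
    auto now = std::chrono::steady_clock::now();
    double us = std::chrono::duration<double, std::micro>(now - g_last).count();
    g_last = now;
    CheckpointStats& s = g_stats[id];
    if (s.hits == 0 || us < s.minUs) s.minUs = us;
    if (us > s.maxUs) s.maxUs = us;
    s.totalUs += us;
    ++s.hits;
}

// Dump num-hits, max-time, min-time and avg-time for every checkpoint.
inline void dumpCheckpoints() {
    for (int i = 0; i < CP_COUNT; ++i) {
        const CheckpointStats& s = g_stats[i];
        std::printf("%-8s hits=%lu min=%.1fus max=%.1fus avg=%.1fus\n",
                    kNames[i], s.hits, s.minUs, s.maxUs,
                    s.hits ? s.totalUs / s.hits : 0.0);
    }
}
```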

This profiler (I called it LowImpactProfiler) works ok, but I wonder if there is something better out there.

I've considered OProfile, which is a sampling profiler, but since I'm not running Linux, I think it would be really difficult to get working.

Brady
  • If the goal is to find out what in the code is causing slowness and might be improved to get better performance, *[you could give this a try](http://stackoverflow.com/questions/375913/what-can-i-use-to-profile-c-code-in-linux/378024#378024)*. – Mike Dunlavey May 03 '12 at 15:11
  • @Brady: *Heisenburger*? Did not know about this one! – Matthieu M. May 03 '12 at 15:38
  • @Matthieu: Never heard of that? It's a quantum hamburger. – Mike Dunlavey May 03 '12 at 15:43
  • @Matthieu, sorry, it's Heisenburg ;) I think it comes mainly from chemistry and states basically that you can't observe something (electrons) without affecting it. In this context it's hard to profile code without affecting its performance. – Brady May 03 '12 at 15:46
  • @Brady: I was making a pun ;) – Matthieu M. May 03 '12 at 16:00
  • @Matthieu, ya I got it :) I did it again in my comment, it's Heisenberg. – Brady May 03 '12 at 16:12
  • @Brady: and the "type" of bug is *Heisenbug*. The worst kind... – Matthieu M. May 03 '12 at 16:20
  • @Brady: *Heisenbug* is often brought up as the *bête noire* of profiling, but it need not be an issue. You can distinguish between *measuring* performance and *finding* performance problems. Don't think of measuring as the method of finding. Finding does not require measuring. A very small number of samples gives you very imprecise measurement, but at the same time it gives a very good indication of what to fix. – Mike Dunlavey May 03 '12 at 16:50
  • @Brady: If I can just expand on that a bit. Suppose you take 6 random-time samples, and 3 of them show a problem you know you could fix. The estimated savings is about 0.5, but it could be a lot less, obviously. In fact it's a beta(4,4) distribution. There is a 3% chance the savings is less than 0.2, and a 3% chance it is greater than 0.8. This means on average the speedup will be 2x, but there's a 94% chance it is between 1.25x and 5x. I'll take those odds, and meanwhile the samples told you *precisely* what the problem is. – Mike Dunlavey May 03 '12 at 19:15
  • @MikeDunlavey I prefer to use Amdahl's law to determine the possible gain I could achieve (see the worked example after this comment thread): http://en.wikipedia.org/wiki/Amdahls_law If I know a certain part of the code is slow, and I know the overall percentage of time it's consuming, then with this law I can get a general idea of how much theoretical gain (best case) to expect by improving that part. This also helps to determine whether it's worth trying to improve certain areas. – Brady May 04 '12 at 14:03
  • @Brady: All Amdahl's law says, with various algebra, is speedup S = 1/(1-X), where X is the overall fraction of time saved. It says nothing about the uncertainty of X as a function of the number of samples. But the real point is that you are looking for a certain part or area of the code that is slow, so a profiler will miss causes of slowness that are not revealed in the time taken by particular functions. And if you dismiss those, they will end up being dominant. Samples that are actually examined and understood will find them. – Mike Dunlavey May 04 '12 at 15:31
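
To make the Amdahl's-law arithmetic from the comments concrete, here is a small worked example (illustrative numbers only; `amdahlSpeedup` is a hypothetical helper, not part of any tool discussed here):

```cpp
#include <cstdio>

// Amdahl's law: if a fraction p of total runtime is spent in the part
// being improved, and that part is made s times faster, the overall
// speedup is 1 / ((1 - p) + p / s). As s grows without bound, this
// approaches the best case 1 / (1 - p) quoted in the comments above.
double amdahlSpeedup(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}

int main() {
    // Suppose a checkpoint shows a region consuming 30% of runtime.
    std::printf("region 2x faster:  %.2fx overall\n", amdahlSpeedup(0.30, 2.0));   // ~1.18x
    std::printf("region 10x faster: %.2fx overall\n", amdahlSpeedup(0.30, 10.0));  // ~1.37x
    std::printf("best case:         %.2fx overall\n", 1.0 / (1.0 - 0.30));         // ~1.43x
    return 0;
}
```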

2 Answers

I've used Shiny to profile on very limited embedded devices with great success. From your description, it takes a similar approach to your LowImpactProfiler.
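
If it helps, usage is macro-based, along these lines (written from memory, so treat the exact macro names as assumptions and check Shiny's headers before relying on them):

```cpp
#include "Shiny.h"  // Shiny's public header

void decodeFrame() {
    PROFILE_FUNC();  // times this function on each call
    // ... work ...
}

void mainLoop() {
    for (;;) {
        decodeFrame();
        PROFILE_UPDATE();  // aggregate the timings collected so far
        PROFILE_OUTPUT();  // print the hit-count/timing table
    }
}
```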

Miguel Grinberg

If you are using Windows, you can try my profiler, described here: http://ravenspoint.wordpress.com/2010/06/16/timing/

It sounds like it might be easier to use than yours, but it is not OS-independent. It uses calls to QueryPerformanceCounter(), which is a Windows API. It is open source, so it might be worthwhile to port it to your OS, using whatever high-performance timer is available there.
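
For what it's worth, such a port would mostly come down to swapping the clock behind one small abstraction, something like this (a sketch assuming a POSIX-like target; not taken from the actual ravenspoint code):

```cpp
// The only OS dependency here is a monotonic high-resolution clock,
// so a port can hide the choice behind two small functions.
#ifdef _WIN32
#include <windows.h>
inline long long nowTicks() {
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return t.QuadPart;
}
inline long long ticksPerSecond() {
    LARGE_INTEGER f;
    QueryPerformanceFrequency(&f);
    return f.QuadPart;
}
#else
#include <time.h>
inline long long nowTicks() {
    timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);  // monotonic: unaffected by wall-clock changes
    return t.tv_sec * 1000000000LL + t.tv_nsec;
}
inline long long ticksPerSecond() { return 1000000000LL; }
#endif
```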

ravenspoint
  • Thanks, but we won't be using Windows, it's a telecom app and if anything, it may someday be ported to a Unix platform. I'll take a look anyway. – Brady May 03 '12 at 15:57
  • +1, I like the API that this uses, it's *very* similar to my checkpoints. I'll look at the code and see about porting/merging it with mine. I'll prepare mine and put it on GitHub next week for you to look at. I also have scoped checkpoints, or can set individual checkpoints without needing to be in scopes. When I submit my code, I'll post a msg here and/or send you a msg on your website. Thanks! – Brady May 05 '12 at 11:16