I am trying to do a little wall-time profiling.

GCC adds runtime instrumentation code (e.g. for gprof) when compiling with `-pg`.

I assume it stores that information in some global or thread-local data structure before writing it to `gmon.out`?

Is it possible to read that information (stacktrace) from another thread within the code itself? (If that is so, then I could add my wall-time profiling thread without having to add the instrumentation myself.)

Bernd Elkemann
  • Possible duplicate of [print call stack in C or C++](https://stackoverflow.com/questions/3899870/print-call-stack-in-c-or-c) – rogerdpack Aug 29 '18 at 17:47

1 Answer

gprof does not take stack traces, and it works on CPU-time, not wall-time. It only samples the program counter, on CPU-time, and attributes each sample to the function containing that address. Its main claim to fame, compared to previous profilers, is that since PC-only ("self time") sampling is pretty useless in decent-sized apps where the call stack is many layers deep, it also counts how many times any function A calls any function B. Then it tries to guess (by some pretty shaky math) how much CPU time can be charged back to the higher-level routines that invoke the lower-level routines.
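To make the "shaky math" concrete: as far as I understand it, the charge-back splits a callee's time among its callers in proportion to call counts, which assumes every call costs about the same. A minimal sketch of that idea in C (`charge_callers` and its parameters are hypothetical names for illustration, not gprof's actual code):

```c
#include <stddef.h>

/* Hypothetical helper, not part of gprof: split a callee's measured CPU
 * time among its callers in proportion to how often each one called it.
 * The built-in assumption is that every call costs the same. */
void charge_callers(double callee_seconds,
                    const unsigned long *calls, /* calls[i]: calls made by caller i */
                    double *charged,            /* out: seconds charged to caller i */
                    size_t n_callers)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n_callers; i++)
        total += calls[i];

    for (size_t i = 0; i < n_callers; i++)
        charged[i] = total
            ? callee_seconds * (double)calls[i] / (double)total
            : 0.0;
}
```

If caller A makes one call that really takes 9.9 seconds and caller B makes 99 calls of about 1 ms each, this scheme charges A only about 1% of the roughly 10 seconds, which is the kind of wrong answer the equal-cost assumption can produce.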

There are profilers that take stack traces on wall-time. (CPU-time means if your app is somehow blowing time at a very low level by sleeping, I/O-ing, hanging on a semaphore, or some other blocking, you will never see it.) I know of one that stack-samples on wall-time, namely Zoom. I'm told OProfile can do it, but I can't verify it. Same for DTrace.
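A quick way to see the difference is to time a blocking call with both clocks. The sketch below assumes a POSIX system with `clock_gettime`; `measure_blocking` and `seconds_on` are made-up helper names:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Read the given clock and return its value in seconds. */
static double seconds_on(clockid_t clock)
{
    struct timespec ts;
    clock_gettime(clock, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Block for `ms` milliseconds and report how much wall time and how much
 * CPU time passed while doing so. */
void measure_blocking(long ms, double *wall_s, double *cpu_s)
{
    struct timespec delay = { ms / 1000, (ms % 1000) * 1000000L };
    double wall0 = seconds_on(CLOCK_MONOTONIC);
    double cpu0  = seconds_on(CLOCK_PROCESS_CPUTIME_ID);

    nanosleep(&delay, NULL);   /* stands in for I/O, a semaphore wait, etc. */

    *wall_s = seconds_on(CLOCK_MONOTONIC) - wall0;
    *cpu_s  = seconds_on(CLOCK_PROCESS_CPUTIME_ID) - cpu0;
}
```

Calling `measure_blocking(200, &wall, &cpu)` should report about 0.2 s of wall time and close to zero CPU time, which is why a CPU-time sampler never lands in the sleep.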

But that's just talking about the front end, the taking of samples.

Just as important is the back end, the part that presents stuff to you. Typically you get "hot paths", "call graphs", "flame graphs", etc. etc.

Personally, I take a jaundiced view of all these spiffy toys. What they do, they do well, no question. But if speedup results are what is needed, then the best information comes from a small number of stack samples, taken at the time you care about, that are actually looked at and understood, not just summarized.

There is no summarizer that recognizes patterns better than the head of a programmer, and any problem big enough to be worth fixing will be evident in a small number of samples.
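If you want a handful of wall-time stack samples from inside the program itself, one possible approach is a real-time interval timer whose signal handler dumps a backtrace. This is only a sketch, assuming Linux with glibc: `backtrace()` and `backtrace_symbols_fd()` are glibc extensions that are not formally async-signal-safe, the signal lands on whichever thread has `SIGALRM` unblocked rather than on a thread you choose, and the names `start_wall_sampler`, `stop_wall_sampler`, `samples_taken` and `on_tick` are invented for the example.

```c
#define _XOPEN_SOURCE 700
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc extensions */
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

#define MAX_FRAMES 64

static volatile sig_atomic_t sample_count = 0;

/* Runs on every timer tick, whether the process is computing or blocked. */
static void on_tick(int signo)
{
    static const char sep[] = "---- sample ----\n";
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    ssize_t rc = write(STDERR_FILENO, sep, sizeof sep - 1);

    (void)signo;
    (void)rc;
    backtrace_symbols_fd(frames, depth, STDERR_FILENO); /* does not call malloc */
    sample_count++;
}

/* Start sampling every interval_ms milliseconds of wall time.
 * Returns 0 on success, -1 on failure. */
int start_wall_sampler(long interval_ms)
{
    void *warmup[1];
    struct sigaction sa;
    struct itimerval tick;

    backtrace(warmup, 1);   /* load the unwinder now, not inside the handler */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    tick.it_interval.tv_sec  = interval_ms / 1000;
    tick.it_interval.tv_usec = (interval_ms % 1000) * 1000;
    tick.it_value = tick.it_interval;
    return setitimer(ITIMER_REAL, &tick, NULL);   /* ITIMER_REAL = wall time */
}

/* Disarm the timer; the handler stays installed for any pending signal. */
void stop_wall_sampler(void)
{
    struct itimerval off;
    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);
}

int samples_taken(void)
{
    return (int)sample_count;
}
```

Call `start_wall_sampler(...)` just before the slow part and `stop_wall_sampler()` after it, then read the few traces written to stderr. Building with `-g -rdynamic` helps `backtrace_symbols_fd` print function names instead of bare addresses. Keep in mind that the timer signal can interrupt blocking calls such as `nanosleep`, so the program under test has to tolerate `EINTR`.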

Here's an example, and here's another, and if you want to see some real math behind it, look here.

Mike Dunlavey
  • That is all good information but not specific to my question. Do you know how I can use the information `-pg` generates in my own code? – Bernd Elkemann Apr 19 '14 at 22:42
  • @eznme: You said you're trying to do wall-time profiling. gprof doesn't do wall-time, only CPU-time. You ask if it's possible to read its information (stacktrace). gprof does not take stacktraces, only PC-samples. To do what you want, you need something else besides gprof. I've tried to explain what *does* work, in case that's useful. – Mike Dunlavey Apr 20 '14 at 00:47