I've seen a few tools, like Pin and DynInst, that manipulate code dynamically in order to instrument it without recompiling. These seem like heavyweight solutions to what should be a straightforward problem: retrieving accurate function call data from a program.

I want to write something such that in my code, I can write

void SomeFunction() {
  StartProfiler();
  ...
  StopProfiler();
}

and post-execution, retrieve data about what functions were called between StartProfiler() and StopProfiler() (the whole call tree) and how long each of them took.

Preferably I could read out debug symbols too, to get function names instead of addresses.

nornagon
  • If you can use Linux, try callgrind (from valgrind) to get the call tree. It dynamically translates and instruments your program and shows all function calls in tree format (also check the kcachegrind GUI). There are also some system-wide profilers capable of drawing call trees (like Linux perf or google-perftools), but their call graph is only sampled at some interval (e.g. every 1 ms) and is inaccurate. – osgx Sep 07 '12 at 13:48
  • I'm on OS X, but I think (parts of) valgrind have been ported there recently? – nornagon Sep 07 '12 at 17:39
  • valgrind is OSX 10.6/10.7 only; 10.8 support is limited. Also, callgrind has no "START"/"STOP" macro (only `--instr-atstart=no` and the callgrind_control utility), and it will draw a summary tree without dumping the full call trace (though internally it has one). Also, check this thread http://stackoverflow.com/questions/311840/tool-to-trace-local-function-calls-in-linux if you want a trace of all calls (most answers need only scripting and gdb). I usually use this solution: http://blog.superadditive.com/2007/12/01/call-graphs-using-the-gnu-project-debugger/ – osgx Sep 07 '12 at 22:01

1 Answer

Here's one interesting hint at a solution I discovered.

gcc (and llvm>=3.0) has a -pg option when compiling, which is traditionally for gprof support. When you compile your code with this flag, the compiler adds a call to the function mcount at the beginning of every function definition. You can override this function, but you'll need to do it in assembly; otherwise, the mcount function you define will itself be instrumented with a call to mcount, so it will recurse without bound and you'll run out of stack space before main even gets called.

Here's a little proof of concept:

foo.c:

#include <stdio.h>

int total_calls = 0;  /* incremented by mcount (see foo.s) */

void foo(int c) {
  if (c > 0)
    foo(c-1);
}

int main() {
  foo(4);
  printf("%d\n", total_calls);
}

foo.s:

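.globl mcount
mcount:
  movl  _total_calls(%rip), %eax   ## load the C global total_calls (OS X prefixes C symbols with _)
  addl  $1, %eax                   ## count this call
  movl  %eax, _total_calls(%rip)   ## store the new count back
  ret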

Compile with `clang -pg foo.s foo.c -o foo`. Result:

$ ./foo
6

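That's 1 for main and 5 for foo (foo(4) recurses down through foo(0)). printf isn't counted: it wasn't compiled with -pg, and total_calls is read before printf is called anyway.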

Here's the asm that clang emits for foo:

_foo:
  pushq %rbp
  movq  %rsp, %rbp
  subq  $16, %rsp
  movl  %edi, -8(%rbp)          ## 4-byte Spill
  callq mcount
  movl  -8(%rbp), %edi          ## 4-byte Reload
  ...
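
The hook can also be written in C, as long as the file that defines it is compiled without -pg so that it isn't instrumented itself. Below is a minimal sketch of that idea, tied to the StartProfiler()/StopProfiler() pair from the question. The file name profiler.c, the helper ProfilerCallCount(), and the two static variables are hypothetical names chosen for illustration. Two assumptions are worth calling out: the `__asm__("mcount")` label is there because OS X prefixes C symbols with an underscore, while the compiler-generated call above targets plain `mcount`; and the sketch relies on the compiler saving live registers around the call, as clang does in the listing above (the spill and reload of %edi), which may not hold for other compilers or platforms.

```c
/* profiler.c (hypothetical file name).
 * Compile this file WITHOUT -pg, otherwise mcount would call itself. */

static int profiling_enabled = 0;  /* set between StartProfiler/StopProfiler */
static long calls_seen = 0;        /* function entries seen while enabled */

/* Pin the linker symbol to exactly "mcount" (GCC/clang asm-label extension),
 * so it matches the compiler-generated `callq mcount` even where C symbols
 * normally get a leading underscore. */
void mcount(void) __asm__("mcount");

/* Called at the entry of every function compiled with -pg.
 * Assumption: the caller preserves its own live registers around this call. */
void mcount(void) {
  if (profiling_enabled)
    calls_seen++;
}

/* Reset the counter and begin counting function entries. */
void StartProfiler(void) {
  calls_seen = 0;
  profiling_enabled = 1;
}

/* Stop counting; the total stays available afterwards. */
void StopProfiler(void) {
  profiling_enabled = 0;
}

/* Hypothetical accessor for the number of entries recorded. */
long ProfilerCallCount(void) {
  return calls_seen;
}
```

One way to try it would be `clang -c profiler.c` followed by `clang -pg foo.c profiler.o -o foo`. Note that mcount fires only on function entry, so this counts calls but can't time them by itself.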
nornagon
  • Why do it in assembly when we can write it in C and put it in a separate source file that is compiled without -pg? – osgx Sep 07 '12 at 13:42
  • That's a good idea :) Another possibility would be to write it in C and write a small assembly shim which would jump into the function after the generated `callq mcount`. – nornagon Sep 07 '12 at 17:37