1

Is there a way to avoid Google Performance Tools listing files as "??:?", that is, failing to locate which file contains the function it is reporting on? How can I work out which library contains the function being called?

$ env LD_PRELOAD="/usr/lib/libprofiler.so.0" \
   CPUPROFILE=output.prof python script.py
$ google-pprof --text --files /usr/bin/python output.prof 
Using local file /usr/bin/python.
Using local file output.prof.
Removing _L_unlock_13 from all stack traces.
Total: 433 samples
 362  83.6%  83.6%      362  83.6% dtrsm_ ??:?
  58  13.4%  97.0%       58  13.4% dgemm_ ??:?
   1   0.2%  97.2%        1   0.2% PyDict_GetItem /.../Objects/dictobject.c
   1   0.2%  97.5%        1   0.2% PyParser_AddToken /.../Parser/parser.c
...

I am aiming to be able to profile the C code in a python package that has many compiled C extension modules. In the toy example above, what would I do to track down where "dtrsm_" is defined? If there are multiple loaded libraries that contain functions with that same name, is there any way to tell which version is being called?

benjimin

2 Answers

1

C/C++ won't compile if the same pre-processed source file (i.e. with #includes expanded) contains duplicate definitions of the same symbol. (Note that in the case of C++, symbols are mangled, according to compiler-specific schemes, to incorporate the argument signature. This is what makes overloaded functions distinguishable.)

The linker is only concerned with unresolved symbols, so there ought to be nothing preventing multiple libraries from concurrently calling their own internally defined functions with coincident names. If a file invokes a declared but undefined function, and multiple available libraries implement that symbol, then the linker is free to choose (say, by precedence in a search path) which version gets substituted in. (Incidentally, this is the same mechanism by which profilers such as gperftools or hpctoolkit are able to inject themselves and alter the normal behaviour of another application.)

Since different libraries are mapped to separate pages of memory, it ought to be possible to identify (from memory addresses) which library contains the executing version of a function. Indeed, the GNU debugger can identify the library that contains a piece of code, even when it fails to name the function.

$      gdb python
(gdb)  run -c "from numpy import *; linalg.inv(random.random((1000,1000)))"
CTRL-C
(gdb)  backtrace
#0 0x00007ffff5ba9df8 in dtrsm_ () from /usr/lib/libblas.so.3
...
#3 0x00007ffff420df83 in ?? () from /.../numpy/linalg/_umath_linalg.so
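The address-to-library lookup can also be sketched outside gdb. On Linux, the memory map of a process is readable from /proc/<pid>/maps, so an instruction pointer can be matched against the mapped ranges. The following is only a minimal sketch of that idea: the function name `find_mapping` is hypothetical, and the address ranges in `SAMPLE_MAPS` are made up so that they happen to contain the two addresses from the backtrace above.

```python
def find_mapping(address, maps_text):
    """Return the pathname of the mapping that contains `address`, or None.

    `maps_text` is expected to be in the format of /proc/<pid>/maps.
    """
    for line in maps_text.splitlines():
        # Fields: range, permissions, offset, device, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        start, end = (int(part, 16) for part in fields[0].split("-"))
        if start <= address < end:
            return fields[5].strip()
    return None


# Illustrative data only: these ranges are invented, not taken from a real process.
SAMPLE_MAPS = """\
7ffff420a000-7ffff4230000 r-xp 00000000 08:01 1234 /.../numpy/linalg/_umath_linalg.so
7ffff5b80000-7ffff5c00000 r-xp 00000000 08:01 5678 /usr/lib/libblas.so.3
7ffff7dd0000-7ffff7df0000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    print(find_mapping(0x7ffff5ba9df8, SAMPLE_MAPS))
```

For a live process, the same function could be fed the contents of /proc/<pid>/maps (or /proc/self/maps) instead of the sample text.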

Linux (or rather the GNU C library) provides the "backtrace" call (for getting a list of pointers from the call stack), and the "backtrace_symbols" call for automatically converting each of those pointers to a descriptive string such as:

"/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fc429929ec5]"
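Note that such a string already names the binary that contains the code. As a rough sketch (the helper `parse_frame` and its regular expression are my own, and assume the "binary(symbol+offset) [address]" layout shown above), the pieces can be pulled apart like this:

```python
import re

# Assumed layout: binary(symbol+0xoffset) [0xaddress]
# The parenthesised part may be empty or missing when no symbol is known.
FRAME_RE = re.compile(
    r"^(?P<binary>[^(\s]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\s*\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Split one backtrace_symbols-style string into its parts, or return None."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    offset = match.group("offset")
    return {
        "binary": match.group("binary"),
        "symbol": match.group("symbol") or None,
        "offset": int(offset, 16) if offset else None,
        "address": int(match.group("address"), 16),
    }


if __name__ == "__main__":
    frame = parse_frame(
        "/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fc429929ec5]"
    )
    print(frame["binary"], frame["symbol"])
```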

Gperftools can (judging from a query on the GitHub mirror) call the generic "backtrace", but instead of "backtrace_symbols" it "forks out to pprof to do the actual symbolizing". pprof is a fairly epic Perl script, and is most likely where the "??" comes from.

Crucially, google-pprof is trying to report the source file (and line number) that defines the function, not the binary file containing the machine code (which is what stack traces typically quote). To do this it invokes the "nm" utility. On my system it appears (from running "nm -l -D") that libblas, unlike libc and the python binary, has been stripped of the debugging information this requires (presumably to reduce its size), which explains the result.

To answer the original question: the call-stack samples should definitively and explicitly specify which version is being called. These can probably be dumped using an option which was added to google-pprof several months ago, or (for time-intensive functions) can be roughly ascertained by manual resampling using gdb. (It's even conceivable that google-pprof could be adjusted to explicitly identify the binaries' paths in its output summaries.) Alternatively, one can run "nm" (and grep) on the candidate binaries/libraries, a short-list of which can be obtained by running "strings" on the profiler's raw output, among other methods. If the source is accessible (to grep) or the libraries are popular (on the web), then of course (and per Mike Dunlavey) it may be easiest to just search for the function name. In theory the "??:?" may be addressed by carefully recompiling the offending objects with debugging information.
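The "nm and grep" approach can be scripted. This is only a sketch under some assumptions: GNU "nm" is on the PATH, the helper names `defined_symbols` and `libraries_defining` are hypothetical, and the addresses in `SAMPLE_NM_OUTPUT` are made up for illustration.

```python
import subprocess


def defined_symbols(nm_output):
    """Parse `nm` output into {name: type_letter} for defined symbols.

    Defined entries look like "<address> <type> <name>"; undefined ones
    ("U", "w") have no address column and are skipped.
    """
    symbols = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            name = parts[2].split("@")[0]  # drop any symbol-version suffix
            symbols[name] = parts[1]
    return symbols


def libraries_defining(symbol, libraries):
    """Return the subset of `libraries` whose dynamic symbol table defines `symbol`."""
    hits = []
    for path in libraries:
        result = subprocess.run(
            ["nm", "-D", "--defined-only", path],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0 and symbol in defined_symbols(result.stdout):
            hits.append(path)
    return hits


# Illustrative data only: the addresses below are invented.
SAMPLE_NM_OUTPUT = """\
                 U malloc
                 w __gmon_start__
0000000000012340 T dtrsm_
0000000000045670 T dgemm_
"""

if __name__ == "__main__":
    print(sorted(defined_symbols(SAMPLE_NM_OUTPUT)))
```

With a short-list of candidate paths in hand, something like `libraries_defining("dtrsm_", ["/usr/lib/libblas.so.3"])` would report which of them export the symbol. It cannot by itself say which copy is actually executing; that still requires the addresses from the stack samples.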

benjimin
  • +1 upvoted on general principles, but profiling is seldom an end in itself. Usually the goal is to save time. Since you presumably can only edit code you have written, what you need to know is where to look in your code for speedups. For example if you are calling `dgemm` more than you need to, you could try to call it less. If the matrices you are calling it with are small, chances are an ad-hoc routine would save time. This is why I'm such a pest on this subject, [*as here*](http://scicomp.stackexchange.com/a/2719/1262). Being in lib routine XYZ only matters if you can avoid it. – Mike Dunlavey Sep 15 '15 at 20:29
0

Just Google the offending function names. The ones you show above are defined in BLAS (the library that LAPACK builds on). dtrsm solves a triangular matrix equation. dgemm multiplies matrices.

What you need to know is 1) why they are being called, and 2) how big the matrices are.

To find out why they are being called, what I do is just examine individual stack samples, as here.

The reason matrix size matters is that if the matrices are small, these routines can actually spend a relatively large fraction of their time just classifying their inputs, such as by calling the function LSAME.

Mike Dunlavey
  • What if I am using multiple libraries that contain functions with the same name? Would there be a way to determine which version is being executed? – benjimin Sep 03 '15 at 01:23
  • @benjimin: then the linker has the same problem - it has to choose between multiple definitions. In any case, if you look at a stack sample that includes the function in question, you can see the line at which the call takes place, and that might help disambiguate it. – Mike Dunlavey Sep 03 '15 at 12:37