5

Edit: added an important note that this is about debugging an MPI application.

The system-installed shared library doesn't have debugging symbols:

$ readelf -S /usr/lib64/libfftw3.so | grep debug
$

I have therefore compiled my own version and installed it in my home directory, with debugging enabled (--with-debug CFLAGS=-g):

$ readelf -S ~/lib64/libfftw3.so | grep debug
  [26] .debug_aranges    PROGBITS         0000000000000000  001d3902
  [27] .debug_pubnames   PROGBITS         0000000000000000  001d8552
  [28] .debug_info       PROGBITS         0000000000000000  001ddebd
  [29] .debug_abbrev     PROGBITS         0000000000000000  003e221c
  [30] .debug_line       PROGBITS         0000000000000000  00414306
  [31] .debug_str        PROGBITS         0000000000000000  0044aa23
  [32] .debug_loc        PROGBITS         0000000000000000  004514de
  [33] .debug_ranges     PROGBITS         0000000000000000  0046bc82
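
The same check can be scripted. This is only a minimal sketch, assuming `readelf` and `grep` are available; the helper name `has_debug_info` is made up for illustration and is not part of any tool:

```shell
# Hypothetical helper: reads `readelf -S` output on stdin and succeeds
# only if a .debug_info section is listed.
has_debug_info() {
    grep -q '\.debug_info'
}

# Usage (paths as in the question; not run here):
#   readelf -S ~/lib64/libfftw3.so | has_debug_info && echo "has debug info"
```

Reading the section listing from stdin keeps the helper independent of where the library lives, so it can be pointed at whichever file `ldd` or the profiler output actually names.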

I have set both LD_LIBRARY_PATH and LD_RUN_PATH to list ~/lib64 first, and ldd confirms that the local version of the library should be used:

$ ldd a.out | grep fftw
        libfftw3.so.3 => /home/narebski/lib64/libfftw3.so.3 (0x00007f2ed9a98000)

The program in question is a parallel numerical application using MPI (Message Passing Interface). Therefore, to run it one must use the mpirun wrapper (e.g. mpirun -np 1 valgrind --tool=callgrind ./a.out). I use the OpenMPI implementation.

Nevertheless, various profilers (the callgrind tool in Valgrind, the CPU profiler from google-perftools, and perf) don't find those debugging symbols, resulting in more or less useless output:

  • callgrind:

    $ callgrind_annotate --include=~/prog/src --inclusive=no  --tree=none
    [...]
    --------------------------------------------------------------------------------
                Ir  file:function
    --------------------------------------------------------------------------------
    32,765,904,336  ???:0x000000000014e500 [/usr/lib64/libfftw3.so.3.2.4]
    31,342,886,912  /home/narebski/prog/src/nonlinearity.F90:__nonlinearity_MOD_calc_nonlinearity_kxky [/home/narebski/prog/bin/a.out]
    30,288,261,120  /home/narebski/gene11/src/axpy.F90:__axpy_MOD_axpy_ij [/home/narebski/prog/bin/a.out]
    23,429,390,736  ???:0x00000000000fc5e0 [/usr/lib64/libfftw3.so.3.2.4]
    17,851,018,186  ???:0x00000000000fdb80 [/usr/lib64/libmpi.so.1.0.1]
    
  • google-perftools:

    $ pprof --text a.out prog.prof
    Total: 8401 samples
         842  10.0%  10.0%      842  10.0% 00007f200522d5f0
         619   7.4%  17.4%     5025  59.8% calc_nonlinearity_kxky
         517   6.2%  23.5%      517   6.2% axpy_ij
         427   5.1%  28.6%     3156  37.6% nl_to_direct_xy
         307   3.7%  32.3%     1234  14.7% nl_to_fourier_xy_1d
    
  • perf events:

    $ perf report --sort comm,dso,symbol
    # Events: 80K cycles
    #
    # Overhead  Command         Shared Object                                        Symbol
    # ........  .......  ....................  ............................................
    #
        32.42%  a.out     libfftw3.so.3.2.4     [.]            fdc4c
        16.25%  a.out             7fddcd97bb22  [.]     7fddcd97bb22
         7.51%  a.out     libatlas.so.0.0.0     [.] ATL_dcopy_xp1yp1aXbX
         6.98%  a.out     a.out                 [.] __nonlinearity_MOD_calc_nonlinearity_kxky
         5.82%  a.out     a.out                 [.] __axpy_MOD_axpy_ij
    

Edit Added 11-07-2011:
I don't know if it is important, but:

$ file /usr/lib64/libfftw3.so.3.2.4
/usr/lib64/libfftw3.so.3.2.4: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

and

$ file ~/lib64/libfftw3.so.3.2.4
/home/narebski/lib64/libfftw3.so.3.2.4: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, not stripped
Jakub Narębski
  • 1
    If you use [Zoom](http://www.rotateright.com/) or [this method of finding time drains](http://stackoverflow.com/questions/375913/what-can-i-use-to-profile-c-code-in-linux/378024#378024) you don't need your libs to have symbols, because any problem you can fix is one or a few lines in your code, not the external library, and those lines are pinpointed. – Mike Dunlavey Jul 09 '11 at 13:21

3 Answers

4

If /usr/lib64/libfftw3.so.3.2.4 is listed in callgrind output, then your LD_LIBRARY_PATH=~/lib64 had no effect.

Try again with export LD_LIBRARY_PATH=$HOME/lib64. Also watch out for any shell scripts you invoke, which might reset your environment.
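
To see whether the variable really takes effect, it can help to compare which directory comes first in the search path as seen by the shell and as seen by the launched process. Below is a minimal sketch in POSIX shell; the function names `first_path_entry` and `dir_is_first` are hypothetical helpers, and the sample value only illustrates the case where system directories end up in front of a user directory:

```shell
# Print the first entry of a colon-separated search path,
# i.e. the directory the dynamic loader tries first.
first_path_entry() {
    printf '%s\n' "${1%%:*}"
}

# Succeed only if directory $1 is the first entry of search path $2.
dir_is_first() {
    [ "$(first_path_entry "$2")" = "$1" ]
}

# Illustrative value (an assumption, not captured output).
sample_path="/usr/lib:/usr/lib64:/home/user/lib64"
if ! dir_is_first "/home/user/lib64" "$sample_path"; then
    echo "first entry is $(first_path_entry "$sample_path")"
fi
```

In practice the second argument would come from the environment of the launched process, for example `dir_is_first "$HOME/lib64" "$(mpirun -np 1 printenv LD_LIBRARY_PATH)"`, assuming `mpirun` is on the PATH.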

Employed Russian
  • You are right. I am using `mpirun` from OpenMPI, and it does prepend `/usr/lib:/usr/lib64:` to 'LD_LIBRARY_PATH' (which I checked by running `mpirun -np 1 printenv`). I'd have to use `--prefix` or `-x` option to *mpirun*. – Jakub Narębski Jul 12 '11 at 13:35
2

You and Employed Russian are almost certainly right; the mpirun script is messing things up here. Two options:

Most x86 MPI implementations, as a practical matter, treat just running the executable

./a.out

the same as

mpirun -np 1 ./a.out

They don't have to do this, but OpenMPI certainly does, as do MPICH2 and IntelMPI. So if you can do the debugging serially, you should be able to just run

valgrind --tool=callgrind ./a.out

However, if you do want to run with mpirun, the issue is probably that your ~/.bashrc (or equivalent) is being sourced, undoing your changes to LD_LIBRARY_PATH and the like. The easiest fix is to temporarily put your changed environment variables in your ~/.bashrc for the duration of the run.

Jonathan Dursi
  • No, it is not that. It is `mpirun` adding "prefix" to PATH and LD_LIBRARY_PATH, which you can check using `mpirun -np 1 printenv`. – Jakub Narębski Jul 15 '11 at 08:57
  • If I understand it correctly, `valgrind --tool=callgrind ./a.out` would also profile the mpirun part, something I do not want; though I have not checked whether it is much of a hindrance. – Jakub Narębski Jul 15 '11 at 09:30
  • prefix should only be the path to OpenMPI, which should be irrelevant (unless you put the fftw in the same place as the MPI libraries). and `valgrind --tool=callgrind ./a.out` won't magically call mpirun; it'll launch the a.out executable. – Jonathan Dursi Jul 15 '11 at 11:58
  • OpenMPI uses the '/usr' prefix, and `mpirun` **prefixes** '/usr/bin' to PATH, and '/usr/lib' and '/usr/lib64' (on x86_64) to LD_LIBRARY_PATH... which means that the stripped system-installed library without debugging info was used by `mpirun ... ./a.out` and not the user-installed one with debugging info. – Jakub Narębski Jul 15 '11 at 20:22
  • So you do have everything installed in the same place. Not a great idea, but if you're just using whatever the package manager does for you, that'll happen; too bad. Anyway, again, the problem can be avoided by not running mpirun, as you don't need it for 1 task. If you don't believe me, try it. – Jonathan Dursi Jul 15 '11 at 20:39
  • I don't understand what you mean by *everything in the same place*. `mpirun` is in '/usr/bin', library is in '/usr/lib64'; common prefix is '/usr'. – Jakub Narębski Jul 16 '11 at 09:36
  • The fftw and mpi libraries are all in the same place. Anyway, all this is beside the point. Have you tried running `valgrind --tool=callgrind ./a.out `? – Jonathan Dursi Jul 16 '11 at 17:08
1

The way recent profiling tools typically handle this situation is to consult an external, matching non-stripped version of the library.

On Debian-based Linux distros this is typically done by installing the -dbg suffixed version of a package; on Red Hat-based distros these packages are named -debuginfo.

The tools you mentioned above will typically Just Work (tm) and find the debug symbols for a library if the debug info package has been installed in the standard location.

DaveR
  • What if the distribution in question is **Gentoo** (and I am not an administrator)? – Jakub Narębski Jul 09 '11 at 15:49
  • 1
    @Jakub: Certainly in the case of `perf report` you can specify an alternative location to look for debuginfo files using the `--symfs` option. You'll have to check if your other tools support a similar option. – DaveR Jul 09 '11 at 17:49
  • I think I'd have to upgrade `perf` to have access to `--symfs` option; at least for 2.6.36 this option is not mentioned in documentation. – Jakub Narębski Jul 09 '11 at 20:05