My results for a short run of my program are as follows:

 67.93      3.24     3.24                             grid::rKfour(int, int)
  9.43      3.69     0.45                             alloc_mmap
  5.03      3.93     0.24    30001     0.01     0.01  grid::timeStep()
  3.04      4.08     0.15 42007105     0.00     0.00  linkers::linkers(linkers const&)
  2.94      4.22     0.14  6360900     0.00     0.00  particle::fulldistance(particle&)
  2.73      4.35     0.13                             blas_thread_server
...

The output from ldd is

linux-vdso.so.1 =>  (0x00007fffe2bea000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007eff34595000)
    libboost_filesystem.so.1.46.1 => /usr/lib/libboost_filesystem.so.1.46.1 (0x00007eff34377000)
    libboost_system.so.1.46.1 => /usr/lib/libboost_system.so.1.46.1 (0x00007eff34172000)
    libGL.so.1 => /usr/lib/x86_64-linux-gnu/mesa/libGL.so.1 (0x00007eff33f16000)
    libglut.so.3 => /usr/lib/libglut.so.3 (0x00007eff33cd0000)
    libGLU.so.1 => /usr/lib/x86_64-linux-gnu/libGLU.so.1 (0x00007eff33a62000)
    libGLEW.so.1.5 => /usr/lib/libGLEW.so.1.5 (0x00007eff3380c000)
    libboost_thread.so.1.46.1 => /usr/lib/libboost_thread.so.1.46.1 (0x00007eff335f3000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007eff332eb000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007eff33067000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007eff32e51000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007eff32ab1000)
    /lib64/ld-linux-x86-64.so.2 (0x00007eff347c4000)
    libglapi.so.0 => /usr/lib/x86_64-linux-gnu/libglapi.so.0 (0x00007eff3288d000)
    libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007eff32555000)
    libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007eff32341000)
    libXdamage.so.1 => /usr/lib/x86_64-linux-gnu/libXdamage.so.1 (0x00007eff3213e000)
    libXfixes.so.3 => /usr/lib/x86_64-linux-gnu/libXfixes.so.3 (0x00007eff31f38000)
    libXxf86vm.so.1 => /usr/lib/x86_64-linux-gnu/libXxf86vm.so.1 (0x00007eff31d31000)
    libdrm.so.2 => /usr/lib/x86_64-linux-gnu/libdrm.so.2 (0x00007eff31b26000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007eff31922000)
    libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007eff31705000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007eff314fd000)
    libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007eff312f9000)
    libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007eff310f3000)

Can anybody identify "alloc_mmap"?

Mikhail
  • What is the output of `ldd your-binary`? – Bill Lynch Oct 24 '11 at 02:40
  • So I can't find it on my system, but I believe if you iterate over those libraries with `nm`, one of them should have a symbol defined that is named `alloc_mmap`. – Bill Lynch Oct 24 '11 at 02:49
  • It's probably a function below your memory allocator (malloc, new, whatever) which acquires memory via `mmap`... – rlibby Oct 24 '11 at 02:49
  • Also, another method would be to run your application in gdb, set a breakpoint in alloc_mmap, and then look at the backtrace. – Bill Lynch Oct 24 '11 at 03:05
  • Look at the call graph from your gprof output, that will give you clues as to where this function is located and who calls it. – Miguel Grinberg Oct 24 '11 at 06:48

2 Answers

I assume you're asking because you want to see what you could do to improve the program's speed.
If not, forget this.

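In gprof's flat profile, the second column, cumulative seconds, is only a running total of the self seconds column down the list, so it says little about any single routine. The number that matters for a routine is its inclusive time (self plus callees), which the call graph reports, because if that routine could be made to take no time, that is the amount by which your total time would shrink.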
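As a quick check against the flat profile in the question, each cumulative seconds entry equals the sum of the self seconds entries down to that row (3.24, then 3.24 + 0.45 = 3.69, and so on down to 4.35). A minimal sketch of that relationship; the function name `cumulative_seconds` is made up for illustration and is not part of gprof:

```cpp
#include <vector>

// Running sum of the "self seconds" column. This mirrors how the
// "cumulative seconds" column of a gprof flat profile is filled in.
// (Hypothetical helper for illustration, not a gprof API.)
std::vector<double> cumulative_seconds(const std::vector<double>& self_seconds) {
    std::vector<double> out;
    out.reserve(self_seconds.size());
    double total = 0.0;
    for (double s : self_seconds) {
        total += s;            // add this row's self time to the running total
        out.push_back(total);  // cumulative value for this row
    }
    return out;
}
```

Feeding it the self seconds from the six rows shown in the question reproduces the cumulative column up to rounding.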

One of the problems with gprof is that it ignores blocked time such as I/O. Since your program is using alloc_mmap (directly or indirectly), it is presumably calling mmap; if that maps a file into memory, the program is doing I/O, which is often not a small cost, and gprof doesn't see it.
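To see why a profiler driven by CPU time can miss blocked time, here is a minimal sketch that measures the same blocking wait with a CPU clock and a wall clock. It assumes a Linux/POSIX system, where `std::clock` reports process CPU time; the sleep is only a stand-in for blocked I/O, and `Timing` and `time_blocking_wait` are names invented for this example:

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical result type for this illustration.
struct Timing {
    double cpu_seconds;   // CPU time consumed by the process
    double wall_seconds;  // elapsed real time
};

// Measure CPU time and wall-clock time across a blocking wait.
// The sleep stands in for time spent blocked (e.g. on I/O).
Timing time_blocking_wait(std::chrono::milliseconds wait) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    std::this_thread::sleep_for(wait);  // blocked: almost no CPU is used

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.wall_seconds = std::chrono::duration<double>(wall_end - wall_start).count();
    return t;
}
```

Calling `time_blocking_wait(std::chrono::milliseconds(200))` should report a wall time of at least 0.2 s and a CPU time close to zero, which is the gap a wall-clock sampler is meant to expose.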

There are more problems with gprof. If you are on Linux, you could try a profiler like Zoom. It samples on wall-clock time, so it is not blind to I/O. It also gives you percent time usage by line/instruction, not just by function, so it will pinpoint the lines in your code that would give you the most speedup if you could improve or remove them. (Usually these are function calls. "Self time" is rarely relevant except in heavy math or tight CPU loops, and even there Zoom will spot it.)

The method I rely on is this.

Mike Dunlavey

Maybe your memory allocator is using mmap for large allocations. You should first confirm this (gdb breakpoint in alloc_mmap should work) and perhaps increase the threshold with mallopt.
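If the breakpoint does confirm that the mmap calls come from glibc's malloc, raising the threshold could look roughly like the sketch below. It assumes glibc (`mallopt` and `M_MMAP_THRESHOLD` come from `<malloc.h>` and are not portable), and `raise_mmap_threshold` is a name invented for this example. If alloc_mmap turns out to belong to some other library's allocator, this setting would have no effect on it.

```cpp
#include <malloc.h>  // glibc-specific: mallopt, M_MMAP_THRESHOLD
#include <cstddef>

// Ask glibc's malloc to serve requests smaller than `bytes` from the heap
// instead of using mmap. Returns true if mallopt accepted the value.
// (Hypothetical helper; glibc rejects thresholds above its built-in maximum.)
bool raise_mmap_threshold(std::size_t bytes) {
    return mallopt(M_MMAP_THRESHOLD, static_cast<int>(bytes)) == 1;
}
```

It would be called once near the start of `main`, before the large allocations happen, for example `raise_mmap_threshold(16 * 1024 * 1024)`. Setting the threshold explicitly also turns off glibc's dynamic adjustment of it.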

cdleonard