I like the fp method for collecting call stacks with perf record since it's lightweight and less complex than dwarf. However, the call stacks and flame graphs I get for a program that uses the C++ standard library are not correct.
Here is a test program:
#include <algorithm>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int __attribute__((noinline)) stupid_factorial(int x) {
    std::vector<std::string> xs;
    // Need to convert numbers to strings or it will all get inlined
    for (int i = 0; i < x; ++i) {
        std::stringstream ss;
        ss << std::setw(4) << std::setfill('0') << i;
        xs.push_back(ss.str());
    }
    int res = 1;
    while (std::next_permutation(xs.begin(), xs.end())) {
        res += 1;
    }
    return res;
}

int main() {
    std::cout << stupid_factorial(11) << "\n";
}
And here is the flame graph:
It was generated by the following steps on Ubuntu 20.04 in a Docker container:
g++ -Wall -O3 -g -fno-omit-frame-pointer program.cpp -o 6_stl.bin
# Make sure you have libc6-prof and libstdc++6-9-dbg installed
env LD_LIBRARY_PATH=/lib/libc6-prof/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/debug:${LD_LIBRARY_PATH} perf record -F 1000 --call-graph fp -- ./6_stl.bin
# Make sure you have https://github.com/jonhoo/inferno installed
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg
The main thing that's wrong with this is that not all functions are children of stupid_factorial, e.g. __memcmp_avx2_movbe. With dwarf, they are. In more complex programs, I have even seen functions like these being outside main. __dynamic_cast, for instance, is one that often has no parent.
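For reference, a dwarf recording to compare against can be made along these lines. This is a sketch rather than the exact commands behind the comparison above: the output file names perf.dwarf.data and flamegraph-dwarf.svg are arbitrary placeholders, and the LD_LIBRARY_PATH setting from the fp run is left out for brevity.

```shell
# Sketch: same binary and sampling rate as the fp run, but unwinding via DWARF.
BIN=./6_stl.bin      # binary produced by the g++ command above
MODE=dwarf           # replaces "fp" in --call-graph
PERF_CMD="perf record -F 1000 --call-graph $MODE -o perf.dwarf.data -- $BIN"
echo "$PERF_CMD"

# Only record if perf is installed and the binary has been built.
if command -v perf >/dev/null 2>&1 && [ -x "$BIN" ]; then
    $PERF_CMD
    # Placeholder output name, chosen so the fp flame graph is not overwritten.
    perf script -i perf.dwarf.data | inferno-collapse-perf | inferno-flamegraph > flamegraph-dwarf.svg
fi
```

Writing to a separate data file keeps the fp and dwarf recordings side by side, so the two flame graphs can be compared directly.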
In gdb, I always see correct backtraces, including for the functions that do not appear correctly here. Is it possible to get correct fp call stacks with libstdc++ without compiling it myself (which seems like a lot of work)?
There are also other oddities, though I couldn't reproduce them in Ubuntu 18.04 (outside the Docker container):
- There is an unresolved function in libstdc++.so.6.28.
- There is an unresolved function in my own binary, 6_stl.bin, on the very left. This is also the case with dwarf.