I really like the idea of the Flame Graph for profiling, since it helps in spotting and eliminating unneeded function calls. There is a catch, however: the profiler must capture a complete stack trace each time it collects a sample. This can be accomplished quite easily with DTrace or SystemTap, but I need to do this on an ARM device running Ubuntu (which eliminates DTrace). I would also like to do this without recompiling the kernel (which eliminates SystemTap).
Is it possible to get Valgrind/Callgrind or OProfile (or some other profiling tool that runs on an ARM device under Ubuntu) to produce output similar to what this DTrace one-liner gives:
dtrace -n 'profile-1001 /pid == 12345 && arg1/ { @[ustack()] = count(); }'
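To be concrete about the output I am after: the `@[ustack()] = count()` aggregation keeps one counter per distinct user stack. Below is a minimal Python sketch of that same aggregation, assuming some tool can hand me each sample as a list of frame names. The `fold_stacks` helper and the sample data are hypothetical and only illustrate the shape of the result (one line per unique stack, frames joined by `;`, followed by the sample count, which is the folded form the Flame Graph script consumes).

```python
from collections import Counter


def fold_stacks(samples):
    """Count identical stacks and return lines of the form 'root;...;leaf count'."""
    # Join each stack into a single key so identical stacks share one counter.
    counts = Counter(";".join(stack) for stack in samples)
    # Sort by stack string so the output order is deterministic.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples, each listed root-first (outermost caller first).
samples = [
    ["main", "parse", "read_token"],
    ["main", "parse", "read_token"],
    ["main", "render"],
]

for line in fold_stacks(samples):
    print(line)
```

Any profiler that can record a full call stack per sample would be enough for me, since the counting step itself is trivial.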