I am stuck on this one...
I need to profile code on an ARM board and view call graphs. I have tried OProfile, kernel perf, and Google performance tools. All of them work, but none of them outputs any call-graph information.
This led me to the conclusion that I am not compiling my code correctly.
I use the following flags when compiling my C++ code:
Arch specific:
-march=armv7-a -mtune=cortex-a8 -mfloat-abi=hard -mfpu=vfpv3
General:
-fexceptions -fno-strict-aliasing -D_REENTRANT -Wall -Wextra
Debugging (with optimization):
-O2 -g -fno-omit-frame-pointer
I did a lot of Google searching and found some related topics:
- libunwind?
- DWARF
- (asynchronous-)unwind-tables
- -mapcs-frame
However, I do not fully understand how these are all connected. Any hints on how to get call graphs working?
Note (in response to Rian's answer): I am interested in finding out whether, and why, some methods take longer relative to others on ARM than on x86-64. Profiling on a different platform does not help, even though my code compiles on both and I can generate call graphs on x86-64.