I am trying to profile my code with the Random Pause Method. I essentially run the code in a GDB session and look at the call stack at random times using Ctrl+C and the backtrace command. It seems to work: the slowest part of the code (a loop) is on the stack in almost all pauses, and I can see a pattern in what is on the stack.
Here is the problem. I'm trying to automate the profiling process with the following shell script:
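while true
do
    # Look up the PID once per iteration so the check and the attach use the same value.
    pid=$(pidof -s MyCode)
    if [ -n "$pid" ]; then
        # Attach, dump the stack of every thread, detach, and append to the log.
        gdb -ex "set pagination off" -ex "thread apply all bt" -batch -p "$pid" >> Log.txt
        sleep 1
    else
        break
    fi
done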
When I check the output file Log.txt, I unexpectedly never see the slow part of the code in the stack!
Q: How can profiling within a GDB session and profiling by calling GDB from a script give different results?
Some notes:
- I tried both methods many times with different numbers of samples (from 5 to 50)
- It is C++ code
- The slow function is a loop parallelized with OpenMP
- I can't show the code here. I tried to reproduce this behaviour in a smaller program, but without success
EDIT: I think I have a clue about what is going on here. The fact that the code is multithreaded seems to be part of it.
In the script above, if I put the gdb command between kill -SIGSTOP $pid and kill -SIGCONT $pid and set the variable GOMP_CPU_AFFINITY, I get results similar to those from the interactive GDB session. My guess is that the script can't execute the gdb command while the code is in the parallel loop because all cores are busy.