I want to profile a C++ program on Linux using the random-sampling technique described in this answer:

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

The problem is that I can't use gdb: I want to profile in production under heavy load, and a debugger is too intrusive and slows the program down considerably. However, I can use perf record and perf report to find bottlenecks without affecting program performance. Is there a way to collect a number of readable (gdb-like) stack traces with perf instead of gdb?

ks1322
  • IIRC, Chandler Carruth mentions compiling with frame pointers enabled (`-fno-omit-frame-pointer`) to let perf efficiently collect stack backtraces in his CppCon2015 talk about `perf`: https://www.youtube.com/watch?v=nXaxk27zwlk. But I forget what perf options he then uses to tell `perf` it can use frame pointers and to get it to even collect parent callers. It's a very good video, worth watching. – Peter Cordes Mar 18 '18 at 01:15

1 Answer

perf offers call stack recording with three different techniques:

  • By default it uses the frame pointer (fp). This is generally supported and performs well, but it doesn't work with optimizations that omit the frame pointer. Compile your application with -fno-omit-frame-pointer and similar flags to make sure it works well.
  • dwarf dumps part of the stack with each sample for post-processing. That has a significant performance penalty.
  • Modern systems can use the hardware-supported last branch record (lbr).

The stack is accessible in perf analysis tools such as perf report or perf script.
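
As a rough sketch of how the three techniques map onto a command line: the mode is chosen with the --call-graph option of perf record. The helper below only prints the command rather than running it, so it can be checked without perf installed. The function name, the PID 1234, the 99 Hz sampling rate, and the 10-second duration are placeholder assumptions, not values from the question.

```shell
#!/bin/sh
# Sketch only: perf_record_cmd is a hypothetical helper; PID, rate, and
# duration are placeholder values to adapt to your own process.

# Print the perf record command for a call-graph mode (fp, dwarf, or lbr).
perf_record_cmd() {
    mode="$1"; pid="$2"; secs="${3:-10}"
    case "$mode" in
        fp|dwarf|lbr) ;;
        *) echo "unsupported call-graph mode: $mode" >&2; return 1 ;;
    esac
    # -F 99: sample about 99 times per second
    # -p:    attach to an already running process
    # -o:    write samples to perf.data
    # "sleep N" bounds how long perf records
    echo "perf record -F 99 --call-graph $mode -p $pid -o perf.data -- sleep $secs"
}

# Example: frame-pointer unwinding of PID 1234 for 10 seconds.
perf_record_cmd fp 1234 10
```

After running the printed command, `perf script -i perf.data` should list every sample with its call stack, which is the closest equivalent to a series of gdb backtraces, while `perf report -i perf.data --stdio` aggregates the samples. Whether lbr works depends on the CPU, so fp or dwarf is the safer first attempt.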

For more details, see man perf-record.

Zulan
  • Thanks, do you know how to add line numbers to stack traces in `perf script`? I tried `perf script -F +srcline`, but they don't seem to get added. Is it a `perf script` bug, or am I doing it wrong? – ks1322 Mar 19 '18 at 18:20