My company's codebase has a unit test that takes unreasonably long: about 5 minutes of wall-clock time for only 200 ms of CPU time. It is not I/O bound, so it is probably sleeping or waiting somewhere. How would one go about discovering where?

  • I can't do a textual search. The actual repository is huge and has legitimate reasons to sleep, so I'd get too many false positives.
  • The profilers I tried (perf, valgrind, gprof) all seem to focus on finding functions that use a lot of CPU time, not wall-clock time.

I ended up running strace -r -k (relative timestamps plus a stack trace for each system call) and then looking for slow system calls, but surely there are more convenient approaches?
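For reference, the "looking for slow system calls" step can be scripted. Below is a minimal sketch, not a polished tool. It assumes the trace was written with something like `strace -r -k -o strace.log ./my_test` (the file and binary names are placeholders), that each system call line starts with the relative timestamp, and that the stack frames printed by -k start with " > ". Because -r reports the time between the starts of successive calls, the gap shown on one line is attributed to the call on the previous line. The helper name `slowest_gaps` is made up for this example.

```python
import re

# Assumed line shape for `strace -r`: optional pid column, relative timestamp, call text.
LINE_RE = re.compile(r"^\s*(?:\[pid\s+\d+\]\s+|\d+\s+)?(\d+\.\d+)\s+(.*)$")


def slowest_gaps(lines, top=5):
    """Return (gap_seconds, call_text, stack_lines) tuples, largest gap first.

    The gap assigned to a call is the relative timestamp printed on the
    *next* call line, i.e. the time until the next system call started.
    """
    records = []  # each entry: [gap_seconds, call_text, stack_lines]
    for raw in lines:
        line = raw.rstrip("\n")
        # Stack frames from -k are assumed to look like " > /path/lib.so(func+0x10) [0x...]".
        if line.lstrip().startswith(">"):
            if records:
                records[-1][2].append(line.strip())
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        delta = float(match.group(1))
        if records:
            # This delta measures how long it took to get here from the previous call.
            records[-1][0] = delta
        records.append([0.0, match.group(2), []])
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    return [(gap, call, stack) for gap, call, stack in ranked[:top]]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python slowest_gaps.py strace.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
            for gap, call, stack in slowest_gaps(fh):
                print(f"{gap:10.6f}s  {call}")
                for frame in stack:
                    print(f"            {frame}")
```

With multiple threads traced via -f, the gaps are measured across all threads interleaved, so this only points at candidates; the per-call durations from strace -T may be easier to interpret in that case.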

Kees-Jan
  • There's quite a bit of code involved as well as multiple threads, so simply stepping through the code is not very practical. With this insane wall-clock/cpu-time ratio, I might have run the program in the debugger and hit ctrl-c with reasonable certainty of hitting a sleep, but I'm hoping to receive answers that help me tackle more subtle cases in the future. – Kees-Jan Nov 14 '22 at 15:46
  • Sounds like you need to collect a core dump and look for blocked threads – kvr Nov 14 '22 at 15:51
  • @kvr That seems equivalent to hitting ctrl-c in the debugger: It'll give me one sleep at a time at best. – Kees-Jan Nov 14 '22 at 15:57
  • You should [edit] your question to add the information from your comment. See also https://stackoverflow.com/q/2803930/10622916 or https://github.com/jasonrohrer/wallClockProfiler or https://unix.stackexchange.com/q/608463/330217 (found by searching for keywords gprof-wall-clock time) – Bodo Nov 15 '22 at 12:00

0 Answers