Brendan D. Gregg (author of DTrace book) has interesting variant of profiling: the "Off-CPU" profiling (and Off-CPU Flame Graph; slides 2013, p112-137) to see, where the thread or application were blocked (was not executed by CPU, but waiting for I/O, pagefault handler, or descheduled due short of CPU resources):
This time reveals which code-paths are blocked and waiting while off-CPU, and for how long exactly. This differs from traditional profiling which often samples the activity of threads at a given interval, and (usually) only examine threads if they are executing work on-CPU.
He also can combine Off-CPU profile data and On-CPU profile together: http://www.brendangregg.com/FlameGraphs/hotcoldflamegraphs.html
The examples given by Gregg are made using dtrace
, which is not usually available in Linux OS. But there are some similar tools (ktap, systemtap, perf) and the perf
as I think has widest installed base. Usually perf
generated On-CPU profiles (which functions were executed more often on CPU).
- How can I translate Gregg's Off-CPU examples to
perf
profiling tool in Linux?
PS: There is link to systemtap variant of Off-CPU flamegraphs in the slides from LISA13, p124: "Yichun Zhang created these, and has been using them on Linux with SystemTap to collect the profile data. See: • http://agentzh.org/misc/slides/off-cpu-flame-graphs.pdf"" (CloudFlare Beer Meeting on 23 August 2013)