perf stat ./myapp

and the result looks something like this (it's just an example):

   Performance counter stats for 'myapp':

    83723.452481      task-clock:u (msec)       #    1.004 CPUs utilized
               0      context-switches:u        #    0.000 K/sec
               0      cpu-migrations:u          #    0.000 K/sec
       3,228,188      page-faults:u             #    0.039 M/sec
 229,570,665,834      cycles:u                  #    2.742 GHz
 313,163,853,778      instructions:u            #    1.36  insn per cycle
  69,704,684,856      branches:u                #  832.559 M/sec
   2,078,861,393      branch-misses:u           #    2.98% of all branches

    83.409183620 seconds time elapsed

    74.684747000 seconds user
     8.739217000 seconds sys

`perf stat` prints user time and system time, and a HW counter is incremented no matter which application the CPU is executing.

For HW counters like cycles or instructions, does perf count them only for "myapp"?

For instance (cs = context switch):

          |--------------------|-------|-------------------|------------|------------------|
                  myapp       cs       cs      myapp      cs            cs                end
inst      0                   10       20                  50           80                100

"myapp" executes 60 instructions, but the value of the HW counter is 100. Does `perf stat` then print 60?

lemoncake
  • You didn't use `-a` so it's for your process. Also it's not even including kernel mode (e.g. async interrupts, or system calls made by your process) due to permissions of the system-wide `perf_event_paranoid` setting: [What restriction is perf\_event\_paranoid == 1 actually putting on x86 perf?](https://stackoverflow.com/q/51911368) – Peter Cordes May 12 '22 at 02:33
  • Closest duplicate I could find after a bit of searching was [using "Perf stat" to profile both process and system-wide events simultaneously](https://stackoverflow.com/q/70447895). Or [Profiler that attaches to running processes?](https://stackoverflow.com/q/5071665) but that mentions perf record not perf stat. (They're the same, but the answer there doesn't say so, otherwise it would work as a duplicate for this.) – Peter Cordes May 12 '22 at 02:38
  • @PeterCordes Even if perf doesn't include kernel mode, the HW counter will still be increased, won't it? And doesn't perf read the HW counter, which includes kernel-mode instructions? – lemoncake May 12 '22 at 02:54
  • On x86 at least, each HW counter has two bits to control whether it counts in kernel mode and/or user mode. The `perf_event_open` system call that `perf` uses doesn't allow setting the count-in-kernel bit at higher `paranoid` levels, like you're clearly using since you didn't use `--all-user` but still got `:u` (user-space only) counts. (I normally use `--all-user` when microbenchmarking a loop to test details of how my CPU runs certain instructions or combinations.) So you get user-space-only counts without the kernel even having to save/restore the counters when entering the kernel, only on context switches. – Peter Cordes May 12 '22 at 10:12
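
To make the arithmetic in the question concrete, here is a toy model of per-task counting as the comments describe it: the count attributed to a task only advances while that task is scheduled, because the counter is saved and restored on context switches. This is only a sketch, not how perf is implemented. The `Segment` type, the function names, and the "other" task label are all hypothetical, and the timeline assumes "myapp" runs in the intervals 0-10, 20-50, and 80-100 of the diagram. It ignores the user/kernel filtering mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One scheduling interval (hypothetical type for this toy model)."""
    task: str   # task running on the CPU during the interval
    start: int  # free-running instruction count when the interval begins
    end: int    # free-running instruction count when the interval ends


def per_task_count(timeline, task):
    """Instructions attributed to `task`: only intervals where it runs."""
    return sum(seg.end - seg.start for seg in timeline if seg.task == task)


def system_wide_count(timeline):
    """Instructions retired on the CPU regardless of which task ran."""
    return sum(seg.end - seg.start for seg in timeline)


# Assumed reading of the diagram in the question: "other" stands for
# whatever the CPU executes between the context switches.
timeline = [
    Segment("myapp", 0, 10),
    Segment("other", 10, 20),
    Segment("myapp", 20, 50),
    Segment("other", 50, 80),
    Segment("myapp", 80, 100),
]

print(per_task_count(timeline, "myapp"))  # 60
print(system_wide_count(timeline))        # 100
```

In this model the per-task figure corresponds to plain `perf stat ./myapp`, while the total corresponds to a system-wide count such as one taken with `-a`.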
