
I want to compare the output of perf-stat with that of likwid-perfctr. Is there a way to do that? I tried running two commands, one for perf-stat and the other for likwid-perfctr. The commands are:

sudo perf stat -C 2 -e instructions,BR_INST_RETIRED.ALL_BRANCHES,branches,rc004,INST_RETIRED.ANY ./loop

sudo likwid-perfctr -C 2 -g MYLIST1 -f ./loop

The first command uses perf-stat and deliberately counts branches and instructions redundantly, through several equivalent events. The second command uses likwid-perfctr to capture similar data. Note that I wrote my own group, called MYLIST1, for likwid-perfctr.

But when I compare the two results, they turn out to be quite different. (Image: Output Comparison)

Looking at the output, INST_RETIRED.ANY is 15552 in perf stat versus 190594 in likwid-perfctr, and branches are 3168 versus 42744.

I'm not sure what I'm doing wrong. Is there a way to do this comparison properly?

  • Your loop program is too short; it can be useful to make it run for a longer period of time. perf stat and likwid-perfctr may be configured differently with respect to counting events while kernel code is running on behalf of your program. For [perf stat](https://man7.org/linux/man-pages/man1/perf-stat.1.html) you can add the ':u' suffix to every event to count only user-space code (your program). Repeating the measurement several times may help too. – osgx Apr 27 '21 at 13:52
  • I don't think the loop is the issue, because I ran it with various values, like (for i=0; i<10; i++), replacing 10 with values up to 1000. I would also expect that the smaller the loop count, the smaller the difference between perf and likwid-perfctr. Shouldn't it be? As for the other suggestion about adding :u, I believe that will reduce the perf values even further. Won't it? – Kumar Thummapudi May 02 '21 at 17:37
  • Kumar, could you try a very high count in the loop, for example 100 million or 1000 million, or another count that gives a typical program run time like 10 seconds or 100 seconds? This may shrink the overhead portion of the run and make the results of perf and perfctr easier to compare. – osgx Jul 02 '21 at 04:09
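The ':u' and repetition suggestions from the comments could be sketched roughly as below. This is only an illustration, not a verified fix: it takes the event list from the question, appends ':u' to each event, and prints a perf stat command that uses `-r 5` (perf's repeat option) rather than running it. The variable names `events` and `user_events` and the repeat count of 5 are assumptions made for this example.

```shell
# Event list copied from the question (no spaces between events).
events="instructions,BR_INST_RETIRED.ALL_BRANCHES,branches,rc004,INST_RETIRED.ANY"

# Append the :u modifier to every event so only user-space code is counted.
user_events=$(printf '%s' "$events" | sed 's/,/:u,/g; s/$/:u/')

# Print the resulting command instead of running it; -r 5 repeats the
# measurement five times (the count 5 is an arbitrary choice here).
echo "sudo perf stat -C 2 -r 5 -e $user_events ./loop"
```

Whether likwid-perfctr counts kernel-mode events in the same way is something to check in its own documentation before comparing the numbers.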
