
I want to get branch prediction data for a single process on CentOS 7, so I ran "perf stat -B -e branches,branch-misses ./a.out" on the code below.

#include <algorithm>
#include <cstdlib>   // std::rand
#include <vector>
#include <iostream>

int main()
{
    // generate data
    const size_t arraySize = 32768;
    std::vector<int> data(arraySize);

    for (unsigned c = 0; c < arraySize; ++c)
        data[c] = std::rand() % 256;


    // If the data are sorted like shown here the program runs about
    // 6x faster (on my test machine, with -O2)
    // std::sort(data.begin(), data.end());

    long long sum = 0;

    for (unsigned i = 0; i < 100000; ++i)
    {
        for (unsigned c = 0; c < arraySize; ++c)
        {
            if (data[c] >= 128)
                sum += data[c];
        }
    }
    std::cout << "sum = " << sum << std::endl;
}

The code comes from another question: "Why is processing a sorted array faster than processing an unsorted array?"

But both counts I get are always zero.

I think it's impossible for both the branch count and the branch-miss count to be zero. Can someone help me find the cause? Thanks.

DSBDO
  • It's possible for `branches` to be zero if you only count user-space, and your process is a statically linked executable that only makes an `_exit` system call directly from `_start`. In your case it's obviously bogus. You don't get any error messages from `perf`? What CPU do you have? – Peter Cordes Mar 23 '20 at 06:51
  • perf didn't print any error message. My machine's CPU is "Intel(R) Xeon(R) CPU E5-26xx v4", 16 cores. – DSBDO Mar 23 '20 at 06:57
  • And I have tried a lot of programs, such as "perf stat ls", but the branch and branch-miss counts are always zero. – DSBDO Mar 23 '20 at 06:59
  • Do other counters work? For example, run `perf stat -d /bin/ls` to use the default set of events plus some cache events. If you're in a VM, it's probably not passing through HW counters at all, but for some reason your kernel is letting it try and get 0. Output should look like [Interpretation of perf stat output](https://stackoverflow.com/q/29046457) – Peter Cordes Mar 23 '20 at 07:01
  • I have tried some programs again, and I found that only the "task-clock (msec)" and "page-faults" counters work. All the other counters were zero. Is this caused by the VM environment? – DSBDO Mar 23 '20 at 07:10
  • task-clock and page-faults are kernel counters, not hardware. So yes, almost certainly the problem is running in a VM. – Peter Cordes Mar 23 '20 at 07:11
  • Oh, thanks for the help. So is there any way to get the counter info in a VM? I have searched Google for a long time but didn't find any solution. – DSBDO Mar 23 '20 at 07:14
  • I'm not aware of any VMs that support PMU access by guests. I think people were/are working on KVM support for it. – Peter Cordes Mar 23 '20 at 07:17
  • Ok, thank you very much. – DSBDO Mar 23 '20 at 07:21
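
The distinction drawn in the comments (kernel-maintained counters keep working, hardware PMU counters read zero) can also be checked without the `perf` tool. Below is a minimal sketch, assuming a Linux system with the `perf_event_open` system call and the `<linux/perf_event.h>` header available. The helper names `count_event` and `verdict` are hypothetical and made up for illustration; they are not part of perf or any library. The sketch counts one event around a small branchy loop in the calling thread and classifies the result.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical helper: count one perf event around a small branchy loop
// executed by the calling thread. Returns std::nullopt if the kernel
// refuses to open or read the counter.
std::optional<std::uint64_t> count_event(std::uint32_t type, std::uint64_t config)
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = type;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;        // start stopped, enable explicitly below
    attr.exclude_kernel = 1;  // count user space only
    attr.exclude_hv = 1;      // do not count the hypervisor

    // pid = 0 and cpu = -1: this thread, on whichever CPU it runs.
    long fd = syscall(SYS_perf_event_open, &attr,
                      static_cast<pid_t>(0), -1, -1, 0UL);
    if (fd < 0)
        return std::nullopt;

    ioctl(static_cast<int>(fd), PERF_EVENT_IOC_RESET, 0);
    ioctl(static_cast<int>(fd), PERF_EVENT_IOC_ENABLE, 0);

    // Workload: the loop back-edge alone is a branch on every iteration.
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 1000000; ++i)
        if (i & 1)
            sink = sink + i;

    ioctl(static_cast<int>(fd), PERF_EVENT_IOC_DISABLE, 0);

    std::uint64_t value = 0;
    ssize_t n = read(static_cast<int>(fd), &value, sizeof(value));
    close(static_cast<int>(fd));
    if (n != static_cast<ssize_t>(sizeof(value)))
        return std::nullopt;
    return value;
}

// Hypothetical helper: turn a counter reading into a short label.
std::string verdict(const std::optional<std::uint64_t>& count)
{
    if (!count)
        return "unavailable";  // the kernel would not expose the event
    if (*count == 0)
        return "reads zero";   // opened fine, but nothing was counted
    return "working";
}
```

Calling `verdict(count_event(PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS))` and `verdict(count_event(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK))` from `main` and printing both labels gives a quick comparison. If the software event is "working" while the hardware event "reads zero" or is "unavailable", that matches the situation described above, where the guest has no usable PMU.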

0 Answers