How to count branch mispredictions?

Question

I've got a task to measure the branch misprediction penalty (in ticks), so I wrote this code:

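#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

unsigned long long rdtsc(void);  /* defined elsewhere; returns the tick counter */

int main (int argc, char ** argv) {
    unsigned long long start, end;
    FILE *f;
    long long int k = 0;
    unsigned long long min;
    if (argc < 2) return 1;  /* the upper bound n is required */
    f = fopen("output", "w");
    if (f == NULL) return 1;
    int n = atoi(argv[1]);// n1 = atoi(argv[2]);
    for (int i = 1; i <= n + 40; i++) {  /* i is the period of the branch */
        min = ULLONG_MAX;
        for (int r = 0; r < 1000; r++) {  /* repeat and keep the fastest run */
            start = rdtsc();
            for (long long int j = 0; j < 100000; j++) {
                if (j % i == 0) {
                    k++;
                }
            }
            end = rdtsc();
            if (min > end - start) min = end - start;
        }
        fprintf (f, "%d %llu \n", i, min);  /* min is unsigned, so %llu */
    }
    fclose (f);
    return 0;
}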

(rdtsc is a function that measures time in ticks)

The idea is that the code takes the branch (if (j % i == 0)) periodically, with period i, so once the period is long enough the predictor starts to mispredict it. The rest of the code just repeats the measurement so that I get more precise results.

Tests show that branch mispredictions start to happen around i = 47, but I do not know how to count the exact number of mispredictions, which I need in order to work out the exact number of ticks per misprediction. Can anyone explain how to do this without using external tools like VTune?


Jay · Accepted Answer · 2019-01-05T02:08:19.583

It depends on the processor you are using. In general, the cpuid instruction can be used to obtain a lot of information about the processor, and what cpuid does not provide is typically accessible via SMBIOS or other regions of memory.

Doing this in code in a general way, without the processor's support functions and its manual, will not tell you as much as you want with any great certainty. It may still be useful as an estimate, depending on what you are looking for and on how your code is compiled (for example, which flags you pass to the compiler).

The underlying mechanism is known as speculative execution, and programs typically cannot observe it directly: work that moves through the pipeline and turns out not to be needed is simply discarded.

Depending on how you use specific instructions in your program, you may be able to make use of the stale cache state that discarded work leaves behind, for better or worse, but the logic involved varies greatly with the CPU in use.

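See also Spectre for an interesting example of exploiting speculative execution to read privileged data, and RowHammer for a related hardware-level attack (it exploits DRAM behavior rather than speculation).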

See the comments below for links to code that uses cpuid as well as rdrand, rdseed, and a few other instructions (including rdtsc).

It's not completely clear what you're looking for, but this should get you started and provide some useful examples.
