In the following example, I assume that functions f1-f4 are slow, but short and inlined. It is clear to me on iteration i=j that the taken branch of iteration i=j+1 is dependent on the value of data[j+1], so I can predict it in advance during the computation of iteration i=j.
How can I help the x86 branch predictor see this? Or does it already see it without any changes on my side? If so, how does it work?
    int foo(int* data, int n) {
        int x = 0;
        for (int i = 0; i < n; i++) {
            switch (data[i]) {
                case 0: x = f1(x); break;
                case 1: x = f2(x); break;
                case 2: x = f3(x); break;
                default: x = f4(x); break;
            }
        }
        return x;
    }
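
To make the idea concrete, here is a minimal sketch of what "knowing the next branch in advance" could look like in source: the selector for iteration i+1 is loaded before the work of iteration i starts. The bodies of f1-f4 below are hypothetical stand-ins (the real ones are assumed to be slow), and foo_lookahead, sel and next are names made up for illustration. This only expresses the data dependency described above; whether it changes anything for the hardware predictor is exactly what I am unsure about.

```c
#include <stddef.h>

/* Hypothetical stand-ins for f1-f4; the real functions are assumed slow. */
static inline int f1(int x) { return x + 1; }
static inline int f2(int x) { return x * 2; }
static inline int f3(int x) { return x - 3; }
static inline int f4(int x) { return x + 7; }

/* Same result as foo, but the selector for the next iteration is read
   one iteration early, so its value is available before it is needed. */
int foo_lookahead(const int *data, int n) {
    int x = 0;
    if (n <= 0) {
        return x;
    }
    int sel = data[0];  /* selector for the first iteration */
    for (int i = 0; i < n; i++) {
        /* Load the selector for iteration i+1 now; 0 is a dummy value
           that is never dispatched on after the last iteration. */
        int next = (i + 1 < n) ? data[i + 1] : 0;
        switch (sel) {
            case 0: x = f1(x); break;
            case 1: x = f2(x); break;
            case 2: x = f3(x); break;
            default: x = f4(x); break;
        }
        sel = next;
    }
    return x;
}
```

A compiler is free to hoist or sink that load on its own, so the generated code may end up identical to the original loop.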