1

According to C++ branch-aware prediction, I prepared a test to see how much effective it is.

So, in a control sample, I write:

int count=0;
for (auto _ : state) {
    if(count%13==0) {
        count+=2;
    }
    else
        count++;
    benchmark::DoNotOptimize(count);
}

In a C++11 branch-prediction, I write:

#define LIKELY(condition) __builtin_expect(static_cast<bool>(condition), 1)
#define UNLIKELY(condition) __builtin_expect(static_cast<bool>(condition), 0)

int count=0;
for (auto _ : state) {
    if(UNLIKELY(count%13==0)) {
        count+=2;
    }
    else
        count++;
    benchmark::DoNotOptimize(count);
}

In a C++20,

int count=0;
for (auto _ : state) {
    if(count%13==0)[[unlikely]]{
        count+=2;
    }
    else
        count++;
    benchmark::DoNotOptimize(count);
}

which unfortunately is not supported under quick-bench. But anyway, I leave it there.

Now, getting the benchmark under gcc and clang shows no effectiveness for such a basic example.

Am I doing anything wrong?

c++ branch-aware prediction

ar2015
  • 5,558
  • 8
  • 53
  • 110
  • 1
    I appreciate downvoter if he/she gives explanations. – ar2015 Sep 27 '18 at 14:42
  • 2
    Why do you expect `count+=2` to take different time to execute, than `count++`? – Algirdas Preidžius Sep 27 '18 at 14:42
  • 1
    @AlgirdasPreidžius, The performance comes from `unlikely` and branch prediction, not from how much I add to `count`. The `if` and `else` should be different. Otherwise they are optimized away. – ar2015 Sep 27 '18 at 14:44
  • @AlgirdasPreidžius, I follow [this](https://stackoverflow.com/questions/30130930/is-there-a-compiler-hint-for-gcc-to-force-branch-prediction-to-always-go-a-certa). – ar2015 Sep 27 '18 at 14:47
  • That was not what I was asking. Both of the branches, contain the functionality, that is so lightweight, that it isn't even measurable. If the test case, was so, that `if (UNLIKELY (/* Complicated condition */)) { /* Do something heavy */ } else {/* Do something lightweight */}`, one might see a difference. – Algirdas Preidžius Sep 27 '18 at 14:47
  • 1
    How do you know the compiler isn't optimising the code even without the help of the branch prediction features? Transforming `if(count%13==0) { count+=2;} else count++;` to something equivalent `such as ++count; if (!(count%13)) ++count;` is not exactly a huge step for an optimiser. Nor is unrolling the loop, to remove most of the branching. – Peter Sep 27 '18 at 14:48
  • It is my understanding that `unlikely` style attributes shine when a branch predictor would not be able to recognize a pattern well enough. You give it a fairly easy one, me thinks. – StoryTeller - Unslander Monica Sep 27 '18 at 14:48
  • @AlgirdasPreidžius I don't think the point is to measure work done *in* the branch but to measure CPU cycle loss due to branch miss-prediction. At least that's my interpretation of the question. – François Andrieux Sep 27 '18 at 14:49
  • @FrançoisAndrieux Sure, but if the branches are this lightweight, there is little to no performance impact, due to branch misprediction. If one of the branches are heavier, the effects of branch misprediction becomes more pronounced. – Algirdas Preidžius Sep 27 '18 at 14:50
  • @AlgirdasPreidžius the lighter the branch is, the less it disturb the measurement. – ar2015 Sep 27 '18 at 14:51
  • Generated assembly seems to be exactly the same apart from the function name. So they work the same. Maybe the compiler ignored hint? – luk32 Sep 27 '18 at 14:51
  • @ar2015 You seem to be ignoring (or not reading) what I am stating, so I will not repeat myself. You are comparing against the control, of "no compiler hints" anyway. – Algirdas Preidžius Sep 27 '18 at 14:55
  • @luk32, I see a bit of difference at the bottom of the code. – ar2015 Sep 27 '18 at 14:55
  • @AlgirdasPreidžius, I actually didn't see one of your comments. If branches have equivalent size or different size, does it make difference? – ar2015 Sep 27 '18 at 14:57
  • @ar2015 I don't. Neither [here](http://quick-bench.com/euX9nqZWMuG1zeg0Z3U1r7F_GV8), the assembly is exactly the same for `LIKELY` and `UNLIKELY` macro, except for the jump, which is the function address. I tested gcc. – luk32 Sep 27 '18 at 15:04
  • with `[UN]LIKELY` compiler can generate different code compare without this attribute. if without usual used symmetric case when both `if` and `else` block execute 1 jump (one at begin and 1 at the end). if we use `LIKELY` attribute code became asymmetric - likely block not execute jump at all and unlikely block execute 2 jump. as result likely path became more faster and unlikely block became more slow – RbMm Sep 27 '18 at 15:45
  • Note: Whenever you benchmark performance, make sure you are building with optimization enabled. Not saying you didn't I just didn't see that in the question. – Jesper Juhl Sep 27 '18 at 15:56
  • in what question ? how internal *[un]likely* change code and which effect this can have or why your test not show difference ? – RbMm Sep 27 '18 at 16:08

2 Answers2

6

Your bench shows no difference because the CPU's branch prediction is as good as gcc __builtin_expect to optimize your trivial example.

For a full in-depth explanation of what branch prediction is, see this excellent answer on Stack Overflow.

YSC
  • 38,212
  • 9
  • 96
  • 149
  • Does it mean `branch-prediction` in `C++` is practically useless? – ar2015 Sep 27 '18 at 14:54
  • 3
    @ar2015 - No. It means the compiler can optimise simple test cases like yours quite effectively with or without the `branch-prediction` hint. If you want to really exercise the feature, use a test case where the compiler struggles more to optimise, and branch prediction is likely a more significant consideration – Peter Sep 27 '18 at 15:00
  • @Peter, let's consider an unpredictable case. [clang](http://quick-bench.com/InaeqSkjcOpfZSnn9QqtzOVqAak) and [gcc](http://quick-bench.com/KbRhGwKvdmJSsShxHN0sLaol1kU) give not incentive to use any branch-prediction. – ar2015 Sep 27 '18 at 15:20
  • @ar2015 - in case *gcc* use of *[un]likely* very noticeably changes the code. https://godbolt.org/z/kJPHJP – RbMm Sep 27 '18 at 16:00
  • @ar2015 - You think that case is "unpredictable". I do not. It is a fairly trivial change of your original example, actually. I wouldn't be surprised if a compiler can do enough analysis that it can disregard hints you give it. It's not exactly unusual for a compiler to be able to analyse simple code better than a mere human. – Peter Sep 28 '18 at 09:11
  • More importantly, not all CPUs have branch-prediction. – Yakk - Adam Nevraumont Oct 02 '18 at 17:06
3

Your three tests are all the same.

The first example the compiler guesses which path is hot, it guesses the second one and away it goes.

In the second example you tell the compiler that the second path is hot, so the result is the same as the first.

I reran the tests explicitly specifying a likely and unlikely branch, the following is the result using Clang.

Clang benchmark branch prediction

http://quick-bench.com/-GZXTQk6hvKxm19s8FzgyzgHO0I

Running this test in GCC 8.2 still shows no difference, I believe because it optimises out the branches entirely.

lod
  • 1,098
  • 10
  • 13