My code has a regular pattern: once the condition of an `if` statement is true, it stays true for a while, and once it changes to false, it stays false for a while. Since performance matters in this code, I want to make branch prediction more efficient.

What I've tried so far is writing two versions of this `if` statement, one optimized with "likely" and the other with "unlikely", and using a function pointer to select which one to run. But since a call through a function pointer also disrupts the pipeline, the benchmark shows no difference from a plain `if` statement. So I'm curious whether there's any technique to let the CPU "remember" the last outcome of this `if` statement.

Or do I really need to care about this at all?

Peter Cordes
  • 2
    The CPU's branch predictor is already going to do this. If the first couple of outcomes are the same, it will keep predicting that outcome until the condition changes and you get a miss. Then it will re-evaluate. – NathanOliver Aug 24 '20 at 16:06
  • CPUs already do this, it’s called branch prediction. – mkrieger1 Aug 24 '20 at 16:06
  • Does this answer your question? [Why is processing a sorted array faster than processing an unsorted array?](https://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-processing-an-unsorted-array) – mkrieger1 Aug 24 '20 at 16:07
  • How could the CPU remember so many `if` statements in my code? Or does that mean that if the code is moved out of the CPU cache, the result is forgotten by the CPU? –  Aug 24 '20 at 16:08
  • 1
    @ravenisadesk There should be at least one buffer where it keeps track of which instructions produce which value. – NathanOliver Aug 24 '20 at 16:11
  • You *can* use the "likely" and "unlikely" attributes in C++20 (https://en.cppreference.com/w/cpp/language/attributes/likely), but it's a rather rare case where you need to, and can manage to, outdo the CPU's branch predictor. In general, don't worry about it. – Jesper Juhl Aug 24 '20 at 16:13

1 Answer

3

If it stays the same for a while, the branch predictor will figure that out pretty quickly. That's why sorting an input sometimes makes code run significantly faster; the random unsorted data keeps changing the test result back and forth with no pattern the branch predictor can use, but with sorted data, it has long runs where the branch is always taken or always not taken, and that's the easiest case for branch predictors to handle.

Don't overthink this; let the branch predictor do its job. You don't need to care about it.

ShadowRanger
  • Hi, thanks for answering, but what if the `if` statement is moved out of the CPU cache? Does that mean the CPU has to re-evaluate it? –  Aug 24 '20 at 16:11
  • 5
    BUT, if you do want to overthink this, and (possibly) overengineer a solution, make *sure* you profile to see how much more performant the hand-optimized, complicated code is than the regular, straightforward optimized code. If there is no benefit, or if performance is worse, then ditch the complicated code. – Eljay Aug 24 '20 at 16:11
  • 1
    ^ That is a must. If you don't profile your code, you can't actually know where it needs to be sped up. You might find it's a completely different bit of code that's causing all of the performance issues. – NathanOliver Aug 24 '20 at 16:12
  • 2
    @ravenisadesk: Depends on the branch predictor. But if your `if` statement is being evaluated so infrequently that it's regularly dropping out of the cache, it's not relevant to your overall performance. As a rule, 90% of your time is spent in 10% of your code (sometimes phrased as 80% and 20%, but same principle), and if it's dropping out of cache, it's probably not part of that critical 10%, and all the microoptimizations in the world won't make a meaningful difference in your overall performance. As the others have noted, you *need* profiling to know what to optimize in the first place. – ShadowRanger Aug 24 '20 at 16:14
  • @ShadowRanger, thanks a lot. I think I now have a clearer picture of how branch prediction works; it looks like I was overthinking the performance way too much. –  Aug 24 '20 at 16:19