
Here is some c++ pseudo-code as an example:

bool importantFlag = false;
for (SomeObject& obj : arr) {      // take a reference so each object isn't copied
    if (obj.someBool) {
        importantFlag = true;      // the flag only ever goes from false to true
    }
    obj.doSomethingUnrelated();
}

Obviously, once the if-statement evaluates as true and runs the code inside, there is no reason to even perform the check again since the result will be the same either way. Is the compiler smart enough to recognize this or will it continue checking the if-statement with each loop iteration and possibly redundantly assign importantFlag to true again? This could potentially have a noticeable impact on performance if the number of loop iterations is large, and breaking out of the loop is not an option here.

I generally ignore these kinds of situations and just put my faith into the compiler, but it would be nice to know exactly how it handles these kinds of situations.

Greg
  • Godbolt might help you understand some compiler optimizations: https://godbolt.org/z/fbEnabK6n – selbie Jul 02 '22 at 05:36
  • Compilers don't do branch prediction. Compilers generate a set of instructions that act on data. CPUs, when executing a stream of instructions, do branch prediction (attempting to pick what branch they need to go down) based on heuristics to reason about what branch might be taken next. The compiler doesn't have access to the data your code might process at run time, but the CPU can collect that information. – Peter Jul 02 '22 at 06:13
  • Which compiler? I.e. version, target arch, settings etc? In any case, this is called "constant propagation" and whether it is used can be answered by looking at the generated code. – Ulrich Eckhardt Jul 02 '22 at 07:26
  • The compiler doesn't do branch prediction, it's the CPU that does. However, from C++20 you can give a hint about which branch of your code is most likely to be taken: https://en.cppreference.com/w/cpp/language/attributes/likely (and that will help branch prediction) – Pepijn Kramer Jul 02 '22 at 10:00
  • @PepijnKramer: Hints like `[[likely]]` *don't* usually help with run-time branch prediction. See [Is there a compiler hint for GCC to force branch prediction to always go a certain way?](https://stackoverflow.com/a/30136196) . Recent Intel CPUs don't do static prediction anymore even for branches they haven't seen recently, which is the only way it could matter for most ISAs. See also [Is it possible to tell the branch predictor how likely it is to follow the branch?](https://stackoverflow.com/a/1851445) . It's useful to the compiler to know which path to optimize for, regardless. – Peter Cordes Jul 02 '22 at 10:50

3 Answers


Branch-prediction is a run-time thing, done by the CPU not the compiler.

The relevant optimization here would be if-conversion to a very cheap branchless importantFlag |= obj.someBool;.
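Written out by hand, that if-conversion amounts to something like this (just a sketch, reusing the names from the question's loop):

for (SomeObject& obj : arr) {
    importantFlag |= obj.someBool;   // unconditionally OR in the bool: one cheap instruction, no branch
    obj.doSomethingUnrelated();
}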


Ahead-of-time C++ compilers make machine code for the CPU to run; they aren't interpreters. See also Matt Godbolt's CppCon2017 talk “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid” and How to remove "noise" from GCC/clang assembly output? for more about looking at optimized compiler-generated asm.

I guess what you're suggesting could be making a 2nd version of the loop that doesn't look at the bool, and converting the if() into an if() goto to set the flag once and then run the other version of the loop from this point onward. That would likely not be worth it, since a single OR instruction is so cheap if other members of the same object are already getting accessed.

It's a plausible optimization, but I don't think compilers would typically do it for you. You can of course do it manually (see the sketch below), although you'd have to iterate manually instead of using a range-for, because you want to use the same iterator to start part-way through the range.
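A hand-rolled version might look something like the following. This is only a sketch: the std::vector, the stand-in SomeObject definition, and the processAll wrapper are assumptions made here just to have a compilable example.

#include <vector>

struct SomeObject {                      // stand-in for the question's type
    bool someBool = false;
    void doSomethingUnrelated() {}
};

void processAll(std::vector<SomeObject>& arr, bool& importantFlag) {
    auto it = arr.begin();
    for (; it != arr.end(); ++it) {
        if (it->someBool) {              // first hit: set the flag once,
            importantFlag = true;
            it->doSomethingUnrelated();  // finish this element's work,
            ++it;                        // then drop into the check-free loop
            break;
        }
        it->doSomethingUnrelated();
    }
    for (; it != arr.end(); ++it) {      // rest of the range: no flag check at all
        it->doSomethingUnrelated();
    }
}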


Branch likelihood estimation at compile time is a thing compilers do to figure out whether branchy or branchless code is appropriate, e.g. the question "gcc optimization flag -O3 makes code slower than -O2" covers a case where GCC used CMOV for a branch that looks unpredictable at compile time, but when run on sorted data is actually very predictable. My answer there shows the asm real-world compilers make; note that they don't multi-version the loop, although that wouldn't have been possible there anyway without the compiler knowing the data was sorted.

Compilers also estimate which side of a branch is more likely, so they can lay out the fast path with fewer taken branches. That's what the C++20 [[likely]] / [[unlikely]] hints are for, BTW; they don't actually influence run-time branch prediction, except on some CPUs indirectly via static prediction the first time a CPU sees a branch. A few ISAs, like PowerPC and MIPS, do have "branch-likely" instructions with actual run-time hints for the CPU, which compilers might or might not actually use even if available.
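For reference, the C++20 hint syntax on the question's loop would look like this (a sketch; whether the true case is actually rare is an assumption here). It tells the compiler which path to lay out and optimize for; it does not program the CPU's branch predictor:

for (SomeObject& obj : arr) {
    if (obj.someBool) [[unlikely]] {   // hint: optimize layout for the not-taken path
        importantFlag = true;
    }
    obj.doSomethingUnrelated();
}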

Peter Cordes
  • That godbolt video (which I've seen before) is gold! – Pepijn Kramer Jul 02 '22 at 10:02
  • @PepijnKramer: Hell yeah. Along with https://www.lighterra.com/papers/modernmicroprocessors/ for cpu-architecture stuff, it's one of those links worth throwing in to an answer or comment on almost any question where it's relevant. (BTW, Godbolt should be capitalized; it's his last name. If we're talking about the site he named after himself, at this point it's basically a common noun so we can get away with "godbolt", but here we're talking about Matt himself giving a talk, so definitely Godbolt.) – Peter Cordes Jul 02 '22 at 10:47

If you expect to have a large data set, you could just use two for loops, with the first one breaking as soon as importantFlag is set to true (see the sketch below). It's hard to know specifically what optimizations the compiler will make, since that isn't well documented.
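A sketch of that idea with explicit indices (assuming arr supports operator[] and size(), and reusing the names from the question); note the second loop still does the per-element work, just without the flag check:

std::size_t i = 0;
for (; i < arr.size(); ++i) {
    if (arr[i].someBool) {
        importantFlag = true;
        break;                       // stop checking; arr[i] gets its work done below
    }
    arr[i].doSomethingUnrelated();
}
for (; i < arr.size(); ++i) {        // finish the remaining elements without the check
    arr[i].doSomethingUnrelated();
}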

SetupWizard

Peter Cordes has already given a great answer.

I'd also like to mention short-circuiting.

In this example

if (importantFlag || some_expensive_check()) {
    importantFlag = true;
}

Once importantFlag is set to true, the expensive check will never be performed, since || stops evaluating at the first operand that is true.

infinitezero