C++ 'if' branching: runs faster with more branches?

Question

EDIT: I know about failed branch predictions, that's the main reason that led me to ask this question. If there's more branches, I'd expect there to be more failed predictions. The first and second parts of my code are precisely the same, apart from an extra branch in the second part. This extra branch will always evaluate to true, but still requires the evaluation of i%2. Why is it faster? Compiler optimizations solve the issue, but a new one question arises. Why was it happening when the optimizations were turned off?

Original post: I am trying to understand just how if and else statements affect the performance of my codes in C++. These are the parameters and code I've used: Compiler: g++ (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 Compilation: g++ -Wall ifs_comparison.cpp -o ifs_comparison

int main(){
    clock_t t; 
    int N = 100000000;
    int even_odd[2];

    // Using if statements and being stupid with the else statement.
    // This uses more calculations than the ones that should be needed.
    even_odd[0] = 0;
    even_odd[1] = 0;
    t = clock();
    for(int i = 0; i < N; i++){
        if(i%2==0)
            even_odd[0]++;
        else
        {
            if(i%2==1)
                even_odd[1]++;
        }
    }
    t = clock() - t;
    std::cout << "Using 'if' statements and being stupid with the 'else' statement.\n";
    std::cout << "This took " << ((float)t)/CLOCKS_PER_SEC << "seconds\n";
    std::cout << "Repeat the task but with a simpler 'else' statement\n";




    // Using if statements but simplifying the else statement
    // Reset the variables
    even_odd[0] = 0;
    even_odd[1] = 0;    
    t = clock();
    for(int j = 0; j < N; j++)
    {
        if(j%2==0)
            even_odd[0]++;
        else
            even_odd[1]++;
    }   
    t = clock() - t;
    std::cout << "This took " << ((float)t)/CLOCKS_PER_SEC << "seconds\n";
    std::cout << "Repeat the task but without 'if' statements\n";



    // Not using 'if' statements
    // Reset the variables
    even_odd[0] = 0;
    even_odd[1] = 0;
    t = clock();    
    for(int k = 0; k < N; k++)
            even_odd[k%2]++;
    t = clock()-t;
    std::cout << "This took " << ((float)t)/CLOCKS_PER_SEC << "seconds\n";

    return 0;
}

The output is:

Using 'if' statements and being stupid with the 'else' statement.
This took 0.262272seconds
Repeat the task but with a simpler 'else' statement
This took 0.294445seconds
Repeat the task but without 'if' statements
This took 0.238543seconds

As expected, the part without if statements runs faster. So I thought I'd see just how much faster it'd be if I simplified the else statement by removing the evaluation i%2. To my surprise, it was actually slower! What's going on? Can someone point me to a good reference on just how these if and else statements work?

Duplicate of https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array ? — François Andrieux, Sep 27 '17 at 18:46
Optimizer setting please. And have a read about "branch prediction". Writing meaningful benchmarks is both a skill and an art with modern CPUs: speculative execution. out of order instructions, branch prediction, 3 levels of cache etc. — Richard Critten, Sep 27 '17 at 18:47
I really don't believe it is pure branch prediction case here. The loops seems to be equal - first and second should optimize to have the same time, and they don't... It appears to be some sort of compiling optimization. — Leonardo Alves Machado, Sep 27 '17 at 19:00
I'm almost sure optimization was off... With -O2 (and after fixing it so that the loops aren't optimized away) I'm seeing opposite results. — rustyx, Sep 27 '17 at 19:05
A simple `if` statement consists of a compare instruction (which sets conditional status bits) followed by a conditional jump instruction. The performance issue is that the processor has to change its instruction pointer if the jump is taken. Since your loops are small, they will probably fit into the instruction cache and not need to be reloaded. — Thomas Matthews, Sep 27 '17 at 20:45
Thanks for the comments! Sorry, I forgot to add the compilation options. I have now edited them in. You are correct, I haven't used any compiler optimizations. Using the -O and -O2 flags, the results are consistent with what I'd intuitively expect. This leaves me rather confused though. If all this depends on the type of optimizations the compiler performs, how can I know what to expect when thinking out the code? — Sermal, Sep 27 '17 at 21:02
Possible duplicate of [Why is it faster to process a sorted array than an unsorted array?](https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array) — dandan78, Sep 27 '17 at 21:02
"Why was it happening when the optimizations were turned off" The compiler still does basic optimizations even with all optimizations disabled. The flags merely control optimizations that are risky or slow. If an optimization is risk-free and fast, the compiler does it at all levels, including "off". — Mooing Duck, Sep 27 '17 at 23:44
Also, when you run the program multiple times, what's the deviation/variance? — Mooing Duck, Sep 27 '17 at 23:44
Running it 676 times, I got the following average values, by order, 0.257923, 0.292771 and 0.238236, with the standard deviations 0.001564, 0.002633 and 0.002267. — Sermal, Sep 28 '17 at 00:07

C++ 'if' branching: runs faster with more branches?

0 Answers0