1

I stumbled upon this question/answer which mentions that in most languages, logical operators such as:

x == y && doSomething();

can be faster than doing the same thing with an if branch:

if(x == y) {
  doSomething();
}

Similarly, it says that the ternary operator:

x = y == z ? 0 : 1

is usually faster than using an if branch:

if(y == z) {
  x = 0;
} else {
  x = 1;
}

This got me Googling, which led me to this fantastic answer which explains branch prediction.

Basically, what it says is that the CPU operates at very fast speeds, and rather than slowing down to compute every if branch, it tries to guess what outcome will take place and places the appropriate instructions in its pipeline. But if it makes the wrong guess, it will have to back up and recompute the appropriate instructions.

But this still doesn't explain to me why logical operators or the ternary operator are treated differently than if branches. Since the CPU doesn't know the outcome of x == y, shouldn't it still have to guess whether to place the call to doSomething() (and therefore, all of doSomething's code) into its pipeline? And, therefore, back up if its guess was incorrect? Similarly, for the ternary operator, shouldn't the CPU have to guess whether y == z will evaluate to true when determining what to store in x, and back up if its guess was wrong?

I don't understand why if branches are treated any differently by the compiler than any other statement which is conditional. Shouldn't all conditionals be evaluated the same way?

Community
  • 1
  • 1
A. Duff
  • 4,097
  • 7
  • 38
  • 69
  • Maybe it has to do something that first statement is expression and the other is block of code meaning jmp. – fsacer Jun 03 '15 at 13:51
  • This is more of a compiler question than a CPU question. Ternary operators are functionally almost the same as branches when used this way. But for some reason, major compilers like MSVC and GCC are more likely to issue a conditional branch when the ternary operator is used. – Mysticial Jun 03 '15 at 14:29
  • 1
    Boolean operators are often compiled as branches. The CPU cannot tell the difference. But then, some branches are compiled into branch-free instructions as well. Depends on the quality of the optimizer. If we are talking about C here the compiler has this handled for you. – usr Jun 03 '15 at 18:56

1 Answers1

3

Short answer - it simply isn't. While helping branch prediction could improve you performance - using this as a part a logical statement doesn't change the compiled code. If you want to help branch prediction use __builtin_expect (for GNU)

To emphasize let's compare the compiler output:

#include <stdio.h>


int main(){
        int foo;

        scanf("%d", &foo); /*Needed to eliminate optimizations*/

#ifdef IF       
        if (foo)
                printf("Foo!");
#else
        foo &&  printf("Foo!");
#endif 
        return 0;
}

For gcc -O3 branch.c -DIF We get:

0000000000400540 <main>:
  400540:       48 83 ec 18             sub    $0x18,%rsp
  400544:       31 c0                   xor    %eax,%eax
  400546:       bf 68 06 40 00          mov    $0x400668,%edi
  40054b:       48 8d 74 24 0c          lea    0xc(%rsp),%rsi
  400550:       e8 e3 fe ff ff          callq  400438 <__isoc99_scanf@plt>
  400555:       8b 44 24 0c             mov    0xc(%rsp),%eax
  400559:       85 c0                   test   %eax,%eax #This is the relevant part
  40055b:       74 0c                   je     400569 <main+0x29>
  40055d:       bf 6b 06 40 00          mov    $0x40066b,%edi
  400562:       31 c0                   xor    %eax,%eax
  400564:       e8 af fe ff ff          callq  400418 <printf@plt>
  400569:       31 c0                   xor    %eax,%eax
  40056b:       48 83 c4 18             add    $0x18,%rsp
  40056f:       c3                      retq 

And for gcc -O3 branch.c

0000000000400540 <main>:
  400540:       48 83 ec 18             sub    $0x18,%rsp
  400544:       31 c0                   xor    %eax,%eax
  400546:       bf 68 06 40 00          mov    $0x400668,%edi
  40054b:       48 8d 74 24 0c          lea    0xc(%rsp),%rsi
  400550:       e8 e3 fe ff ff          callq  400438 <__isoc99_scanf@plt>
  400555:       8b 44 24 0c             mov    0xc(%rsp),%eax
  400559:       85 c0                   test   %eax,%eax
  40055b:       74 0c                   je     400569 <main+0x29>
  40055d:       bf 6b 06 40 00          mov    $0x40066b,%edi
  400562:       31 c0                   xor    %eax,%eax
  400564:       e8 af fe ff ff          callq  400418 <printf@plt>
  400569:       31 c0                   xor    %eax,%eax
  40056b:       48 83 c4 18             add    $0x18,%rsp
  40056f:       c3                      retq 

This is exactly the same code.

The question you linked to measures performance for JAVAScript. Note that there it may be interpreted (since Java script is interpreted or JIT depending on the version) to something different for the two cases. Anyway JavaScript is not the best for learning about performance.

kipodi
  • 364
  • 1
  • 7