5

From an optimization and branch-prediction point of view, is there any difference between these two pieces of code?

First:

void think_and_do(){
    if(expression){
        //Set_A of instructions
    }
    else{
        //Set_B of instructions
    }
}

int main(){
    think_and_do();
}

Second:

void do_A(){
    //Set_A of instructions
}

void do_B(){
    //Set_B of instructions
}

int main(){
    if(expression){
        do_A();
    }
    else{
        do_B();
    }
}
Humam Helfawi
  • 4
    I would think you would get the same code, but it depends on the compiler and optimization settings. You could compile both and check the assembly. – NathanOliver Oct 24 '16 at 12:48
  • 1
    this seems to be a question best answered empirically. iterate a few million times over some repeatable pseudo-random data, and [measure it](http://stackoverflow.com/questions/11437523/can-i-measure-branch-prediction-failures-on-a-modern-intel-core-cpu?rq=1). – Cee McSharpface Oct 24 '16 at 12:53
  • 1
    One key factor could be the number of parameters each function needs. In the first case, `think_and_do()` needs all the parameters; in the second case, only `do_A()` or `do_B()` needs its own parameters (typically when `do_A()` creates an object and `do_B()` deletes it). – J. Piquard Oct 24 '16 at 12:56

1 Answer

3

I ran a test on godbolt.org for both versions: the branch in think_and_do and the branch in main.

First observation: if your examples are trivial, they mostly get optimized away. Without the `cin` input in the test code, both examples would have compiled to:

    xor     eax, eax
    add     rsp, 8 #may or may not be present.
    ret 

Second observation: the generated code in `main` is exactly the same in both cases, and none of the functions are called; everything is inlined.
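To make the comparison concrete, here is a minimal sketch of the two variants. The bodies (`x + 1` for Set_A, `x * 2` for Set_B), the `flag` parameter standing in for `expression`, and the names `first_variant` and `second_variant` are assumptions for illustration, not the code behind the godbolt links. With optimization enabled, a compiler is free to inline every call here, which is why both variants can end up as the same machine code.

```cpp
// Hypothetical stand-ins for Set_A and Set_B.
static int do_A(int x) { return x + 1; }
static int do_B(int x) { return x * 2; }

// First variant: the branch lives inside the callee.
static int think_and_do(bool flag, int x) {
    if (flag) {
        return x + 1;   // Set_A
    } else {
        return x * 2;   // Set_B
    }
}

int first_variant(bool flag, int x) {
    return think_and_do(flag, x);
}

// Second variant: the branch lives in the caller.
int second_variant(bool flag, int x) {
    if (flag) {
        return do_A(x);
    } else {
        return do_B(x);
    }
}
```

Both functions return the same value for the same inputs, so any difference can only be in how the calls and the branch are laid out, and that layout disappears once the compiler inlines the callees.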

Third observation: both examples produce the following code:

    mov     edx, DWORD PTR a[rip]
    mov     eax, DWORD PTR b[rip]
    cmp     edx, eax
    je      .L8

That is, they fill one issue cycle with 4 instructions to make the most of the issue bandwidth (ignoring the possibility of macro-fusion of the `cmp` and the jump).

If they had instead started with just

    cmp     edx, eax
    je      .L8

half of the issue bandwidth would potentially have been wasted.
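For the case where the functions cannot be inlined, the difference is probably best measured rather than guessed. The following is a rough sketch of such a measurement, not a rigorous benchmark. The `NOINLINE` macro, the `bench_*` function names, the `Result` struct, the fixed seed, and the stand-in bodies for Set_A and Set_B are all assumptions made for this example; the noinline attribute spellings are compiler-specific (GCC/Clang and MSVC).

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Compiler-specific way to forbid inlining (helper macro for this sketch).
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE
#endif

// Hypothetical stand-ins for Set_A and Set_B.
NOINLINE std::uint64_t bench_do_A(std::uint64_t x) { return x + 1; }
NOINLINE std::uint64_t bench_do_B(std::uint64_t x) { return x * 3; }

// First variant: the branch is inside the non-inlined callee.
NOINLINE std::uint64_t bench_think_and_do(bool flag, std::uint64_t x) {
    if (flag) {
        return x + 1;
    } else {
        return x * 3;
    }
}

// Repeatable pseudo-random branch conditions (fixed seed by default).
std::vector<unsigned char> make_flags(std::size_t n, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::vector<unsigned char> flags(n);
    for (auto& f : flags) {
        f = static_cast<unsigned char>(gen() & 1u);
    }
    return flags;
}

struct Result {
    std::uint64_t checksum;  // keeps the loop from being optimized away
    double milliseconds;     // wall-clock time of the loop
};

Result run_first(const std::vector<unsigned char>& flags) {
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (unsigned char f : flags) {
        sum = bench_think_and_do(f != 0, sum);
    }
    const auto stop = std::chrono::steady_clock::now();
    return {sum, std::chrono::duration<double, std::milli>(stop - start).count()};
}

Result run_second(const std::vector<unsigned char>& flags) {
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (unsigned char f : flags) {
        // Second variant: the branch is in the caller.
        if (f != 0) {
            sum = bench_do_A(sum);
        } else {
            sum = bench_do_B(sum);
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return {sum, std::chrono::duration<double, std::milli>(stop - start).count()};
}
```

Calling `run_first` and `run_second` on the same `make_flags(n)` vector for a few million elements and comparing `milliseconds` gives only a coarse indication. The checksums should match, which confirms both loops did the same work. Timings will vary by compiler, flags, and CPU, and observing mispredictions directly requires hardware performance counters.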

Surt
  • 1
    I think the point was about cases where you can't inline the function – Leeor Oct 28 '16 at 12:07
  • @Leeor And the conclusion is that the compiler inlines for you even if you don't specify it, so it makes no difference. – Surt Oct 31 '16 at 06:39