Starting a function with a branch

Question

From an optimization and branch-predictor point of view, is there any difference between these two pieces of code?

First:

void think_and_do(){
    if(expression){
        //Set_A of instructions
    }
    else{
        //Set_B of instructions
    }
}

int main(){
    think_and_do();
}

Second:

void do_A(){
    //Set_A of instructions
}

void do_B(){
    //Set_B of instructions
}

int main(){
    if(expression){
        do_A();
    }
    else{
        do_B();
    }
}

I would think you would get the same code, but it depends on the compiler and optimization settings. You could compile both and check the assembly. — NathanOliver, Oct 24 '16 at 12:48
This seems to be a question best answered empirically. Iterate a few million times over some repeatable pseudo-random data, and [measure it](http://stackoverflow.com/questions/11437523/can-i-measure-branch-prediction-failures-on-a-modern-intel-core-cpu?rq=1). — Cee McSharpface, Oct 24 '16 at 12:53
One key factor could be the number of parameters needed by the functions. In the first case, `think_and_do()` needs to receive all the parameters; in the second case, only `do_A()` or `do_B()` needs its own parameters (typically when `do_A()` creates an object and `do_B()` deletes that object). — J. Piquard, Oct 24 '16 at 12:56
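The suggestion above to measure empirically could be sketched roughly as follows. This is only an illustration under assumptions: the bodies standing in for Set_A and Set_B, the condition `v & 1`, and the helper names `make_data`, `time_first`, `time_second` and `Timing` are all hypothetical and not part of the question. Wall-clock time is only a proxy; actual branch-miss counts would have to come from hardware performance counters, as discussed in the linked question.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-ins for Set_A and Set_B of instructions.
void do_A(std::uint64_t& acc, int v) { acc += static_cast<std::uint64_t>(v); }
void do_B(std::uint64_t& acc, int v) { acc ^= static_cast<std::uint64_t>(v); }

// First form: the branch is inside the callee.
void think_and_do(std::uint64_t& acc, int v) {
    if (v & 1) {
        acc += static_cast<std::uint64_t>(v);  // Set_A
    } else {
        acc ^= static_cast<std::uint64_t>(v);  // Set_B
    }
}

struct Timing {
    std::uint64_t checksum;  // keeps the work observable
    double milliseconds;     // elapsed wall-clock time
};

// Repeatable pseudo-random input: the same seed gives the same data.
std::vector<int> make_data(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 1000);
    std::vector<int> data(n);
    for (int& v : data) v = dist(gen);
    return data;
}

// Times the first form over the whole data set.
Timing time_first(const std::vector<int>& data) {
    auto start = std::chrono::steady_clock::now();
    std::uint64_t acc = 0;
    for (int v : data) think_and_do(acc, v);
    auto stop = std::chrono::steady_clock::now();
    return {acc, std::chrono::duration<double, std::milli>(stop - start).count()};
}

// Times the second form: the branch is in the caller.
Timing time_second(const std::vector<int>& data) {
    auto start = std::chrono::steady_clock::now();
    std::uint64_t acc = 0;
    for (int v : data) {
        if (v & 1) do_A(acc, v); else do_B(acc, v);
    }
    auto stop = std::chrono::steady_clock::now();
    return {acc, std::chrono::duration<double, std::milli>(stop - start).count()};
}
```

Calling `time_first(make_data(5000000, 42))` and `time_second(...)` on the same data and comparing `milliseconds` gives a first impression; both checksums should match, which confirms the two forms did the same work.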

score 3 · Answer 1 · answered Oct 25 '16 at 22:24

I made a test on godbolt.org, with one version branching in think_and_do and one branching in main.

First observation: if your examples are trivial, they mostly get optimized away. Without the cin, both examples would have compiled to:

    xor     eax, eax
    add     rsp, 8 #may or may not be present.
    ret

Second observation: the code in main is exactly the same for both versions, and none of the functions are called; everything is inlined.

Third observation: both examples produce the following code:

    mov     edx, DWORD PTR a[rip]
    mov     eax, DWORD PTR b[rip]
    cmp     edx, eax
    je      .L8

That is, they fill one issue cycle with 4 instructions to make the most of the issue bandwidth (ignoring the possibility of macro-fusion of the cmp and the jump).

If they had started with just

    cmp     edx, eax
    je      .L8

half of the issue bandwidth would potentially have been wasted.
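The test program behind these observations is not reproduced in the answer. A minimal sketch of what it might have looked like, assuming globals `a` and `b` (suggested by `a[rip]` and `b[rip]` in the assembly) and the condition `a == b` (suggested by `cmp` followed by `je`): the `result` variable, the values 1 and 2, and the wrappers `run_first` and `run_second` are hypothetical. In the original test, `a` and `b` were presumably read from `std::cin` so the compiler could not fold the branch away.

```cpp
// Assumed globals; in the answer's test they were presumably filled from std::cin.
int a = 0;
int b = 0;
int result = 0;

// First form: the function starts with the branch.
void think_and_do() {
    if (a == b) {
        result = 1;  // stands in for Set_A
    } else {
        result = 2;  // stands in for Set_B
    }
}

// Second form: the caller branches, each function holds one set.
void do_A() { result = 1; }  // stands in for Set_A
void do_B() { result = 2; }  // stands in for Set_B

// Hypothetical wrapper for the first form.
int run_first(int x, int y) {
    a = x;
    b = y;
    think_and_do();
    return result;
}

// Hypothetical wrapper for the second form.
int run_second(int x, int y) {
    a = x;
    b = y;
    if (a == b) {
        do_A();
    } else {
        do_B();
    }
    return result;
}
```

With optimization enabled, a compiler is free to inline `think_and_do`, `do_A` and `do_B`, at which point both wrappers contain the same load, compare and conditional jump.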

I think the point was about cases where you can't inline the function — Leeor, Oct 28 '16 at 12:07
@Leeor And the conclusion is that the compiler inlines for you even if you don't specify it, so it makes no difference. — Surt, Oct 31 '16 at 06:39
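To look at the non-inlined case raised in the comments, one option is to forbid inlining explicitly and compare the generated assembly again. The following is a sketch under assumptions: the `NOINLINE` macro, the globals, and the wrapper names `branch_in_callee` and `branch_in_caller` are hypothetical; `__attribute__((noinline))` is the GCC/Clang spelling and `__declspec(noinline)` the MSVC one.

```cpp
// Hypothetical helper macro that selects the compiler-specific noinline spelling.
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Assumed globals, mirroring a[rip] and b[rip] in the answer's assembly.
int a = 0;
int b = 0;
int result = 0;

// First form: the call is unconditional, the branch is the first thing in the callee.
NOINLINE void think_and_do() {
    if (a == b) {
        result = 1;  // stands in for Set_A
    } else {
        result = 2;  // stands in for Set_B
    }
}

// Second form: the branch happens in the caller and selects which call is made.
NOINLINE void do_A() { result = 1; }  // stands in for Set_A
NOINLINE void do_B() { result = 2; }  // stands in for Set_B

int branch_in_callee(int x, int y) {
    a = x;
    b = y;
    think_and_do();
    return result;
}

int branch_in_caller(int x, int y) {
    a = x;
    b = y;
    if (a == b) {
        do_A();
    } else {
        do_B();
    }
    return result;
}
```

Both wrappers compute the same result; whether the surviving call makes a measurable difference to prediction is something the sketch does not settle and would need to be measured on the target CPU.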
