Does
if (k==5) n=k;
compile to the equivalent of
if (k==5) n=5;
in Visual Studio C++?
As stated above in the comments, what happens depends on k: whether it's local, global, visible, const, etc., but also on your compiler's optimisation settings. For MSVC you could disable optimisation and then look at the generated code to see how it differs.
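For example, if k is a compile-time constant the test can be folded away entirely. A minimal sketch (my own example, not from the question):

constexpr int k = 5;

int test_const() {
    int n = 0;
    if (k == 5) {   // always true for this k, so the compiler can drop the test
        n = k;
    }
    return n;       // with optimisation on, this typically becomes: mov eax, 5 / ret
}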
Not necessarily. There are a lot of things to consider when optimizing assembly. To name a few: how many bytes the binary takes up, branching, instruction execution time, and throughput.
From a simple test of
int test(int k) {
    int n = 0;
    if (k == 5) {
        n = k;
    }
    return n;
}
we get this assembly
int test(int):
xor eax, eax ; int n = 0; Zero out eax, the return register
cmp ecx, 5 ; if(k == 5); Compare ecx (the register holding the argument k) to 5
cmovne ecx, eax ; k = 0; If ecx (k) was not equal to 5, zero it out
mov eax, ecx ; n = k; Set eax (the return value) to ecx (k)
ret ; return n;
As we can see, this is not moving the literal 5 into n. In fact, it's conditionally moving 0 into k, and then moving k into n.
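Rendered back into C++ (my own paraphrase, not code the compiler emits), the generated sequence behaves like this:

int test_like_the_asm(int k) {
    int n = 0;           // xor eax, eax
    if (k != 5) k = 0;   // cmovne ecx, eax -- done without a branch
    n = k;               // mov eax, ecx
    return n;            // ret
}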
To understand why, we need to look at another piece of assembly:
xor eax, eax ; Zero out eax, the return register.
cmp ecx, 5 ; Compare ecx ('k') to 5
jne Return ; If k is not equal to five, jump to return
mov eax, 5 ; Move 5 into the return register
Return:
ret ; Return
This is the same number of instructions, but the compiler can see some glaring problems with it. The first is the conditional jump, which can suffer a branch-prediction failure. The second is that, even with the same instruction count, it takes more bytes to encode those instructions.
The two differing instructions that MSVC produces are
0f 45 c8 cmovne ecx, eax
89 c8 mov eax, ecx
whereas the jumping version produces
75 05 jne Return
b8 05 00 00 00 mov eax, 5
As we can see, MSVC's version requires 2 fewer bytes to represent. The version you describe is worse for the instruction cache. Getting into the details of cache lines, memory loads, instruction decoding, and instruction scheduling would take hours, but it boils down to this: keep data small, use as little of a cache line as possible for a single entry, and pack as many other entries as you can into each cache line.
You might wonder why moving 5 into eax takes so many more bytes to represent. The answer is that it is moving a DWORD (a 4-byte integer) into the register, and that 4-byte immediate needs 4 bytes of encoding on top of the one-byte opcode (b8). With a move between registers, there are only a handful of registers in the CPU, so the source and destination can both be encoded in a single extra byte.
There are a lot of other optimizations we can and cannot see here. For instance, zeroing a register is not done with a mov, but with an xor. This is because xor'ing a register with itself is such a common way to zero it that it takes only one byte for the opcode plus one byte for the registers, and the idiom is so common that your CPU almost certainly recognizes it and handles it as a simple zeroing operation. Again, going over instruction decoding and scheduling would take hours.
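You can see the same idiom in isolation. A trivial function of my own that just returns zero:

int zero() {
    return 0;   // typically compiles to: xor eax, eax / ret, not mov eax, 0
}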
Some other things that were optimized away were pushing a stack frame and using the stack for mutable storage.
In the end, you shouldn't worry about things like this. There are times where micro-optimization is important, but usually the compiler knows better than you. The people who write compilers have read the C++ spec, they've read the x86 software developer's manuals, and they work directly with CPU manufacturers to improve their compilers.
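If you want to convince yourself for your own compiler and flags, a simple experiment (my own sketch, not from the question) is to compile both forms side by side and diff the generated assembly; with optimisation on they are likely to come out identical, because inside the branch the compiler already knows k is 5:

int with_variable(int k) {
    int n = 0;
    if (k == 5) n = k;   // the form from the question
    return n;
}

int with_literal(int k) {
    int n = 0;
    if (k == 5) n = 5;   // the "hand-optimized" form
    return n;
}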
I highly recommend Matt Godbolt's talk "What Has My Compiler Done for Me Lately?" (he's the guy who made the Compiler Explorer that lets me easily view the generated assembly) if you are interested in learning some of the nuances a compiler considers when optimizing code.