
I'm using an x86-64 (AMD64) computer (Intel Pentium Gold 4415U) to compare the assembly instructions generated from some C code (strictly speaking, the disassembly).

On Windows 10, I used Visual Studio 2017 (15.2) with its C compiler. My example code is shown below:

int main() {
    int i = 0;
    if(++i == 4);
    if(i++ == 4);
    return 0;
}

The disassembly is shown below:

mov         eax,dword ptr [i]  // if (++i == 4);
inc         eax  
mov         dword ptr [i],eax  

mov         eax,dword ptr [i]  // if (i++ == 4);
mov         dword ptr [rbp+0D4h],eax    ; save old i to a temporary
mov         eax,dword ptr [i]  
inc         eax  
mov         dword ptr [i],eax  
cmp         dword ptr [rbp+0D4h],4      ; compare with previous i
jne         main+51h (07FF7DDBF3601h)  
mov         dword ptr [rbp+0D8h],1  
jmp         main+5Bh (07FF7DDBF360Bh)  
*mov         dword ptr [rbp+0D8h],0

07FF7DDBF3601 is the address of the last instruction (marked with *).
07FF7DDBF360B is the address of 'return 0;'.

In if (++i == 4), the program doesn't check whether the incremented i satisfies the condition.

However, in if (i++ == 4), the program saves the previous value of i to the stack and then does the increment. After that, it compares the previous i with the integer constant 4.

What causes the difference between these two C statements? Is it just the compiler's mechanism? Will it be different with more complex code?

I searched for this with Google, but I couldn't find the origin of the difference. Do I have to accept that this is just compiler behavior?

Peter Cordes
  • Smells like optimization. – Paul Ogilvie Apr 05 '19 at 10:50
  • The code is equivalent to `return 0;` Anything else may be optimized away by the compiler. – Paul Ogilvie Apr 05 '19 at 10:51
  • @PaulOgilvie: except it's *not* optimized. This is clearly un-optimized compiler output, which compiles each C statement separately, not doing any constant-propagation and effectively treating all locals as `volatile` so the program still works if a debugger *modifies* them. Of course it can still optimize *within* a statement, and remove a jump over an empty `if` body. – Peter Cordes Apr 05 '19 at 10:52
  • Why are you looking at debug mode assembly? What insights are you trying to gain from this? – n. m. could be an AI Apr 05 '19 at 11:43
  • It isn't quite clear what he means. Of course they work differently, that's the reason why there are two operations. We don't need two operations that work the same, now do we? – n. m. could be an AI Apr 05 '19 at 12:16
  • Yes `++i` *sometimes can* be faster, but *debug mode assembly* is not exactly the right place to look for faster things. – n. m. could be an AI Apr 05 '19 at 12:30

1 Answer

Like Paul says, the program has no observable side-effects, and with optimization enabled MSVC or any of the other major compilers (gcc/clang/ICC) will compile main to simply xor eax,eax / ret.

i's value never escapes the function (it's not stored to a global or returned), so it can be optimized away entirely. And even if it did escape, constant-propagation is trivial here.


It's just a quirk / implementation detail that MSVC's debug-mode anti-optimized code-gen decides not to emit a cmp/jcc over an empty if body; even in debug mode that wouldn't be helpful for debugging at all. It would be a branch instruction that jumps to the same address it falls through to.

The point of debug-mode code is that you can single-step through source lines, and modify C variables with a debugger. Not that the asm is a literal and faithful transliteration of C into asm. (And also that the compiler generates it quickly, without spending any effort on quality, to speed up edit/compile/run cycles.) See also: Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?

Exactly how braindead the compiler's code-gen is doesn't depend on any language rules; there are no actual standards that define what compilers have to do in debug-mode as far as actually using a branch instruction for an empty if body.


Apparently with your compiler version, the i++ post-increment was enough to make the compiler forget that the if body was empty?

I can't reproduce your result with MSVC 19.0 or 19.10 (VS2015 or VS2017) on the Godbolt compiler explorer, in 32-bit or 64-bit mode, or with any other MSVC version there. I get no conditional branches at all from MSVC, ICC, or gcc.

MSVC does implement i++ with an actual store to memory for the old value, like you show, though. So terrible. GCC -O0 makes significantly more efficient debug-mode code. Still pretty braindead of course, but within a single statement it's sometimes a lot less bad.

I can reproduce it with clang, though! (But it branches for both ifs):

# clang8.0 -O0
main:                                   # @main
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 4], 0       # default return value

        mov     dword ptr [rbp - 8], 0       # int i=0;

        mov     eax, dword ptr [rbp - 8]
        add     eax, 1
        mov     dword ptr [rbp - 8], eax
        cmp     eax, 4                       # uses the ++i result still in a register
        jne     .LBB0_2                      # jump over if() body
        jmp     .LBB0_2                      # jump over else body, I think.
.LBB0_2:

        mov     eax, dword ptr [rbp - 8]
        mov     ecx, eax
        add     ecx, 1                       # i++ uses a 2nd register
        mov     dword ptr [rbp - 8], ecx
        cmp     eax, 4
        jne     .LBB0_4
        jmp     .LBB0_4
.LBB0_4:

        xor     eax, eax                     # return 0

        pop     rbp                          # tear down stack frame.
        ret
Peter Cordes