The compiler should produce the same assembly for 1 and 2 and it may unroll the loop in option 3. When faced with questions like this, a useful tool you can use to empirically test what's going on is to look at the assembly produced by the compiler. In g++ this can be achieved using the -S
switch.
For example, both options 1 and 2 produce this assembler when compiled with the command g++ -S inc.cpp
(using g++ 4.5.2)
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
addl $5, -4(%rbp)
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
g++ produces significantly less efficient assembler for option 3:
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movl $0, -8(%rbp)
jmp .L2
.L3:
addl $1, -4(%rbp)
addl $1, -8(%rbp)
.L2:
cmpl $4, -8(%rbp)
setle %al
testb %al, %al
jne .L3
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
But with optimisation on (even -O1) g++ produces this for all 3 options:
main:
.LFB0:
.cfi_startproc
leal 5(%rdi), %eax
ret
.cfi_endproc
g++ not only unrolls the loop in option 3, but it also uses the lea instruction to do the addition in a single instruction instead of faffing about with mov
.
So g++ will always produce the same assembly for options 1 and 2. g++ will produce the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you'd probably expect).
(and it looks like you should be able to inspect the assembly produced by C# too, although I've never tried that)