It is simpler to ask GCC (here GCC 7.2 on Linux/Debian/Sid/x86-64) to emit assembler code. So I compiled your program bflash.c
with
gcc -fverbose-asm -O0 -S bflash.c -o bflash-O0.S
to get it without optimization, and with
gcc -fverbose-asm -O1 -S bflash.c -o bflash-O1.S
to get it with -O1
optimization. Feel free to repeat the experiment with other optimization flags.
Even without optimization, bflash-O0.S
contains:
        .section .rodata
.LC0:
        .string "Hello World2\r"
.LC1:
        .string "Hello World3\r\n "
        .text
        .globl main
        .type main, @function
main:
.LFB0:
        .cfi_startproc
        pushq %rbp #
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq %rsp, %rbp #,
        .cfi_def_cfa_register 6
# bflash.c:5: printf("Hello World2\r\n");
        leaq .LC0(%rip), %rdi #,
        call puts@PLT #
# bflash.c:6: printf("Hello World3\r\n ");
        leaq .LC1(%rip), %rdi #,
        movl $0, %eax #,
        call printf@PLT #
# bflash.c:8: return 0;
        movl $0, %eax #, _4
# bflash.c:9: }
        popq %rbp #
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size main, .-main
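As you can see, the first printf
has been optimized into a call to puts
(with the trailing newline dropped from the string literal, since puts appends one itself); this is permitted by the C11 standard n1570 (the as-if rule). The second call stays a printf because its string does not end with a newline. BTW, bflash-O1.S
contains similar code. Notice that the C11 standard was specified with current optimization practices in mind (many members of the standardization committee are compiler implementors).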
BTW Clang 5, invoked as clang-5.0 -O1 -fverbose-asm -S bflash.c -o bflash-01clang.s
, performs the same kind of optimization.
How can i avoid this kind of "optimization"(!?)
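Follow Daniel H's answer (you could also compile with -ffreestanding
, which implies -fno-builtin and so disables the compiler's special knowledge of standard library functions, but I don't recommend that).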
Or avoid using printf
from <stdio.h>
and implement your own, slower, printing function. If you implement your own printing function, give it a different name (since printf
is defined by the C11 standard), and perhaps, if you want, write your own GCC plugin to optimize it your way (that plugin should preferably be free software with a GPL-compatible license; read the GCC runtime library exception).
The C language specification (study n1570) defines semantics, that is, the behavior of your compiled program. It does not require any particular sequence of bytes to appear in the executable (executables are probably not even mentioned in the standard). If you need such a property, find a different programming language, and give up all the important optimizations GCC is trying hard to do for you. Optimizations are what makes writing a C compiler difficult (if you want a non-optimizing compiler, use something other than GCC, but accept losing perhaps a factor of three or more in performance compared with code compiled with gcc -O2
).