I am trying to learn how assembly works at an elementary level, so I have been playing with the -S output of gcc compilations. I wrote a simple program that defines two bytes and returns their sum. The entire program follows:
int main(void) {
    char A = 5;
    char B = 10;
    return A + B;
}
When I compile this with no optimizations using:
gcc -O0 -S -c test.c
I get test.s that looks like the following:
    .file "test.c"
    .def ___main; .scl 2; .type 32; .endef
    .text
    .globl _main
    .def _main; .scl 2; .type 32; .endef
_main:
LFB0:
    .cfi_startproc
    pushl %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl %esp, %ebp
    .cfi_def_cfa_register 5
    andl $-16, %esp
    subl $16, %esp
    call ___main
    movb $5, 15(%esp)
    movb $10, 14(%esp)
    movsbl 15(%esp), %edx
    movsbl 14(%esp), %eax
    addl %edx, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
LFE0:
    .ident "GCC: (GNU) 4.9.2"
Now, recognizing that this program can very easily be simplified to just return a constant (15), I have been able to reduce the assembly by hand to perform the same function with this code:
    .global _main
_main:
    movl $15, %eax
    ret
This appears to me to be the least amount of code possible (though I realize I could be quite wrong) to perform this admittedly trivial task. Is this form the most "optimized" version of my C program?
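To make the comparison concrete, here is a minimal C sketch of what I believe the two forms compute. The function names sum_of_bytes, folded_constant, and check_equivalence are hypothetical helpers of mine and are not part of test.c. The comment linking movsbl to integer promotion is my assumption about the -O0 listing, not something I have confirmed.

```c
#include <assert.h>

/* Hypothetical helper: the same arithmetic as my main(), moved into a function. */
int sum_of_bytes(void) {
    char a = 5;
    char b = 10;
    /* In C, both chars are promoted to int before the addition.
       Assumption: this is why the -O0 output widens each byte with movsbl. */
    return a + b;
}

/* Hypothetical helper: what my hand-reduced assembly (movl $15, %eax) amounts to. */
int folded_constant(void) {
    return 15;
}

/* Hypothetical check that the two forms agree. */
void check_equivalence(void) {
    assert(sum_of_bytes() == folded_constant());
    /* The sum of two chars has type int, not char. */
    assert(sizeof((char)5 + (char)10) == sizeof(int));
}
```

Calling check_equivalence() from a small main and running the program should finish silently if the two forms really are interchangeable.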
Why is the initial output of GCC so much more verbose? What do the lines spanning from .cfi_startproc to call __main even do? What does call __main do? I cannot figure out what the two subtraction operations are for.
Even with optimizations in GCC set to -O3, I get this:
    .file "test.c"
    .def ___main; .scl 2; .type 32; .endef
    .section .text.unlikely,"x"
LCOLDB0:
    .section .text.startup,"x"
LHOTB0:
    .p2align 4,,15
    .globl _main
    .def _main; .scl 2; .type 32; .endef
_main:
LFB0:
    .cfi_startproc
    pushl %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl %esp, %ebp
    .cfi_def_cfa_register 5
    andl $-16, %esp
    call ___main
    movl $15, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
LFE0:
    .section .text.unlikely,"x"
LCOLDE0:
    .section .text.startup,"x"
LHOTE0:
    .ident "GCC: (GNU) 4.9.2"
This seems to have removed a number of operations, but it still leaves all the lines leading up to call __main, which seem unnecessary. What are all the .cfi_XXX lines for? Why are so many labels added? What do .section, .ident, .def, .p2align, etc. do?
I understand that many of the labels and symbols are included for debugging, but shouldn't these be stripped or omitted if I am not compiling with -g enabled?
UPDATE
To clarify, by saying:
This appears to me to be the least amount of code possible (though I realize I could be quite wrong) to perform this admittedly trivial task. Is this form the most "optimized" version of my C program?
I am not suggesting that I am trying to, or have achieved, an optimized version of this program. I realize the program is useless and trivial. I am just using it as a tool to learn assembly and how the compiler works.
I added this bit to illustrate the source of my confusion: the four-line version of this assembly code appears to achieve the same effect as the others. It seems to me that GCC has added a lot of "stuff" whose purpose I cannot discern.