
For my homework assignment I am supposed to convert this C code

 #define UPPER 15
 const int lower = 12;

 int sum = 0;

 int main(void) {
   int i;
   for (i = lower; i < UPPER; i++) {
     sum += i;
   }
   return sum;
 }

into gcc assembly. I compiled it first so I could study the output before translating it by hand (a hand translation will obviously look quite different). This is the assembly code I got:

    .file   "upper.c"
    .globl  lower
    .section    .rodata
    .align 4
    .type   lower, @object
    .size   lower, 4
lower:
    .long   12
    .globl  sum
    .bss
    .align 4
    .type   sum, @object
    .size   sum, 4
sum:
    .zero   4
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $12, -4(%rbp)
    jmp .L2
.L3:
    movl    sum(%rip), %edx
    movl    -4(%rbp), %eax
    addl    %edx, %eax
    movl    %eax, sum(%rip)
    addl    $1, -4(%rbp)
.L2:
    cmpl    $14, -4(%rbp)
    jle .L3
    movl    sum(%rip), %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (SUSE Linux) 4.8.1 20130909 [gcc-4_8-branch revision 202388]"
    .section    .note.GNU-stack,"",@progbits

Now I was wondering if someone could give me a few examples like

  • where the constructs i, lower, UPPER and sum are located in the code
  • where some of the expressions i = lower or i < UPPER are located
  • where the for-loop starts

and such things so I can then get an idea of how the assembler code is constructed. Thank you!

ec-m
  • Compile your code with `gcc -fverbose-asm -S ` – nos May 20 '14 at 19:39
  • This [post](http://stackoverflow.com/a/1289907/1101537) describes how to get readable disassembly. – alexander May 20 '14 at 19:41
  • possible duplicate of [Using GCC to produce readable assembly?](http://stackoverflow.com/questions/1289881/using-gcc-to-produce-readable-assembly) – mjs May 20 '14 at 19:43
  • @nos: Thank you - that is already a little better with the comments on the side. – ec-m May 20 '14 at 19:52
  • @alexander: I already compiled the code into disassembly too, but I don't really understand that either. What do the different sections stand for ("Disassembly section of .plt", .init, etc.), and what do the four blocks represent in, for example: 4003c0: 48 83 ec 08 sub $0x8,%rsp – ec-m May 20 '14 at 19:52
  • @eva, those sections are used by the gcc compiler to prepare the C program's runtime environment before it enters main(): to initialize global variables (like 'int sum = 0' - .init), hold constants (.const) and do other things. Just skip ahead to main() to see how main() is compiled into assembly. – alexander May 20 '14 at 20:16
  • See the post [Linux x86 Program Start Up or - How the heck do we get to main()?](http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html) to understand how global variables get initialized in C programs. I don't understand your goal. What do you want to achieve: to write a custom asm program for Linux, or to understand the C runtime background? – alexander May 20 '14 at 20:46
  • @alexander: My goal is to compile the C code into gcc assembly by hand. That is why I wanted someone to point out a few of the things I listed (i, sum, lower, i = lower, etc) in the already compiled assembler so I could get the hang of it. Your post was already really helpful for understanding the structure of assembly, but could you also point out a couple of the things in my code - it is still a little unclear to me... – ec-m May 21 '14 at 07:21

1 Answer

If I understood your question correctly, here are the answers:

Q: where the constructs i, lower, UPPER and sum are located in the code?

lower is located in the .rodata section (the read-only data section). Its value, .long 12, is stored in the binary image. No code in the program sets it: the Linux loader simply maps the value in from the binary image when the program is loaded.

    .globl  lower
    .section    .rodata
    .align 4
    .type   lower, @object
    .size   lower, 4
lower:
    .long   12

sum is located in the .bss section (the data segment for statically allocated variables that start out as zero). The .zero 4 directive reserves four zero bytes for it. No code in the program stores that zero: the .bss section takes up no space in the file, and the Linux loader maps it as zero-filled memory when the program is loaded. Every variable located in .bss therefore has zero as its initial value (see the Wikipedia article on .bss).

    .globl  sum
    .bss
    .align 4
    .type   sum, @object
    .size   sum, 4
sum:
    .zero   4
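As a small C-side illustration of the same point: C guarantees that objects with static storage duration start at zero whether or not they have an explicit `= 0`. The names `sum_explicit`, `sum_implicit` and `both_start_at_zero` are made up for this sketch; whether a given compiler puts both variables in .bss is an implementation detail that the sketch does not check.

```c
/* Hypothetical names, for illustration only. */
int sum_explicit = 0;   /* like the question's: int sum = 0;             */
int sum_implicit;       /* no initializer: static storage still means 0  */

/* Returns 1 if both globals hold zero before anything has written to them. */
int both_start_at_zero(void) {
    return sum_explicit == 0 && sum_implicit == 0;
}
```

Calling `both_start_at_zero()` first thing in a program should return 1, without any instruction in your own code having stored those zeros.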

UPPER is a preprocessor macro, not a variable. The preprocessor replaces it with 15 before compilation, so no declaration for it appears in the assembly. The only trace is a reference to UPPER-1 (as $14) here:

.L2:
    cmpl    $14, -4(%rbp)
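Why 14 and not 15? For integers, `i < 15` and `i <= 14` are the same test, and the compiler happened to emit the second form (compare against $14, then "jump if less or equal"). A minimal sketch of that equivalence, with two made-up helper names:

```c
#define UPPER 15

/* Hypothetical helper: the condition as written in the source (i < 15). */
int cond_as_written(int i) {
    return i < UPPER;
}

/* Hypothetical helper: the condition as it appears in the assembly
   (cmpl $14 followed by jle), i.e. i <= 14. */
int cond_as_compiled(int i) {
    return i <= 14;
}
```

Both functions return 1 for i = 14 and 0 for i = 15, so the loop runs for exactly the same values of i either way.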

i is a temporary variable on the stack. Its value is accessed through an address relative to %rbp (%rbp points to the current function's stack frame), here -4(%rbp). There is no explicit declaration of i in the assembly. There is also no explicit stack reservation for i (no instruction like sub $0x8,%rsp in main's prologue), I think because main doesn't call any other functions, so it can use the area just below the stack pointer (the "red zone" of the x86-64 System V ABI) without adjusting %rsp. Here is the code that initializes i (note that the compiler knows the initial value of lower is 12 and uses that constant directly instead of reading lower):

movl    $12, -4(%rbp)

Q: where some of the expressions i = lower or i < UPPER are located

i = lower:

movl    $12, -4(%rbp)
jmp .L2

i < UPPER:

.L2:
    cmpl    $14, -4(%rbp)
    jle .L3

i++:

addl    $1, -4(%rbp)

sum += i;:

movl    sum(%rip), %edx
movl    -4(%rbp), %eax
addl    %edx, %eax
movl    %eax, sum(%rip)
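Read as C, those four instructions are a load, a load, an add and a store. A rough sketch, where the function name `add_to_sum` and the locals `edx` and `eax` are made up to mimic the registers:

```c
int sum = 0;

/* Hypothetical helper: `sum += i` split into the same four steps as the
   assembly above. The local names only imitate the registers. */
void add_to_sum(int i) {
    int edx = sum;     /* movl sum(%rip), %edx   load the global sum    */
    int eax = i;       /* movl -4(%rbp), %eax    load i from the stack  */
    eax = eax + edx;   /* addl %edx, %eax        add the two values     */
    sum = eax;         /* movl %eax, sum(%rip)   store the result back  */
}
```

After `add_to_sum(12)` followed by `add_to_sum(13)`, sum holds 25, just as it would after the first two loop iterations.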

return sum; (the %eax register holds the function's return value; for more about this, see X86 calling conventions):

movl    sum(%rip), %eax
popq    %rbp
.cfi_def_cfa 7, 8
ret

Q: where the for-loop starts

It starts here:

movl    $12, -4(%rbp)
jmp .L2
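To see how the pieces fit together, it can help to rewrite the loop in C using goto so that it has the same shape as the compiled code: initialize, jump to the test at the bottom, and jump back up while the test holds. This is only a sketch; the function name `main_as_gotos` is made up, and the labels L2 and L3 are chosen to match .L2 and .L3.

```c
#define UPPER 15
const int lower = 12;

int sum = 0;

/* Hypothetical rewrite of main() with the control flow of the assembly. */
int main_as_gotos(void) {
    int i;

    i = lower;              /* movl $12, -4(%rbp)                     */
    goto L2;                /* jmp .L2: check the condition first     */
L3:
    sum += i;               /* the four instructions after .L3        */
    i++;                    /* addl $1, -4(%rbp)                      */
L2:
    if (i <= UPPER - 1)     /* cmpl $14, -4(%rbp) ; jle .L3           */
        goto L3;
    return sum;             /* movl sum(%rip), %eax ; popq ; ret      */
}
```

Called once, this returns 12 + 13 + 14 = 39, the same value the original main returns. When translating by hand, writing the goto version first and then turning each line into one or two instructions is one workable approach.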
alexander
  • Awesome - thank you so much! I noticed due to your answer that I had misinterpreted a lot in the code but now it is much clearer! – ec-m May 21 '14 at 13:11