I've compiled the following C program to see how non-local variables are handled in asm:

int a=1;
int b=2;

int main() {
    int c = 3;
    return a+b+c;
}

Compiling it with $ gcc file.c -S gives me:

    .file   "isec.c"
    .globl  a
    .data
    .align 4
    .type   a, @object
    .size   a, 4
a:
    .long   1
    .globl  b
    .align 4
    .type   b, @object
    .size   b, 4
b:
    .long   2
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp

I'm a bit confused about the ordering of the items above LFB0. For example, why isn't it just:

a: .long 1
b: .long 2

What is everything else for? And, if it is going to list a bunch of globals, why doesn't it do something like:

.globl a
.globl b
.globl main

I guess I'm confused about the ordering and why the top sections are organized like they are.

I'm very new to asm, but I would have expected it to compile to something along the lines of:

.globl a
.globl b
.globl main

a: .long 1
b: .long 2

main:
    mov     a(%rip),    %eax
    add     b(%rip),    %eax
    add     $3,         %eax
    ret

Additionally, is it required to make a and b global? (What's the point of marking a symbol global? As far as I can tell, it's only required for the main function.)

David542
  • Clearly `int a=1` produced the `a` variable and all of the metadata (global, alignment, debuginfo) about it. Why would a compiler store all of that up, just to emit it in a different order? The assembly is formatted for the consumption of the assembler first, and the human second. I'm not really sure the point of your question: the only reasonable answer is "because that's the way the compiler is implemented". – Jonathon Reinhart Aug 04 '20 at 02:20
  • GCC doesn't try to optimize by leaving out `.align` - I'm not sure it really tracks alignment at all, just has rules for when to emit `.align`. It does remember what section it's in, so it can omit `.data` for the 2nd one. If you were writing a compiler, surely it would make the most sense to emit all the directives for one static object as a group, instead of looping over static objects multiple times for different kinds of directives. If you were writing by hand, yes your way looks like reasonable style, although I might prefer keeping `.globl` next to the var. – Peter Cordes Aug 04 '20 at 02:21
  • @JonathonReinhart just trying to understand what the different parts are. – David542 Aug 04 '20 at 02:22
  • @PeterCordes is it required to make `a` and `b` global? It doesn't do that when I compile it in godbolt, but it does when I do it in gcc. – David542 Aug 04 '20 at 02:24
  • @PeterCordes and where could I learn more about the different metadata fields, such as `.align`, `.globl`, etc. ? – David542 Aug 04 '20 at 02:26
  • Yes, when compiling C source that does `int a=1; int b=2;` at global scope without the `static` keyword, of course `.globl` is required; other object files can access those symbols. Re: Godbolt: you're forgetting the answer to [Running an external compiler (godbolt) assembly](https://stackoverflow.com/q/63208881) which you asked a couple days ago: GCC on godbolt *does* do that, the scripts just filter it out by default as noise. – Peter Cordes Aug 04 '20 at 02:27
  • You can learn more about GAS directives in the GAS manual, https://sourceware.org/binutils/docs/as/ which multiple people have linked you multiple times. The documentation there is not super-detailed, except for `.align` which is only relevant to the assembler. For their effect on ELF metadata, I guess you could read the ELF spec. – Peter Cordes Aug 04 '20 at 02:28
  • @PeterCordes sure, thanks for the link. But looking at something like this -- https://sourceware.org/binutils/docs/as/Global.html#Global -- isn't too helpful for someone that is almost a total beginner like I am. – David542 Aug 04 '20 at 02:30
  • I don't see what's unclear about that. You've used C and/or C++, right, so you know the basics of how linking works, and the difference between `static int a=1;` (per file private) vs. `int a=1` (global), right? See https://en.wikipedia.org/wiki/Symbol_table for more about what a symbol table is. – Peter Cordes Aug 04 '20 at 02:33
  • You can infer a lot about how things work from how GCC uses them to implement different C semantics, like `static` vs. non-static. Some of the metadata like `.size` and `.type` are pretty obscure (and you generally don't need them in hand-written asm), but the basic purpose should be obvious from the name. – Peter Cordes Aug 04 '20 at 02:37
  • @PeterCordes oh, I see now with the global -- thanks! I compiled a program with `int a=1; static int b=2;` and that kind of showed the difference and usage of `global`. Thanks for pointing that out. – David542 Aug 04 '20 at 02:38
  • > *is it required to make a and b global?* It is required if you ask the compiler to do so, which you most certainly did. – Waqar Aug 04 '20 at 11:41
  • @David542 godbolt cleans up the assembly for you so that you can focus on the important stuff. If you want to know how, read [how-to-generate-godbolt-like-clean-assembly-locally](https://stackoverflow.com/questions/63015986/how-to-generate-godbolt-like-clean-assembly-locally) – Waqar Aug 04 '20 at 11:46
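
The `static` versus non-static experiment mentioned in the comments can be sketched as below. This is only an illustration under assumptions: the function name `sum_globals` is hypothetical (it is not in the original post), and the comments describe what GCC would be expected to emit for a typical x86-64 ELF target rather than a guaranteed listing.

```c
/* Sketch: external vs. internal linkage, mirroring the experiment
   described in the comments. sum_globals is a hypothetical name. */

int a = 1;         /* external linkage: other object files may refer to it,
                      so the compiler is expected to emit `.globl a` */
static int b = 2;  /* internal linkage: private to this file,
                      so no `.globl b` is expected */

int sum_globals(void) {
    int c = 3;     /* automatic variable: kept on the stack or in a register,
                      so it gets no symbol or data directives of its own */
    return a + b + c;
}
```

Compiling this with `gcc -S`, as in the question, and comparing the directives emitted for `a` and `b` should show which ones exist only because a symbol is visible to the linker.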
