Comparison of godbolt assembly of basic C program

Question

I've written the following basic C program:

int main() {
    char a = 1;
    char b = 5;
    return a + b;
}

And it compiles in godbolt as:

main:
  pushq %rbp
  movq %rsp, %rbp
  movb $1, -1(%rbp)
  movb $5, -2(%rbp)
  movsbl -1(%rbp), %edx
  movsbl -2(%rbp), %eax
  addl %edx, %eax
  popq %rbp
  ret

I have a few questions about the compiled asm:

Is movb used for 1byte (char), movw for 2byte (short), movl for 4byte (int), and movq for 8byte (int) integers? What then is just mov used for, without an extension?
Why is an offset used for movb $1, -1(%rbp) and movb $5, -2(%rbp)? Why aren't the two numbers just moved into two different registers? For example, there's an addl %edx, %eax later on... why aren't the two numbers just moved into those two registers?
Why is movsbl used here? Why aren't the numbers just moved directly into the registers?
Is pushq / popq pushing/popping an 8byte pointer onto the stack? If so, what's the point of the movq %rsp, %rbp?

Also it's worth turning on optimization. Then the result will be `main: movl $6, %eax; ret` — MikeCAT, Aug 02 '20 at 01:21
@MikeCAT -- thanks, why `movl` instead of `movb` if the number is 6? — David542, Aug 02 '20 at 01:23
Because the return value is an `int`. It would be an error to leave the high 3 bytes of EAX holding whatever garbage main's caller left there. — Peter Cordes, Aug 02 '20 at 01:26
As for why it stores to memory and then uses `movsbl`: because they're separate statements and you compiled without optimization. [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) — Peter Cordes, Aug 02 '20 at 01:28
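The two comments above come down to C's integer promotions: `char` operands are widened to `int` before the addition, which is what `movsbl` (sign-extend byte to 32 bits) does in the asm, and `main` returns an `int`, so all of EAX has to be written. A minimal C sketch of that behavior follows; the helper names `add_chars` and `widen_signed_byte` are hypothetical, introduced only for illustration, and the code assumes a C11 compiler for `static_assert`.

```c
#include <assert.h>

/* Compile-time check: the sum of two char values has type int,
   because both operands are promoted before the addition. */
static_assert(sizeof((char)1 + (char)5) == sizeof(int),
              "char operands are promoted to int");

/* Hypothetical helper mirroring the question's main():
   a and b are widened to int, then added as ints. */
int add_chars(char a, char b) {
    return a + b;
}

/* Hypothetical helper showing the conversion that movsbl performs:
   a signed byte is sign-extended to a 32-bit int, so -1 stays -1
   rather than becoming 255. */
int widen_signed_byte(signed char c) {
    return c;
}
```

Because the addition happens at `int` width, `add_chars(100, 100)` yields 200 rather than wrapping around at 8 bits.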
@PeterCordes thanks, what does the "l" (lowercase L) in `movl` stand for? I would think it is long, but long is 8 bytes (or maybe that's `long long`)? — David542, Aug 02 '20 at 01:29
*What then is just mov used for, without an extension?* - nothing. Instructions always have an operand-size. Compilers choose to make that explicit. But you can omit the AT&T operand-size suffix when it's implied by a register operand, like in all of these cases. For `mov` it's only necessary for `mov $imm, (mem)` where neither operand is a register; with no suffix the size is ambiguous. — Peter Cordes, Aug 02 '20 at 01:30
@David542: asm names for widths were established back in 32-bit days, when 386 was new. asm `long` is a dword, C `int32_t`, and what used to be C `long` in `gcc -m32` 32-bit mode. — Peter Cordes, Aug 02 '20 at 01:32
@PeterCordes -- I see, thanks. And then `q` is quadWord? = 8 bytes? — David542, Aug 02 '20 at 01:34
This is a duplicate of like 3 or 4 different questions. Read some tutorials and/or the GAS manual, but avoid asking 3 unrelated questions in one post, even if they're about the same code. (And yes, of course `q` is quadword, that's why GCC is using it on instructions with 64-bit register operands. Note that `int` is not 8 bytes, that's `long` or `long long`, or `void*`) — Peter Cordes, Aug 02 '20 at 01:34
More duplicates that SO's dup list didn't have room for: [Questions about AT&T x86 Syntax design](https://stackoverflow.com/q/4193827) / [What is the purpose of the RBP register in x86\_64 assembler?](https://stackoverflow.com/a/41914096) (RBP as a frame pointer) / [Why does %rbp point to nothing?](https://stackoverflow.com/q/44687662) — Peter Cordes, Aug 02 '20 at 01:39
If there's anything that's not a duplicate of those links, I guess ask a new question if you can't answer it yourself with some research. — Peter Cordes, Aug 02 '20 at 01:40
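To summarize the suffix discussion in the comments: the AT&T suffixes `b`, `w`, `l`, and `q` denote 1-, 2-, 4-, and 8-byte operands. A rough C sketch of that mapping, using the fixed-width types from `<stdint.h>`, might look like the following. The function name `att_suffix_bytes` is hypothetical, and the sizes assume 8-bit bytes, as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8-bit bytes, so the fixed-width types have these sizes. */
static_assert(sizeof(int32_t) == 4, "asm 'long' (l) is a 4-byte dword");
static_assert(sizeof(int64_t) == 8, "asm 'quad' (q) is an 8-byte qword");

/* Hypothetical helper: operand size in bytes for an AT&T suffix,
   or 0 if the character is not one of b, w, l, q. */
size_t att_suffix_bytes(char suffix) {
    switch (suffix) {
    case 'b': return sizeof(int8_t);   /* byte, e.g. movb */
    case 'w': return sizeof(int16_t);  /* word, e.g. movw */
    case 'l': return sizeof(int32_t);  /* long (dword), e.g. movl */
    case 'q': return sizeof(int64_t);  /* quad (qword), e.g. movq */
    default:  return 0;
    }
}
```

Note that the asm name "long" refers to 4 bytes here, which matches C `int` on x86-64 rather than C `long`.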
