I am trying to understand the disassembled version of this program:

#include <stdio.h>

int main(){
   int i=0;
   printf("HELLO VIK");
   return 0;
}

gdb disassembly:

(gdb) disass main
Dump of assembler code for function main:
0x0000000100000ef0 <main+0>:    push   rbp
0x0000000100000ef1 <main+1>:    mov    rbp,rsp
0x0000000100000ef4 <main+4>:    sub    rsp,0x10
0x0000000100000ef8 <main+8>:    mov    DWORD PTR [rbp-0xc],0x0
0x0000000100000eff <main+15>:   xor    al,al
0x0000000100000f01 <main+17>:   lea    rcx,[rip+0x50]        # 0x100000f58
0x0000000100000f08 <main+24>:   mov    rdi,rcx
0x0000000100000f0b <main+27>:   call   0x100000f2c <dyld_stub_printf>
0x0000000100000f10 <main+32>:   mov    DWORD PTR [rbp-0x8],0x0
0x0000000100000f17 <main+39>:   mov    eax,DWORD PTR [rbp-0x8]
0x0000000100000f1a <main+42>:   mov    DWORD PTR [rbp-0x4],eax
0x0000000100000f1d <main+45>:   mov    eax,DWORD PTR [rbp-0x4]
0x0000000100000f20 <main+48>:   add    rsp,0x10
0x0000000100000f24 <main+52>:   pop    rbp
0x0000000100000f25 <main+53>:   ret

If I understand the first 3 lines correctly, the base pointer is being pushed to the stack as the return address. Then the base pointer is set to the current stack pointer. The size of the stack is set to 16 bytes (0x10). The size of the int i is 12 bytes (0xc) and is set to 0. I'm not sure what (xor al, al) does. Did I interpret this correctly? What does the xor al, al line do?

vikash dat
  • possible duplicate of [Why is %eax zeroed before a call to printf?](http://stackoverflow.com/questions/6212665/why-is-eax-zeroed-before-a-call-to-printf) and http://stackoverflow.com/questions/1396527/any-reason-to-do-a-xor-eax-eax – Ciro Santilli OurBigBook.com Jul 08 '15 at 15:05

1 Answer

xor al,al is a quick way to zero out a register. It's a one-byte opcode in x86 assembler, vs. 2 bytes for mov al, 0.

Marc B
  • So AL is the low bits of the accumulator register, but why is it being zeroed out? – vikash dat Aug 14 '12 at 14:30
  • No idea. It could be the `int i = 0`. Though since `i` isn't actually being used anywhere else in that simple program, it should've been optimized away. – Marc B Aug 14 '12 at 14:37
  • It indicates the number of vector register arguments printf should expect. See [Why is EAX zeroed before a call to printf](http://stackoverflow.com/questions/6212665/why-is-eax-zeroed-before-a-call-to-printf) – Bo Persson Aug 14 '12 at 15:00
  • Thanks Bo Persson, this explains a lot... now I just have to understand vector registers – vikash dat Aug 14 '12 at 15:41
  • What I find more surprising is that it's `al`, rather than `eax` proper, that is being set to zero. It's two bytes in either case, `30 C0` or `31 C0`, and `xor al, al` seems both slow (dependency) and mildly dangerous (what if printf looks at more than just the low byte?) to me. – harold Aug 14 '12 at 15:59
  • I'd hate to see a printf that requires 4 billion formatting chars... 256 would seem to be enough to me... – Marc B Aug 14 '12 at 16:45
  • Me too, but that could just mean that you have to supply it with a lower number instead of printf silently ANDing its argument with 255 - that makes even less sense to me. And anyway that still doesn't explain why it zeroes just `al` and not the rest of `eax` as well. – harold Aug 14 '12 at 16:48
  • @harold: For the record, the ABI requires the count of XMM args to be in AL; the rest of RAX can hold garbage, and the value in AL must be in the range 0..8. (Older versions of the ABI said RAX in a couple places, but current versions are totally clear that it's only AL). And yes, it's 100% normal to zero the whole RAX with `xor eax,eax` because that's more efficient, avoiding a false dependency. The code in the question might be from older clang that didn't know better; clang still likes partial regs. – Peter Cordes Mar 11 '21 at 13:00
  • And yes, this answer is still wrong. All 3 ways mentioned are 2 bytes (xor eax,eax / xor al,al / mov al,0). xor al,al is probably the worst choice: not a zeroing idiom on any CPU, except maybe P6 family that renames low-8 regs separately from the full register, but even then IDK. `mov al,0` has a false dependency on many CPUs, but at least doesn't read the old AL if it's not special-cased. ([What is the best way to set a register to zero in x86 assembly: xor, mov or and?](https://stackoverflow.com/a/33668295)) – Peter Cordes Mar 11 '21 at 13:03