I am currently practicing with assembly reading by disassemblying C programs and trying to understand what they do.
I am stuck with a trivial one: a simple hello world program.
#include <stdio.h>
#include <stdlib.h>
int main() {
printf("Hello, world!");
return(0);
}
When I disassemble the main:
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400526 <+0>: push rbp
0x0000000000400527 <+1>: mov rbp,rsp
0x000000000040052a <+4>: mov edi,0x4005c4
0x000000000040052f <+9>: mov eax,0x0
0x0000000000400534 <+14>: call 0x400400 <printf@plt>
0x0000000000400539 <+19>: mov eax,0x0
0x000000000040053e <+24>: pop rbp
0x000000000040053f <+25>: ret
I understand the first two lines: the base pointer is saved on the stack (by push rbp, which causes the value of the stack pointer to be decreased by 8, because it has "grown") and the value of the stack pointer is saved in the base pointer (so that parameters and local variable can be easily reached through positive and negative offsets, respectively, while the stack can keep "growing").
The third line presents the first issue: why is 0x4005c4 (the address of the "Hello, World!" string) moved in the edi register instead of moving it on the stack? Shouldn't the printf function take the address of that string as parameter? For what I know, functions take parameters from the stack (but here, it looks like the parameter is put in that register: edi)
On another post here on StackOverflow I read that "printf@ptl" is like a stub function that calls the real printf function. I tried to disassemble that function, but it gets even more confusing:
(gdb) disassemble printf
Dump of assembler code for function __printf:
0x00007ffff7a637b0 <+0>: sub rsp,0xd8
0x00007ffff7a637b7 <+7>: test al,al
0x00007ffff7a637b9 <+9>: mov QWORD PTR [rsp+0x28],rsi
0x00007ffff7a637be <+14>: mov QWORD PTR [rsp+0x30],rdx
0x00007ffff7a637c3 <+19>: mov QWORD PTR [rsp+0x38],rcx
0x00007ffff7a637c8 <+24>: mov QWORD PTR [rsp+0x40],r8
0x00007ffff7a637cd <+29>: mov QWORD PTR [rsp+0x48],r9
0x00007ffff7a637d2 <+34>: je 0x7ffff7a6380b <__printf+91>
0x00007ffff7a637d4 <+36>: movaps XMMWORD PTR [rsp+0x50],xmm0
0x00007ffff7a637d9 <+41>: movaps XMMWORD PTR [rsp+0x60],xmm1
0x00007ffff7a637de <+46>: movaps XMMWORD PTR [rsp+0x70],xmm2
0x00007ffff7a637e3 <+51>: movaps XMMWORD PTR [rsp+0x80],xmm3
0x00007ffff7a637eb <+59>: movaps XMMWORD PTR [rsp+0x90],xmm4
0x00007ffff7a637f3 <+67>: movaps XMMWORD PTR [rsp+0xa0],xmm5
0x00007ffff7a637fb <+75>: movaps XMMWORD PTR [rsp+0xb0],xmm6
0x00007ffff7a63803 <+83>: movaps XMMWORD PTR [rsp+0xc0],xmm7
0x00007ffff7a6380b <+91>: lea rax,[rsp+0xe0]
0x00007ffff7a63813 <+99>: mov rsi,rdi
0x00007ffff7a63816 <+102>: lea rdx,[rsp+0x8]
0x00007ffff7a6381b <+107>: mov QWORD PTR [rsp+0x10],rax
0x00007ffff7a63820 <+112>: lea rax,[rsp+0x20]
0x00007ffff7a63825 <+117>: mov DWORD PTR [rsp+0x8],0x8
0x00007ffff7a6382d <+125>: mov DWORD PTR [rsp+0xc],0x30
0x00007ffff7a63835 <+133>: mov QWORD PTR [rsp+0x18],rax
0x00007ffff7a6383a <+138>: mov rax,QWORD PTR [rip+0x36d70f] # 0x7ffff7dd0f50
0x00007ffff7a63841 <+145>: mov rdi,QWORD PTR [rax]
0x00007ffff7a63844 <+148>: call 0x7ffff7a5b130 <_IO_vfprintf_internal>
0x00007ffff7a63849 <+153>: add rsp,0xd8
0x00007ffff7a63850 <+160>: ret
End of assembler dump.
The two mov operations on eax (mov eax, 0x0) bother me a little as well, since I don't get they role in here (but I am more concerned with what I have just described). Thank you in advance.