
I have a simple C function in a separate file, string.c:

void var_init(){
    char *hello = "Hello";
}

compiled with:

gcc -ffreestanding -c string.c -o string.o

Then I run

objdump -d string.o

to see the disassembly listing. What I got is:

string.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <var_init>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # b <var_init+0xb>
   b:   48 89 45 f8             mov    %rax,-0x8(%rbp)
   f:   90                      nop
  10:   5d                      pop    %rbp
  11:   c3                      retq

I'm lost trying to understand this listing. The book "Writing OS from scratch" says something about old disassembly and partly uncovers the mystery, but its listing is completely different, and in mine I don't even see the data interpreted as code that the author describes.

stackoverflower

2 Answers

In addition to the explanation from @VladfromMoscow, I thought it might help the poster to see what happens when you compile to assembly rather than disassembling with objdump, as the data can be seen more plainly (IMO) and the RIP-relative addressing may make a bit more sense.

gcc -S x.c

Yields

    .file   "x.c"
    .text
    .section    .rodata
.LC0:
    .string "Hello"
    .text
    .globl  var_init
    .type   var_init, @function
var_init:
.LFB0:
    pushq   %rbp
    movq    %rsp, %rbp
    leaq    .LC0(%rip), %rax
    movq    %rax, -8(%rbp)
    nop
    popq    %rbp
    ret
.LFE0:
    .size   var_init, .-var_init
    .ident  "GCC: (Alpine 8.3.0) 8.3.0"
    .section    .note.GNU-stack,"",@progbits

w08r
  • Leaving in all the `.cfi` directives and others makes it pretty noisy; you can remove those manually, or look at asm output from https://godbolt.org/. See also [How to remove "noise" from GCC/clang assembly output?](//stackoverflow.com/q/38552116). Also, you can use `-fverbose-asm` if you want that level of detail. But yes, showing the label the RIP-relative addressing is referring to is useful. – Peter Cordes Jan 17 '20 at 06:47
  • Also note that if you compile with `-fno-pie`, it would just use a `mov`-immediate to store the absolute address to the stack, instead of a RIP-relative LEA. The Linux non-PIE "small" memory model implies that addresses of static code/data fit in 32 bits and are known at link time. (Most distros build GCC with `-fpie -pie` as the default, but Godbolt doesn't.) – Peter Cordes Jan 17 '20 at 06:48
  • Thanks @Peter! This is really useful. I've removed the `.cfi` directives as per your suggestion but left the rest, hoping you think that's ok now. – w08r Jan 17 '20 at 06:54

This instruction

lea    0x0(%rip),%rax

loads the address of the string literal into the register rax.

And this instruction

mov    %rax,-0x8(%rbp)

copies the address from rax into stack memory at -0x8(%rbp). The address occupies 8 bytes, which matches the stack offset -0x8.

This store happens only because you compiled without optimization (GCC's default, -O0, often called debug mode); with optimization enabled it would be removed, since hello is never used. Next, the local variables (in the red zone below the stack pointer) are effectively discarded as the function tears down its stack frame and returns.

The material you're looking at probably included a sub $16, %rsp or similar to allocate space for locals below RBP, and deallocated that space later; the x86-64 System V ABI doesn't need that in leaf functions (ones that don't call any other functions), which can just use the red zone. (See also "Where exactly is the red zone on x86-64?".) Alternatively, compile with gcc -mno-red-zone, which you probably want anyway for freestanding code; see "Why can't kernel code use a Red Zone".

Then it restores the saved value of the caller's RBP (which was earlier set up as a frame pointer; notice that space for locals was addressed relative to RBP).

pop    %rbp

and exits, effectively popping the return address into RIP

retq
Peter Cordes
Vlad from Moscow
  • `pop %rbp` doesn't "discard allocated memory"; the function owns the red-zone below RSP right up until it returns. The x86-64 System V ABI has a red-zone (128 bytes below RSP) so it never needs to allocate space for the pointer local var. `pop %rbp` is actually just restoring the caller's RBP (which was saved earlier, before setting up RBP as a frame pointer). If the compiler had allocated any stack space, `mov %rbp, %rsp` would be deallocating it and tearing down the stack frame. – Peter Cordes Jan 17 '20 at 06:15
  • @PeterCordes It is too complicated for the initial level of understanding. :) – Vlad from Moscow Jan 17 '20 at 06:17
  • Yes, I know the way I explained it wouldn't make a good answer for the OP. Even understanding a red-zone at all is easier if you understand the normal way. But it's still important not to make statements that imply the wrong thing. I made an edit; you might want to revert some / all of it if you think it's too much. – Peter Cordes Jan 17 '20 at 06:31
  • Notice that `lea 0x0(%rip),%rax` is not what will actually end up in the final executable: that 0x0 is going to be patched by the linker with the actual location of the string literal (expressed as the difference from the address where the next instruction is going to end up in the executable). – Matteo Italia Jan 17 '20 at 06:44