I have been learning AT&T assembly for a few months now, and I find it really difficult to wrap my head around some of the recurring instructions in my .s file. In particular:

main:
    pushq %rbp
    movq %rsp, %rbp

From the book that I'm using, I concluded that pushq pushes the 64-bit address of the calling function onto the call stack (that is, saves it), while movq copies the value (I suppose it's the address) in the %rsp register to the %rbp register. That is, both of them then contain the address of the base of the stack.

Other sources (thanks Govind) also explain this pretty well: What is the purpose of the RBP register in x86_64 assembler?

I get it: I already know that pushq %rbp saves the caller's frame pointer, or the address of the previous stack frame, but if this is the only function I'm calling in my C program, what was the "previous stack frame" then? That is, what was stored in %rbp before my main function was called?

For example, if my main function calls a function called foo(), then the asm code in my .S file would be something like this:

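foo:
    pushq %rbp              # save the caller's frame pointer
    movq %rsp, %rbp         # start this function's frame

    # whatever instruction

    popq %rbp               # restore the caller's frame pointer
    ret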

In this case, I know what was pushed into %rbp (the address of the call instruction in main). Then it makes sense to save it because we will need to return to the main function (w/ ret). But why do we have to do it in main if main is the only function in my C program?

ken_you_not
  • Almost no x86-64 programs have those at the beginning of each function, because omitting the frame pointer has been the default for years on most platforms. – phuclv Apr 16 '19 at 15:19
  • Possible duplicate of [What is the purpose of the RBP register in x86_64 assembler?](https://stackoverflow.com/questions/41912684/what-is-the-purpose-of-the-rbp-register-in-x86-64-assembler) – Govind Parmar Apr 16 '19 at 15:39
  • The answer is in the book you linked, in [7.2 Program organization](https://bob.cs.sonoma.edu/IntroCompOrg-x64/bookch7.html#x22-840007.2). The second program listing contains comments explaining what those instructions are responsible for. – Serge Ballesta Apr 16 '19 at 15:57
  • @GovindParmar Hi Govind, thank you for the swift reply. I did come across the article you provided prior to posting, but I think it's mostly due to how I asked my question. I have revised the post; please take a look. Thank you. – ken_you_not Apr 16 '19 at 17:16
  • @SergeBallesta Hey Serge, thank you for commenting, I did read that part where it describes its purpose, but my main concern is that if main was the only function that existed, why does it have to do pushq and movq? I've revised the post as well so it would help if you could take a look. Thanks a lot! :) – ken_you_not Apr 16 '19 at 17:18
  • Hey @phuclv. That doesn't quite answer my question, but I tried it out w/ gcc and it did remove the pushq and movq instructions. Thanks for the heads up though. – ken_you_not Apr 16 '19 at 17:19
  • No, `main` is almost always a true C function and it is normally called from some startup code that is normally provided in the standard C library, and that is in charge of preparing the environment for the program including the argc and argv parameters. So it can make sense (depending on the implementation) to save and restore the frame pointer register. – Serge Ballesta Apr 16 '19 at 17:51
  • `but if this is the only function I'm calling in my C program`, if you call some other function in main then main is the previous frame. And if it's main then no, you can't call main, but the OS will call your main, so the previous frame is outside of your program's knowledge – phuclv Apr 17 '19 at 01:40

1 Answer

Those two lines are part of the function prolog, used to set up a new stack (or activation) frame. The first one, pushq %rbp, pushes the base pointer onto the stack. The second one, movq %rsp, %rbp, copies the stack pointer into the base pointer. There is usually at least one more line, where we subtract some value from the stack pointer; this has the effect of moving the stack pointer down.

Recall that on Intel platforms the stack grows downwards: the base pointer (rbp) marks the base of the current stack frame, while the stack pointer (rsp) points to the top of the stack, which is the lowest address in use.
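To make the effect of those lines concrete, here is a minimal sketch in C that models the stack as an array and the two registers as plain integers. The `StackModel` type, the `model_*` function names, and the 512-byte stack size are all made up for illustration; this is a toy model of the instructions, not what a compiler or CPU literally does.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 512

/* Hypothetical toy model: a small stack plus the two registers involved. */
typedef struct {
    uint64_t mem[STACK_BYTES / 8]; /* stack memory, one slot per 8 bytes */
    uint64_t rsp;                  /* stack pointer (byte offset into mem) */
    uint64_t rbp;                  /* frame pointer (any 64-bit value) */
} StackModel;

/* pushq: move rsp down by 8, then store the value at the new top. */
void model_push(StackModel *m, uint64_t value) {
    assert(m->rsp % 8 == 0 && m->rsp >= 8 && m->rsp <= STACK_BYTES);
    m->rsp -= 8;
    m->mem[m->rsp / 8] = value;
}

/* pushq %rbp ; movq %rsp, %rbp ; subq $locals, %rsp */
void model_prologue(StackModel *m, uint64_t locals) {
    model_push(m, m->rbp);    /* save whatever the caller left in rbp */
    m->rbp = m->rsp;          /* new frame base = current top of stack */
    assert(locals <= m->rsp);
    m->rsp -= locals;         /* reserve space for locals below the base */
}

/* Address of the local at -offset(%rbp), e.g. -8(%rbp) for offset 8. */
uint64_t model_local_addr(const StackModel *m, uint64_t offset) {
    assert(offset <= m->rbp);
    return m->rbp - offset;
}
```

For example, starting with rsp = 256 and a caller's rbp of 400, `model_prologue(&m, 16)` leaves rbp = 248 (the slot where 400 is saved) and rsp = 232, and `-8(%rbp)` refers to address 240. The same happens in main: it saves whatever value its caller left in %rbp.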

Now, on 64-bit machines there are more registers, so the reliance on the stack for temporary storage is reduced, and these instructions might not be present in 64-bit code.

These instructions are part of the calling convention, or how functions are called; your book should have a description of what it takes to call a function. Note, however, that there are differences between 32-bit Intel and 64-bit Intel in how functions are called.

The address that we return to (the value in rip, the instruction pointer) is pushed on the stack by the caller's call instruction and will not be seen in the callee code. Typically you call a function with call <fnct_name>; conceptually, however, this can be replaced by the sequence:

 pushq %rip
 jmp  <fnct_address>

(might have gotten the syntax wrong - I don't use AT&T syntax much).
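Since `pushq %rip` is not an instruction that can actually be written (rip is not a valid operand for push), the sequence above is best read as a description of what call does. A rough sketch of that description in C, using a made-up `CallModel` type and hypothetical `do_call`/`do_ret` helpers with arbitrary example addresses:

```c
#include <assert.h>
#include <stdint.h>

#define CALL_STACK_BYTES 256

/* Hypothetical toy model; rip holds the address of the next instruction. */
typedef struct {
    uint64_t mem[CALL_STACK_BYTES / 8]; /* stack memory, 8 bytes per slot */
    uint64_t rsp;                       /* stack pointer (byte offset) */
    uint64_t rip;                       /* instruction pointer */
} CallModel;

/* call target: push the address of the instruction after the call,
   then jump. next_insn is what rip already holds while call executes. */
void do_call(CallModel *m, uint64_t next_insn, uint64_t target) {
    assert(m->rsp % 8 == 0 && m->rsp >= 8 && m->rsp <= CALL_STACK_BYTES);
    m->rsp -= 8;
    m->mem[m->rsp / 8] = next_insn;  /* the return address */
    m->rip = target;                 /* the jmp part */
}

/* ret: pop the return address off the stack into rip. */
void do_ret(CallModel *m) {
    assert(m->rsp % 8 == 0 && m->rsp + 8 <= CALL_STACK_BYTES);
    m->rip = m->mem[m->rsp / 8];
    m->rsp += 8;
}
```

In this model the return address is already on the stack when the callee's first instruction runs, which is why the callee's own `pushq %rbp` saves a frame pointer rather than a return address.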

Now, what we have done to the stack needs to be undone in the function epilog, so we basically add some value to the stack pointer (i.e. move the stack pointer back to where we started), pop rbp off the stack so the base pointer returns to where it was in the caller, and then ret pops the return address into rip so the program knows where to resume execution.
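As a rough illustration of the prolog and epilog cancelling each other out, here is a small C sketch. The `Cpu` type and the `cpu_*` helpers are hypothetical names for a toy model, and the stack size is arbitrary; it only tracks rsp and rbp, not the return address.

```c
#include <assert.h>
#include <stdint.h>

#define CPU_STACK_BYTES 512

/* Hypothetical toy model of the stack and the two frame registers. */
typedef struct {
    uint64_t mem[CPU_STACK_BYTES / 8]; /* stack memory, 8 bytes per slot */
    uint64_t rsp;                      /* stack pointer (byte offset) */
    uint64_t rbp;                      /* frame pointer */
} Cpu;

/* pushq: move rsp down by 8 and store the value there. */
void cpu_push(Cpu *c, uint64_t value) {
    assert(c->rsp % 8 == 0 && c->rsp >= 8 && c->rsp <= CPU_STACK_BYTES);
    c->rsp -= 8;
    c->mem[c->rsp / 8] = value;
}

/* popq: read the value at the top and move rsp up by 8. */
uint64_t cpu_pop(Cpu *c) {
    assert(c->rsp % 8 == 0 && c->rsp + 8 <= CPU_STACK_BYTES);
    uint64_t value = c->mem[c->rsp / 8];
    c->rsp += 8;
    return value;
}

/* Prolog: pushq %rbp ; movq %rsp, %rbp ; subq $locals, %rsp */
void cpu_enter_frame(Cpu *c, uint64_t locals) {
    cpu_push(c, c->rbp);
    c->rbp = c->rsp;
    assert(locals <= c->rsp);
    c->rsp -= locals;
}

/* Epilog: addq $locals, %rsp ; popq %rbp
   (the leave instruction gets the same result via movq %rbp, %rsp ; popq %rbp). */
void cpu_leave_frame(Cpu *c, uint64_t locals) {
    c->rsp += locals;        /* release the locals */
    c->rbp = cpu_pop(c);     /* restore the caller's frame pointer */
}
```

Calling `cpu_enter_frame(&c, 32)` followed by `cpu_leave_frame(&c, 32)` returns both rsp and rbp to the values they had on entry, at which point rsp points at the return address that ret will pop.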

thurizas
  • Hey @thurizas, thanks for answering. I think your answer was in the form of Intel syntax, which is fine. Small things aside, one thing that caught my attention is your phrase "_used to set up a new stack (or activation) frame_": are you implying that when the program goes into the main function, the stack frame doesn't exist until we do pushq %rbp? And that the size of the frame is determined by the offset between the %rbp and %rsp registers? That is, right after the function prolog, the size of the stack frame is technically zero until we subtract a certain value from the %rsp register? – ken_you_not Apr 16 '19 at 17:29
  • Correct. Remember that `main` is not really your application's starting point. `main` is called from some library function (in Linux it happens to be `_start`); that is why you will see a prolog in `main`. There will be one additional feature: typically main wants to be aligned to a 16-byte boundary (as I recall), so there will be some alignment code. – thurizas Apr 16 '19 at 18:20
  • *These instructions are part of the calling convention* No, they're something a function can do *internally* to simplify its own access to its own stack frame (and any stack args passed by its caller). The x86-64 System V ABI says that using RBP as a frame pointer is 100% optional, and `.eh_frame` metadata is required for stack unwinding, so debuggers and exception handlers must never depend on functions following the traditional EBP/RBP frame-pointer convention. (Some tools like `perf` have an option to use RBP for code compiled with `-fno-omit-frame-pointer`, though.) – Peter Cordes Apr 17 '19 at 02:15
  • *can be replaced* well not exactly. `push %rip` isn't a real instruction you can actually write. You'd need a scratch register like `r10` for `lea ret_addr(%rip), %r10` ; `push %r10` ; `jmp target` ; `ret_addr:`. But yes, what `call` does internally can be *described* as `push %rip / jmp` (where RIP during execution of an instruction = the start of the next instruction). – Peter Cordes Apr 17 '19 at 02:18