
I'm working on practice problem 3.34 in Computer Systems: A Programmer's Perspective and trying to understand exactly what is happening. The question states "Consider a function P, which generates local values, named a0-a7. It then calls function Q using these generated values as arguments. GCC produces the following code for the first part of P". We are given the following assembly code:

/* long P(long x)
 * x in %rdi */
P:
  pushq   %r15
  pushq   %r14
  pushq   %r13
  pushq   %r12
  pushq   %rbp
  pushq   %rbx
  subq    $24, %rsp
  leaq    1(%rdi), %r15
  leaq    2(%rdi), %r14
  leaq    3(%rdi), %r13
  leaq    4(%rdi), %r12
  leaq    5(%rdi), %rbp
  leaq    6(%rdi), %rax
  movq    %rax, (%rsp)
  leaq    7(%rdi), %rdx
  movq    %rdx, 8(%rsp)
  movl    $0, %eax
  call    Q

So far, this is what I understand: the instructions pushq %r15 through pushq %rbx push those registers onto the stack to preserve their values, so that they can be restored when procedure P returns (since they are callee-saved registers). I see that the instruction subq $24, %rsp allocates 24 bytes of space on the stack.

I have two questions though:

  1. What are the load effective address lines doing? It seems to me that each one takes the memory location addressed by long x and stores the new memory address (after adding 1 or 2 or ... 7) in one of the callee-saved registers. Is this correct? I'm a bit confused about the values they store. Is there any significance to them? Also, what will function Q do with these registers? How does it know to use them as arguments, since they don't seem to be the argument registers? Only long x is passed on as an argument (as it is in register %rdi).
  2. What are the contents of the stack? I see that 24 bytes were allocated, but I can't seem to account for all of that space :( I understand the stack to look like this:
???????????????????????????????????:16
The result of 7(%rdi) (argument a7):8
The result of 6(%rdi) (argument a6):0 <--- %rsp

I can't seem to account for what is contained in bytes 16-23 :(

Thank you so much in advance, I'm really struggling with this one.

Peter Cordes
  • Hi Nate, no, this is from the textbook (third edition). – Leonard Mohr Oct 18 '21 at 01:14
  • However the extra 8 bytes on the stack would be consistent with [16-byte stack alignment](https://stackoverflow.com/questions/49391001/why-does-the-x86-64-amd64-system-v-abi-mandate-a-16-byte-stack-alignment). – Nate Eldredge Oct 18 '21 at 01:21
  • Oh, it is an (acknowledged) error in the (non-global) book: https://csapp.cs.cmu.edu/3e/errata.html. `Q` is supposed to take no arguments. That would make more sense. – Nate Eldredge Oct 18 '21 at 01:23
  • And the `lea` instructions are just fancy ways of adding constants to `rdi` and storing the result in a new register. So it's something like `a0 = x+1; a1 = x+2; a2 = x+3; ...` – Nate Eldredge Oct 18 '21 at 01:25
  • Ahh, 16-byte stack alignment makes sense!!! I didn’t notice that, but yes, xorl would be more efficient, so more likely for that to be used by GCC. Yes, my thought was that maybe somehow the arguments were being passed in rdi somehow. Like maybe all 8 arguments are of type char and stored in x. And the leaqs somehow communicate that. – Leonard Mohr Oct 18 '21 at 01:31
  • Ahh okay!! Just read your comment. Okay, yes that makes sense!!! Thank you so much! :) – Leonard Mohr Oct 18 '21 at 01:33
  • The question seems inconsistent with the code -- there are only 7 local values created in P prior to calling Q (not 8) and *none* of them are passed as arguments to Q . The only possible argument to Q is x (passed in %rdi); any second argument would be in %rsi which is not touched. – Chris Dodd Oct 18 '21 at 03:16
  • @ChrisDodd: See my comment above. There is a mistake in the problem statement, and `Q` is not supposed to take any arguments. – Nate Eldredge Oct 18 '21 at 04:41

1 Answer


First, note that there is an erratum for this practice problem. The local variables are not passed as arguments to Q; rather Q is being called with no arguments. So that explains why the variables aren't in the argument-passing registers.

(The strange zeroing of eax may be explained by "Differences in the initialization of the EAX register when calling a function in C and C++": they might have accidentally declared void Q(); instead of void Q(void);, and with an unprototyped declaration the compiler has to set al as it would for a variadic call. I'm not sure why the compiler emitted movl $0, %eax instead of the more efficient xorl %eax, %eax; it looks like optimizations are on, and that's a very basic optimization.)

Now as for lea, it's really just an arithmetic instruction, and compilers tend to use it that way. See "What's the purpose of the LEA instruction?". So leaq 1(%rdi), %r15 simply adds 1 to the value in rdi and writes the result to r15. Whether the value in rdi represented an address or a number or something else is irrelevant to the machine. Since rdi contained the argument x, this is effectively doing

a0 = x + 1;
a1 = x + 2;
a2 = x + 3;
// ...

The alternative would be something like movq %rdi, %r15 ; addq $1, %r15 which is more instructions.

Of course, these values are being put in callee-saved registers (or memory, for the last two) so that they are not destroyed by the call to Q().
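To make the point about values staying live across the call concrete, here is a rough C sketch of what P might look like, based only on the seven lea instructions in the listing. The names P_sketch and Q_stub are hypothetical, the stub body is invented so the example links, and the return expression is an assumption (the problem only shows the first part of P).

```c
/* Hypothetical stand-in for Q; the real Q is not shown in the problem. */
long Q_stub(void) { return 0; }

/* Sketch of P consistent with the lea instructions in the listing.
 * Every local is used after the call, so the compiler has to keep it
 * somewhere the callee will not destroy: a callee-saved register or
 * P's own stack frame. */
long P_sketch(long x) {
    long a0 = x + 1;   /* leaq 1(%rdi), %r15 */
    long a1 = x + 2;   /* leaq 2(%rdi), %r14 */
    long a2 = x + 3;   /* leaq 3(%rdi), %r13 */
    long a3 = x + 4;   /* leaq 4(%rdi), %r12 */
    long a4 = x + 5;   /* leaq 5(%rdi), %rbp */
    long a5 = x + 6;   /* leaq 6(%rdi), %rax ; movq %rax, (%rsp) */
    long a6 = x + 7;   /* leaq 7(%rdi), %rdx ; movq %rdx, 8(%rsp) */
    long r = Q_stub(); /* may overwrite any caller-saved register */
    /* Assumed use of the locals after the call. */
    return r + a0 + a1 + a2 + a3 + a4 + a5 + a6;
}
```

If the locals were not used after the call, the compiler would have no reason to preserve them and could drop them entirely.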

As for the stack, the x86-64 ABI requires 16-byte stack alignment. The stack pointer was a multiple of 16 before the call to P, and it must again be a multiple of 16 when we call Q. Our caller's call P pushed 8 bytes to the stack, and the various register pushes in the prologue push 48 bytes more. So in order to end up with a multiple of 16, we must adjust the stack pointer by 8 more than a multiple of 16 (i.e. an odd multiple of 8). We need 16 bytes for local variables, so we must adjust the stack pointer by 24. That leaves 8 bytes of stack that just won't be used for anything, which is your ?????? at 16(%rsp).
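The alignment arithmetic above can be checked with a small C sketch. The function name frame_adjust and its parameters are hypothetical, chosen for illustration; it assumes an 8-byte return address, 8-byte register pushes, and that the stack pointer was a multiple of 16 just before the caller's call instruction.

```c
#include <assert.h>

/* Hypothetical helper: smallest stack adjustment that holds local_bytes
 * and leaves the stack pointer 16-byte aligned at the next call.
 * Assumes the stack pointer was a multiple of 16 before the caller's call. */
long frame_adjust(long pushed_regs, long local_bytes) {
    assert(pushed_regs >= 0 && local_bytes >= 0);
    long used = 8 + 8 * pushed_regs;     /* return address + saved registers */
    long adj = (local_bytes + 7) & ~7L;  /* round locals up to a multiple of 8 */
    if ((used + adj) % 16 != 0)
        adj += 8;                        /* unused padding slot */
    return adj;
}
```

For P, frame_adjust(6, 16) gives 24: 8 + 48 = 56 bytes are already on the stack, 16 more would make 72 (not a multiple of 16), so one extra 8-byte slot is added to reach 80.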

Nate Eldredge
  • `movl $0, %eax` is probably from `gcc -O1` to optimize some without inlining. Only `-O2` and higher include `-fpeephole2` which IIRC is where GCC looks for use-cases for zeroing idioms. – Peter Cordes Oct 18 '21 at 05:15
  • Another canonical for LEA's usage outside of pointer math is [Using LEA on values that aren't addresses / pointers?](https://stackoverflow.com/a/46597375). I dislike the implication in many of the answers on [What's the purpose of the LEA instruction?](https://stackoverflow.com/q/1658294) that using it for arbitrary copy-and-add is some kind of "trick" or not its primary purpose. Whether or not that's true historically, IMO it's useful to simply think of it as a shift-and-add instruction that happens to use memory-operand machine code formats (and asm syntax). – Peter Cordes Oct 18 '21 at 05:21