
I'm a little confused about how to pass arguments on the stack when calling a function. I have the following assembly:

    .data
    str:
        .asciz "%d %d %d %d %d %d %d %d %d %d\n"

    .text
    __entry:
        pushq %rbp
        movq %rsp, %rbp
        leaq str(%rip), %rdi
        movq $1, %rsi
        movq $2, %rdx
        movq $3, %rcx
        movq $4, %r8
        movq $5, %r9
        movq $6, -4(%rbp)
        movq $8, -8(%rbp)
        movq $9, -12(%rbp)
        movq $10, -16(%rbp)
        call _printf
        popq %rbp
        ret

    .global _main
    _main:
        pushq %rbp
        movq %rsp, %rbp
        call __entry
        popq %rbp
        ret

But when I run the program I get a load of junk values. The values passed in registers are fine, but the ones passed on the stack are not. I checked the calling convention, and it says that "additional arguments are passed on the stack" and that they should be "aligned to a 16 byte boundary".

Two questions:

  • What am I doing wrong here with respect to passing my values on the stack?

and:

  • What does it mean by "aligned to a 16 byte boundary"?
flooblebit

1 Answer


Your core problem is that you neglected to allocate space on the stack by decreasing the stack pointer before loading all the arguments on the stack. Naturally, printf immediately trashes whatever is in the area below the stack pointer, causing the garbage printout.

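You also need to fix the stack offsets: movq moves eight bytes at a time, so each slot is eight bytes wide. In addition, printf expects the first stack argument at the lowest address, i.e. at (%rsp) when the call executes, with the remaining ones at increasing addresses. Note also that the format string contains ten %d conversions but your code only passes nine values (7 is missing), so the last conversion has no matching argument.

Your fixed code looks like this:

    pushq %rbp
    movq %rsp, %rbp
    sub $32,%rsp           # new: reserve four 8-byte slots
    leaq str(%rip), %rdi
    movq $1, %rsi
    movq $2, %rdx
    movq $3, %rcx
    movq $4, %r8
    movq $5, %r9
    movq $6, -32(%rbp)     # fixed: first stack argument at the lowest address, i.e. (%rsp)
    movq $8, -24(%rbp)     # fixed: offsets are multiples of 8, not 4
    movq $9, -16(%rbp)     # fixed
    movq $10, -8(%rbp)     # fixed
    xor  %eax,%eax         # new: %al=0 FP register args
    call _printf
    add $32,%rsp           # new
    popq %rbp
    ret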

And the values you pass should once again print correctly. Typically, you use push to load arguments onto the stack. Arguments are pushed right to left, so that the first stack argument ends up at the lowest address. For example, your function call would look like this:

    pushq %rbp
    movq %rsp, %rbp
    leaq str(%rip), %rdi
    movq $1, %rsi
    movq $2, %rdx
    movq $3, %rcx
    movq $4, %r8
    movq $5, %r9
    push $10               # push rightmost argument
    push $9                # push second-to-last argument
    push $8                # ...
    push $6                # first stack argument, now at (%rsp)
    xor  %eax,%eax         # tell printf to expect 0 floating point args
    call _printf
    add $32,%rsp           # pop arguments off the stack
    popq %rbp
    ret
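
As for the "aligned to a 16 byte boundary" part of the question: as far as I understand the System V x86-64 calling convention, %rsp has to be a multiple of 16 at the moment the call instruction executes. The following is only a minimal sketch under some assumptions: an x86-64 macOS toolchain (hence the leading underscores), and the label fmt, which is a name picked here for illustration. It passes all ten values, including the 7 that the question's code leaves out, so five arguments go on the stack. Five 8-byte slots are 40 bytes, which is rounded up to 48 to keep the stack aligned.

```asm
    .data
fmt:                            # hypothetical label name
    .asciz "%d %d %d %d %d %d %d %d %d %d\n"

    .text
    .globl _main
_main:
    pushq %rbp                  # on entry %rsp is 8 off a 16-byte boundary; this push realigns it
    movq  %rsp, %rbp
    subq  $48, %rsp             # 5 stack arguments * 8 bytes = 40, rounded up to a multiple of 16
    leaq  fmt(%rip), %rdi       # argument 1: format string
    movq  $1, %rsi
    movq  $2, %rdx
    movq  $3, %rcx
    movq  $4, %r8
    movq  $5, %r9
    movq  $6, 0(%rsp)           # first stack argument at the lowest address
    movq  $7, 8(%rsp)
    movq  $8, 16(%rsp)
    movq  $9, 24(%rsp)
    movq  $10, 32(%rsp)         # 40(%rsp) is unused padding
    xorl  %eax, %eax            # %al = 0: no floating point arguments in vector registers
    call  _printf
    xorl  %eax, %eax            # return 0 from main
    addq  $48, %rsp             # release the argument area
    popq  %rbp
    ret
```

Assembled and linked as an x86-64 program, this should print `1 2 3 4 5 6 7 8 9 10`. With four stack arguments, as in the code above, 32 bytes already is a multiple of 16, which is why no padding was needed there.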
fuz
  • Oh wow, why 16? Is this what it means by aligning to a 16 byte boundary? – flooblebit Mar 25 '18 at 15:13
  • @flooblebit Every datum below `rsp` is trashed by function calls. I decrease `rsp` by 16 because you push arguments all the way down to `rbp - 16`. This has nothing to do with alignment, but rather with how arguments passed through the stack work. – fuz Mar 25 '18 at 15:15
  • What you're saying makes sense but I literally copy/pasted and it still seems to give me garbage-y values – flooblebit Mar 25 '18 at 15:18
  • Oh I see, it's because they should be -8, -16, etc. since they're 8-byte and not 4-byte values + allocating the stack space like you said – flooblebit Mar 25 '18 at 15:19
  • @flooblebit Oh yes, indeed! – fuz Mar 25 '18 at 15:21
  • @flooblebit As a last note, you'd usually use `push` to put the arguments for a function on the stack. – fuz Mar 25 '18 at 15:22
  • oh do you think you could edit and show me an example of using push as well? :) – flooblebit Mar 25 '18 at 15:23
  • @fuz: while you can use push instructions, usually you don't as it's slower. Subtracting a constant from esp and then using offsets from esp or ebp is faster if there is more than one stack argument or more than one call. – Chris Dodd Mar 25 '18 at 17:23
  • @ChrisDodd That's not necessarily correct as modern x86 CPUs have a stack engine to optimize push and pop instructions. As another piece of evidence, why would gcc and clang prefer `push` for function arguments if it wasn't actually faster? – fuz Mar 25 '18 at 17:38
  • @ChrisDodd: `push imm8` is 2 bytes, and 1 fused-domain uop, on modern CPUs, same as a regular store but more compact. The dependency chain through the stack pointer is handled with zero latency by the stack engine, and no uops are needed to update the stack pointer register value (except an occasional stack-sync uop on Intel CPUs, every time you use `rsp` explicitly after stack insns: push/pop/call/ret). https://stackoverflow.com/questions/36631576/what-is-the-stack-engine-in-the-sandybridge-microarchitecture. But yeah, after one call, you might leave RSP where it is and use `mov`. – Peter Cordes Mar 25 '18 at 21:21
  • Also, while `add $32, %rsp` is a cleaner example, in reality gcc would use `leave` if it made a stack frame at all. Agner Fog lists it as 3 uops on Intel. Probably his measurements include a stack-sync uop built-in from using it back-to-back, so it runs like `mov %rbp, %rsp` / `pop %rbp` would. (Which saves a byte vs. `add imm8`). Hmm, `add` avoids a dependency on `%rbp`, which `printf` might have just popped itself, so maybe it can be better in rare cases. Anyway, `leave` is useful but one of the special-case CISC instructions beginners can ignore (until they see it in compiler output). – Peter Cordes Mar 25 '18 at 21:29
  • @PeterCordes I have removed your comment about the stack frame being optional: Removing the stack frame code breaks the stack alignment, necessitating extra code to restore it. The stack frame code can't just be removed without replacement. – fuz Mar 25 '18 at 21:43
  • Ah yeah, that's fair. It's pointless in that function, but you do need something to realign the stack, so it would clutter things too much to mention it. (In practice glibc `printf` with no FP args in registers doesn't fault if the stack is only 8B-aligned, but obviously don't depend on that.) – Peter Cordes Mar 25 '18 at 21:47
  • @PeterCordes This whole stack alignment thing is super annoying when trying to teach beginners assembly language. You can either not use any libraries at all (making the beginning quite frustrating), supply your own library (making the environment people program in artificial and not reflective of reality) or teach them alignment earlier than sensible from a pedagogic point of view. – fuz Mar 25 '18 at 21:49
  • @PeterCordes But then you have to teach them how linking works and they have to know C to get anything done at all. Also, it's quite difficult to make convincing exercises when you can't have any interactivity at all from within their assembly code. I'm not convinced this is a good way to teach assembly. – fuz Mar 25 '18 at 22:01
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/167524/discussion-between-peter-cordes-and-fuz). – Peter Cordes Mar 25 '18 at 22:05