
So having read the x64 architecture quick start guide I wrote some assembler.

https://software.intel.com/en-us/articles/introduction-to-x64-assembly

The assembler function is called from C. In turn the assembler calls C functions.

I am unsure as to how the stack mechanics work as I seem to be corrupting the stack on several occasions.

The following code demonstrates:

PUBLIC Lbra_R_A ; Op 16 - Long Branch Always
Lbra_R_A PROC
    sub rsp, 28h
    push rbx ; must preserve rbx
    ; Calc destination branch address by Adding the two Bytes at [PC+1] (high byte) and [PC+2] (low byte) with PC+2 reg
    ; Get first byte high byte
    movzx rcx, word ptr [pc_s]
    mov rbx, rcx ; save pc_s into temp
    inc bx ; inc temp pc_s
    call MemRead8_s ; returns byte in ax (al)
    push ax ; save high byte
    ; Get second byte low byte @ pc_s
    mov rcx, rbx
    inc bx ; inc temp pc_s
    call MemRead8_s ; returns byte in ax (al) - this call destroys saved high byte???
    ; combine low and high bytes to make 16 bit 2 complements offset
    pop dx ; get saved high byte - wrong value
    mov ah, dl ; move the high byte to high position ; ax now contains 16 bit offset
    add bx, ax ; bx now contains pc_s(+2) + offset
    mov word ptr [pc_s], bx
    pop rbx ; must restore rbx - wrong value???
    add rsp, 28h
    ret
Lbra_R_A ENDP

I set up the stack with sub rsp, 28h but I'm not sure why, and I have no idea what I am allowed to do in that 28h-byte area!!! Is it for me or is it reserved? However, without this my code doesn't even run!!!

Next I preserve the rbx register because it is considered non volatile. However at the end when I restore rbx it isn't the same as what I saved???

Mid-code I save/push the ax register before calling a C function called MemRead8_s (supplied by me). However, the moment I call that function, the value of ax stored on the stack is overwritten, so when I attempt to restore it a few instructions later it is wrong!!! The rsp value before and after this call seems to be the same, so what did calling this function do to the stack?

Can anyone shed some light on what the correct stack setup protocol is and possibly explain why my stack saves are getting corrupted?

Walter
  • Your function owns the stack space below the initial value of RSP (on function entry) and above the *current* value of RSP. So `sub rsp, 28h` allocates 0x28 bytes of stack space (and aligns the stack by 16, which you break with a 16-bit push. Don't use 16-bit push/pop; save/restore the full 64-bit register. `MemRead8_s` is allowed to assume that RSP was 16-byte aligned before the `call` that pushed a return address for it.) – Peter Cordes Jun 01 '18 at 04:19
  • 16 bit or byte? I thought push AX was a 16 bit push!!! And why is my push rbx wrong? – Walter Jun 01 '18 at 04:27
  • Yes, `push ax` pushes 2 bytes, breaking the 16-*byte* stack alignment which `sub rsp, 28h` created. – Peter Cordes Jun 01 '18 at 04:32
  • So if I want to preserve the rbx reg and ax can/should I store them in the 28h byte area that I reserved? Is this area considered safe for me to use? – Walter Jun 01 '18 at 04:34
  • Yup, that would be a good plan. Or `push rbx` at the start of your function (before `sub rsp, 20h`), and `pop` at the end would also be efficient. You don't need to save/restore `ax`; your function is allowed to clobber that register. Look up the calling convention you're using to find out which regs are call-preserved vs. call-clobbered. And really you don't need to be using 16-bit registers. Use 32-bit eax in 32 or 64-bit code. See https://stackoverflow.com/tags/x86/info for links to docs. – Peter Cordes Jun 01 '18 at 04:39
  • I save AX because the MemRead8 C func clobbers it too! – Walter Jun 01 '18 at 04:42
  • Oh I see. What a C compiler would do is save another call-preserved reg besides rbx, and use it. I'd highly recommend looking at compiler output for a C version of your code. See [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116). Or in this case you could `add` the low byte from the first MemRead to `ebx` before calling the 2nd. So `call` / `mov rcx, rbx` / `inc ebx` / `add ebx, eax` / `call` / `shl eax,8` / `add ebx, eax` to get the same result in `bx` that your current function does, but more efficiently. (You can use LEA to do `rbx+rax+1`) – Peter Cordes Jun 01 '18 at 04:57

1 Answer

Your function owns the stack space below the initial value of RSP (on function entry) and above the current value of RSP.

On Windows, your function also owns 32 bytes above the return address (the shadow space). Make sure to reserve this space before calling another function, unless it's a private helper function that doesn't use the shadow space.
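To make the shadow space concrete, here is a minimal MASM sketch. Caller and Callee are hypothetical names, and the spill in Callee only imitates what an unoptimized C compiler typically emits; whether MemRead8_s actually does this depends on how it was compiled. The point is that the callee's shadow space starts at the caller's [rsp] at the moment of the call, so anything pushed after sub rsp, 28h (like the push ax in the question) sits exactly where the callee is allowed to write.

```asm
.code

; Hypothetical callee, written the way an unoptimized C compiler might emit it.
; On entry [rsp] holds the return address and [rsp+8 .. rsp+27h] is its shadow space.
Callee PROC
    mov qword ptr [rsp+8], rcx      ; spill the first argument into the shadow space
    movzx eax, byte ptr [rsp+8]     ; return the low byte of the spilled copy
    ret
Callee ENDP

; Hypothetical caller that follows the Windows x64 convention.
Caller PROC
    sub rsp, 28h                    ; 20h bytes of shadow space for Callee + 8 bytes to realign RSP
    mov ecx, 1234h                  ; first integer argument goes in rcx
    call Callee                     ; Callee's [rsp+8] is our [rsp], inside the space we reserved
    add rsp, 28h                    ; release the space so ret finds the return address
    ret
Caller ENDP

END
```

If Caller had pushed a value after the sub, that value would live at Caller's [rsp], and the first instruction of Callee would overwrite it without RSP ever looking different before and after the call.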

On Linux and other non-Windows, the x86-64 System V ABI says your function owns 128 bytes below the current RSP (the red zone). Obviously push or function calls will step on that space, so it's mostly useful in leaf functions to avoid the sub/add of RSP. But it is safe from being clobbered by signal handlers.


So sub rsp, 28h allocates 0x28 bytes of stack space and realigns the stack to 16 bytes: RSP was 16-byte aligned before the call in your caller pushed an 8-byte return address, so it is off by 8 on entry to your function. You then break that alignment with a 16-bit (2-byte) push, which is bad. ret is basically pop rip, which is why you have to restore RSP to its original value before you can run ret.

Don't use 16-bit push/pop; save/restore the full 64-bit register. MemRead8_s is allowed to assume that RSP was 16-byte aligned before the call that pushed a return address for it; you got lucky that it happens not to rely on stack alignment.
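Putting this together, one possible rewrite of the function looks like the sketch below. It is not the only correct layout. It assumes MemRead8_s takes a 16-bit address as its first argument (in cx) and returns the byte in al, and that pc_s is a 16-bit variable defined on the C side; the EXTERN declarations are my guess at how those symbols are declared. It saves rbx with a 64-bit push before reserving stack space, then keeps the high byte in a local slot above the shadow space instead of pushing it. Unwind metadata (PROC FRAME, .pushreg, .allocstack) is left out for brevity.

```asm
EXTERN MemRead8_s:PROC              ; assumed: C function, address in cx, result in al
EXTERN pc_s:WORD                    ; assumed: 16-bit emulated program counter

.code

PUBLIC Lbra_R_A                     ; Op 16 - Long Branch Always
Lbra_R_A PROC
    push rbx                        ; save call-preserved rbx; RSP is 16-byte aligned again
    sub rsp, 30h                    ; 20h shadow space for callees + 10h for our own locals
    movzx ecx, word ptr [pc_s]      ; first argument: address of the high byte
    mov ebx, ecx                    ; keep pc_s in a register that survives calls
    call MemRead8_s                 ; high byte comes back in al
    mov byte ptr [rsp+20h], al      ; store it above the shadow space, out of the callee's reach
    lea ecx, [rbx+1]                ; address of the low byte
    movzx ecx, cx                   ; wrap to 16 bits, as inc bx did in the original
    call MemRead8_s                 ; low byte comes back in al
    movzx eax, al                   ; bits above al are not guaranteed, so clear them
    movzx edx, byte ptr [rsp+20h]   ; reload the high byte
    shl edx, 8
    or eax, edx                     ; eax = 16-bit two's complement offset
    lea ebx, [rbx+rax+2]            ; pc_s + 2 + offset; only the low 16 bits are stored
    mov word ptr [pc_s], bx
    add rsp, 30h                    ; release locals and shadow space
    pop rbx                         ; restore the caller's rbx
    ret
Lbra_R_A ENDP

END
```

Note that in the original, push rbx came after sub rsp, 28h, so the saved rbx also landed in the callee's shadow space (and left RSP misaligned by 8). Pushing first and reserving space second avoids both problems.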

Peter Cordes