(Sorry for my bad English; I am from South Korea.)

I tried this code:

lea rcx, QWORD PTR [message]
call [print_message] ; this overwrote the return address with a bad address

xor rcx, rcx
ret

and it crashed.
After that happened, I tried two other ways:

sub rsp, 8   ;shadow stack

lea rcx, QWORD PTR [message]
call [print_message]

add rsp, 8
ret

; stack frame
push rbp
mov rbp, rsp


lea rcx, QWORD PTR [message]
call [print_message]

mov rsp, rbp
pop rbp
ret

Both of these versions work, but why does a procedure need these things?
I am curious about the reason.

Here is the real code that the problem came from:

extern __imp_MessageBoxA : QWORD

.data

message db "1234", 0

.code

entry_point proc

sub rsp, 8

xor ecx, ecx
lea rdx, QWORD PTR [message]
lea r8, QWORD PTR [message]
mov r9, 0
call [__imp_MessageBoxA] ;stdcall

add rsp, 8
ret

entry_point endp
end
Peter Cordes
Guy Cool

1 Answer

Unfortunately, I don't have much experience with 64-bit code, so I don't know about the exact details:

Actually, you don't need a shadow stack or a stack frame. But some 64-bit functions require rsp to be 16-byte-aligned.

This means that the value of rsp must be a multiple of 16 when a function is called.

If your function looks like this:

myFunction:
    lea rcx, QWORD PTR [message]
    call [print_message] ; this overwrote the return address with a bad address
    ...

... then rsp is a multiple of 16 before the call myFunction instruction executes. That call pushes the 8-byte return address onto the stack, so rsp is no longer a multiple of 16 (its value can be written as 16*n+8).

When you perform call [print_message], rsp is not a multiple of 16 and the program crashes if the function print_message requires rsp to be 16-byte-aligned.

Either sub rsp, 8 or push rbp subtracts 8 from rsp, so the value of rsp is a multiple of 16 again.
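
To make the arithmetic concrete, here is a rough sketch of the same function with the value of rsp traced in the comments. It assumes the caller kept rsp at a multiple of 16 (written 16*n here, an illustrative value) and that message and print_message are defined elsewhere, as in the question. It only illustrates alignment; it does not reserve the 32 bytes of shadow space that the Windows x64 calling convention also expects.

```asm
; Assumption: rsp = 16*n immediately before "call myFunction".
myFunction:
                                    ; rsp = 16*n - 8  = 16*(n-1) + 8
                                    ;   (the call pushed the return address)
    sub  rsp, 8                     ; rsp = 16*n - 16 = multiple of 16 again
    lea  rcx, QWORD PTR [message]
    call [print_message]            ; rsp is 16-byte-aligned at this call, so
                                    ; the callee starts at 16*k + 8, as expected
    add  rsp, 8                     ; rsp = 16*n - 8, pointing at the return address
    ret                             ; pops it; the caller sees rsp = 16*n again
```

Replacing the sub/add pair with push rbp / pop rbp moves rsp by the same 8 bytes, which is why both of the working versions in the question behave the same with respect to alignment.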

The reason is that certain CPU instructions require a memory operand whose address is a multiple of 16. Example:

print_message:
    sub rsp, 24
      ; The next instruction will crash if rsp is not
      ; a multiple of 16. This is the case if rsp was
      ; not a multiple of 16 before the
      ; "call print_message" instruction
    paddd xmm0, [rsp]
Martin Rosenau
  • but why 16-byte-aligned? – Guy Cool Jan 03 '22 at 10:49
  • @GuyCool Unfortunately, I don't know much about 64-bit code. However, the corresponding instructions seem to access 128-bit (16-byte) values in memory... There are many CPUs that don't allow 32-bit memory accesses to addresses that are not divisible by 4. On x86 CPUs you can switch this on as optional feature - it will speed up the CPU then. – Martin Rosenau Jan 03 '22 at 10:57
  • Okay, thank you very much for the quick and kind response! – Guy Cool Jan 03 '22 at 10:59
  • @GuyCool: [Why does the x86-64 / AMD64 System V ABI mandate a 16 byte stack alignment?](https://stackoverflow.com/q/49391001). For example [glibc scanf Segmentation faults when called from a function that doesn't align RSP](https://stackoverflow.com/q/51070716) is an example of a case where the compiler really did use `movaps` for copying locals. – Peter Cordes Jan 03 '22 at 19:12
  • @GuyCool: But since you're using the Windows x64 calling convention, it's not just alignment that's necessary; the callee also owns 32 bytes of *shadow space* above its return address. (Not a shadow *stack*, that means something different). This answer doesn't cover that correctly; in this case your callee maybe only needed alignment, (or only uses 8 bytes of its shadow space?), so `sub rsp, 8` happens to work, but it's not actually safe in general. – Peter Cordes Jan 03 '22 at 19:13
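
Following the shadow-space point in the last comment, here is a minimal sketch of how the question's `entry_point` might look if it reserves both the 32 bytes of shadow space and the 8 bytes needed to realign rsp. The constant 40 (32 + 8) and the argument-name comments are my additions for illustration; everything else is taken from the question's code.

```asm
extern __imp_MessageBoxA : QWORD

.data

message db "1234", 0

.code

entry_point proc

    ; On entry rsp = 16*k + 8 (the caller's call pushed a return address).
    ; 32 bytes of shadow space for the callee + 8 bytes of padding
    ; leave rsp at a multiple of 16 for the call below.
    sub  rsp, 40

    xor  ecx, ecx                   ; 1st argument (hWnd)      = NULL
    lea  rdx, QWORD PTR [message]   ; 2nd argument (lpText)
    lea  r8,  QWORD PTR [message]   ; 3rd argument (lpCaption)
    xor  r9d, r9d                   ; 4th argument (uType)     = 0
    call [__imp_MessageBoxA]        ; callee may freely write [rsp]..[rsp+31]

    add  rsp, 40                    ; release shadow space and padding
    ret

entry_point endp
end
```

With only `sub rsp, 8`, the callee's shadow space would overlap this function's own return address, which would explain why that version can appear to work and still not be safe in general.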