MSDN says

Integer valued arguments in the leftmost four positions are passed in left-to-right order in RCX, RDX, R8, and R9, respectively. Space is allocated on the call stack as a shadow store for callees to save those registers. Remaining arguments get pushed on the stack in right-to-left order.

So, I'm trying to call the CreateFileW function, and this is my code:

sub rsp, 20h             ; Allocate 32 bytes because 4 registers 8 byte each
mov rcx, offset filename ; lpFileName
mov rdx, GENERIC_READ or GENERIC_WRITE ; dwDesiredAccess
mov r8, FILE_SHARE_DELETE              ; dwShareMode
xor r9, r9                             ; lpSecurityAttributes
          ;__And right-to-left order remaining arguments__
push 0 ; hTemplateFile
push FILE_ATTRIBUTE_NORMAL             ;dwFlagsAndAttributes
push CREATE_ALWAYS                     ; dwCreationDisposition
call CreateFileW

It assembles but does not work; win64dbg reports the following error:

00000057 (ERROR_INVALID_PARAMETER)

The parameters are 100% OK, because the call works with the Invoke macro. Only the generated code is different:

mov rcx,src.403000    ;name          
mov edx,C0000000      ;GENERIC_READ or GENERIC_WRITE                  
mov r8d,4             ;FILE_SHARE_DELETE                  
xor r9d,r9d           ;0                  
mov qword ptr ss:[rbp-20],2  ;CREATE_ALWAYS
mov qword ptr ss:[rbp-18],80 ;FILE_ATTRIBUTE_NORMAL          
and qword ptr ss:[rbp-10],0  ;0           
call qword ptr ds:[<&CreateFileW>]   

So my question is: why does it use the RBP register instead of push, and why does it not allocate 32 bytes for the "shadow store"?


Notes

Since Microsoft's 64-bit MASM no longer has an invoke directive, I am using a Russian MASM64 SDK project that has an invoke macro. That project is loosely based on the MASM32 SDK.

Sep Roland
mantissa
  • Push isn't being used because the space needed for WinAPI calls is allocated on the stack at the beginning of the function. It is done that way so that the stack can be more easily kept at 16-byte alignment; doing pushes can change that. It also appears that your code subtracts 20h from RSP in the wrong place. That space (32 bytes of home/shadow space) should be allocated after you put the parameters on the stack, not before. – Michael Petch Oct 29 '22 at 17:42
  • Since you are using `invoke` with 64-bit MASM, are you by chance using the MASM64 macros included at the top of your code? – Michael Petch Oct 29 '22 at 17:52
  • @MichaelPetch Yes, indeed. – mantissa Oct 29 '22 at 18:00
  • Can you tell us which MASM64 you are using (provide a link to it)? There are a couple, and both have an `invoke` macro, but they handle the stack differently. Can you show us your entire code? And can you show all the generated assembly code for the function? Part of your answer lies in the generated code that you don't show (there will be an allocation of a bunch of space on the stack at the top of the function). – Michael Petch Oct 29 '22 at 18:03
  • @MichaelPetch I use the Russian SDK: [link](http://dsmhelp.narod.ru/environment.htm#id_newmacros) – mantissa Oct 29 '22 at 18:14
  • I wrote a bare invoke call in my code, and for some reason it does not allocate ANY memory on the stack: `WinMain proc invoke CreateFileW,offset filename,GENERIC_READ or GENERIC_WRITE,FILE_SHARE_DELETE,0,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL WinMain endp`. x64dbg shows: [screen from dbg](https://imgur.com/a/durItTF). I also get EXCEPTION_ACCESS_VIOLATION. – mantissa Oct 29 '22 at 18:17
  • The reason it isn't may be related to how the Russian version works. The `PROC` of the function doing the `invoke` has to be modified with an extra parameter for that to work properly. In fact someone couldn't get their Russian code with stack parameters to work as expected. See comments here: https://stackoverflow.com/questions/73791814/createfilea-createfilew-dont-work-in-masm64-russian-variant . Is it possible you can use the MASM64 from the MASM32 forum here: http://www.masm32.com/download/masm64.zip . The non-Russian version doesn't require other code changes and just works. – Michael Petch Oct 29 '22 at 18:24
  • I'd have to look at the Russian instructions to find the change you need to the `PROC` of the function that does the `invoke`. If we could see a complete code example (edit the question with all your code) it would be easier to tell you how to resolve the problem with the stack allocation not being handled properly. – Michael Petch Oct 29 '22 at 18:25
  • If you want to compare your own assembly code with MASM64, I'd recommend using the non-Russian MASM64 SDK, since it doesn't require any other tweaks. Both variants need the function doing the `invoke` to be defined with a `PROC` directive so that appropriate function prologue and epilogue code can be generated. If you `invoke` from outside a function defined with `PROC`, the appropriate stack adjustments won't be made. Not having the proper stack code generated may work in some cases and fail in others. The generated code of each MASM64 variant will differ. – Michael Petch Oct 29 '22 at 18:49
  • You can also look at a [similar example](https://euroassembler.eu/maclib/fastcall.htm#Invoke) of `Invoke` (Fast version) which is implemented in asm at macro-level, including the necessary stack alignment. – vitsoft Oct 29 '22 at 19:44

1 Answer

If you want to push args, you have to do it before sub rsp,20h. (Which doesn't work well because normally you only want one sub rsp,20h for the whole function, not one for each call). And you'd have to count correctly to have RSP%16 == 0 after the last push. You normally don't want to change RSP except in the function prologue/epilogue in Windows x64, except for alloca type of things.

Stack args go above the shadow space, so if the function were to dump its register args to their "home space" in the shadow space, it would have a contiguous array of args. (Variadic functions like printf will actually do that; normal functions don't, unless it's a debug build.)

Use a debugger to look at the contents of stack memory (above RSP) right before the call for your way vs. the normal way of doing mov stores to stack arg space above the shadow space.

Note that sub rsp,20h isn't big enough to reserve space for shadow space and stack args, so the "invoke macro" code you show must be reserving more space at the start of this function.
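As a rough sketch of "the normal way" described above, a raw call without the macro might look like the following. It assumes plain ML64 without the SDK include files (so no custom prologue macro is active); the procedure name `OpenFileSketch` and the file name are hypothetical, and the constants are written out here only so the block stands alone.

```asm
; Sketch only: assumes plain ML64, no SDK includes, hypothetical names.
extern CreateFileW:proc

GENERIC_READ          equ 80000000h
GENERIC_WRITE         equ 40000000h
FILE_SHARE_DELETE     equ 4
CREATE_ALWAYS         equ 2
FILE_ATTRIBUTE_NORMAL equ 80h

.data
filename dw 't','e','s','t','.','t','x','t',0   ; UTF-16 string (hypothetical name)

.code
OpenFileSketch proc
    ; On entry RSP % 16 == 8 (the caller's call pushed a return address).
    ; 20h shadow space + 3 stack args * 8 = 38h, and 8 + 38h = 40h,
    ; so RSP % 16 == 0 at the call below.
    sub  rsp, 38h

    lea  rcx, filename                               ; lpFileName
    mov  edx, GENERIC_READ or GENERIC_WRITE          ; dwDesiredAccess
    mov  r8d, FILE_SHARE_DELETE                      ; dwShareMode
    xor  r9d, r9d                                    ; lpSecurityAttributes = NULL

    ; Stack args are stored with mov, just above the 32 bytes of shadow space.
    mov  qword ptr [rsp+20h], CREATE_ALWAYS          ; dwCreationDisposition
    mov  qword ptr [rsp+28h], FILE_ATTRIBUTE_NORMAL  ; dwFlagsAndAttributes
    mov  qword ptr [rsp+30h], 0                      ; hTemplateFile = NULL

    call CreateFileW                                 ; handle returned in RAX

    add  rsp, 38h
    ret
OpenFileSketch endp
end
```

The single `sub rsp, 38h` covers both the shadow space and the three stack arguments, and RSP does not move again until the epilogue.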

Why does it use the RBP register instead of push, and why does it not allocate 32 bytes for the "shadow store"?

It uses RBP because that's a normal way to access stack space if you've already spent instructions to set up RBP as a frame pointer.

It would be even clearer and easier to see what's going on with addressing modes relative to RSP, like [rsp+20h] to access the first slot above the shadow space, where you'd want to store the first stack arg.

That would be necessary if you've allocated a variable amount of space so the distance from RBP to just above the shadow space isn't known, but you might do it that way anyway, just for clarity and ease of getting the offsets correct. But if a compiler or clever macros can calculate correct offsets, and you've already spent the instructions to set up RBP as a frame pointer, then it's slightly more efficient to use it because it saves one byte in the machine code ([rsp+constant] requires a SIB byte to encode.)
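To illustrate how the RBP-relative stores in the question can line up with RSP-relative slots, here is a sketch that assumes a prologue of `push rbp` / `mov rbp, rsp` / `sub rsp, 40h`. That prologue is a guess that happens to be consistent with the offsets in the debugger listing, not the macro's confirmed output; the procedure name and file name are hypothetical, and plain ML64 without the SDK includes is assumed.

```asm
; Sketch only: the prologue is an assumption, not the macro's confirmed output.
extern CreateFileW:proc

GENERIC_READ          equ 80000000h
GENERIC_WRITE         equ 40000000h
FILE_SHARE_DELETE     equ 4
CREATE_ALWAYS         equ 2
FILE_ATTRIBUTE_NORMAL equ 80h

.data
filename dw 't','e','s','t','.','t','x','t',0   ; UTF-16 string (hypothetical name)

.code
OpenFileRbpSketch proc
    push rbp                      ; entry RSP % 16 == 8, so now RSP % 16 == 0
    mov  rbp, rsp                 ; frame pointer
    sub  rsp, 40h                 ; 20h shadow + 18h stack args + 8 padding; still 16-byte aligned

    lea  rcx, filename                               ; lpFileName
    mov  edx, GENERIC_READ or GENERIC_WRITE          ; dwDesiredAccess
    mov  r8d, FILE_SHARE_DELETE                      ; dwShareMode
    xor  r9d, r9d                                    ; lpSecurityAttributes = NULL

    ; RSP == RBP - 40h here, so each RBP-relative slot has an RSP-relative twin.
    mov  qword ptr [rbp-20h], CREATE_ALWAYS          ; same address as [rsp+20h]
    mov  qword ptr [rbp-18h], FILE_ATTRIBUTE_NORMAL  ; same address as [rsp+28h]
    mov  qword ptr [rbp-10h], 0                      ; same address as [rsp+30h]

    call CreateFileW                                 ; handle returned in RAX

    mov  rsp, rbp                 ; release the frame
    pop  rbp
    ret
OpenFileRbpSketch endp
end
```

Under that assumed prologue, `[rbp-20h]`, `[rbp-18h]`, and `[rbp-10h]` are exactly the first three slots above the shadow space.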

Regular MASM doesn't have an invoke in 64-bit mode. I don't know what you're using. Maybe you need to manually reserve stack space, or maybe it does that for you at the top of the function but you left that out. Michael Petch says MASM64 comes with an invoke macro that does add a sub rsp to your code.

Peter Cordes
  • This doesn't quite answer the part of the question that asks why it uses RBP and not push, and why the OP thinks that the shadow/home space isn't allocated. The answer is that not all of the code generated for the `invoke` is being shown. In fact 64-bit MASM doesn't have `invoke`, and I posit that they are using the `invoke` macro from `masm64`, which alters the function prologue to allocate the space needed for the shadow space and the parameters ahead of time. – Michael Petch Oct 29 '22 at 17:50
  • @MichaelPetch: Thanks, yeah I thought MASM got rid of `invoke` because it [no longer makes sense or has to have non-local effects](https://stackoverflow.com/questions/65340051/i-do-not-understand-either-microsoft-x64-calling-convention-or-fasms-implementa#comment115535591_65340051). Didn't realize there was some hacked up version of it. – Peter Cordes Oct 29 '22 at 17:51
  • There are a couple of offshoots of the MASM32 SDK for 64-bit, and they include an `invoke` macro to fill in for the `invoke` directive that ML64.EXE lacks. In fact there are a couple of versions of said MASM64 that handle the stack differently, and we recently had a question that ran into problems. Check the comments out here: https://stackoverflow.com/questions/73791814/createfilea-createfilew-dont-work-in-masm64-russian-variant – Michael Petch Oct 29 '22 at 17:56
  • Can you please explain why "sub rsp,20h isn't big enough"? Why should I allocate extra bytes? Also, I notice that my include files contain "OPTION PROLOGUE:rbpFramePrologue", and I have no idea what it does. – mantissa Oct 29 '22 at 18:02
  • @mantissa: You need 32 bytes of shadow space, *and* some space for stack args above it. If you didn't reserve more, you'd be overwriting your own return address and shadow space with `mov qword ptr [rsp+20h], CREATE_ALWAYS` and so on. – Peter Cordes Oct 29 '22 at 18:06
  • Peter, just as an aside: the `invoke` macro actually changes the function prologue and epilogue to handle allocation of the shadow space and enough space for the parameters. One version of MASM64, I think, tops out the stack parameters at 8, while the Russian version of MASM64 tries to be more sophisticated and requires the programmer to modify the `PROC` statement. I don't recommend the Russian version. Anyway, once space is allocated on the stack ahead of time, they use `mov` to put the function arguments into the space already allocated. That's why you don't see a `push`. – Michael Petch Oct 29 '22 at 18:06
  • @MichaelPetch Notice: I use this Russian version invoke with 14 args in my code too – mantissa Oct 29 '22 at 18:11
  • Oh possibly the Russian version maxes out at 11 stack based arguments and 4 registers or something like that. A few weeks ago I looked quickly at their macros and was trying to go from memory. My memory is failing in old age lol – Michael Petch Oct 29 '22 at 18:14
  • I'd like to use raw assembly without the invoke macro for educational purposes! So my mistake was that I allocated the shadow space before the pushes, and did not allocate enough for the stack args. Thank you, big boys, for the great explanation. I'm going to reread all of this, though. – mantissa Oct 29 '22 at 18:28
  • OK, I think I can say that I understand this now, but I still have one question: is there a formula to calculate the number of bytes to allocate for the home space and the stack arguments? Is it just (8 * n), where n is the number of arguments of the callee? Will the stack then be aligned to 16 bytes? – mantissa Oct 31 '22 at 15:28
  • @mantissa: Yes, in Windows x64, every arg takes an 8-byte slot. Larger objects are passed by reference, unlike in x86-64 System V. So it's 8 bytes per stack arg, plus the 32 bytes of shadow space. – Peter Cordes Oct 31 '22 at 19:07
  • To maintain 16-byte stack alignment before a call, the *total* amount of stack allocation (including your local vars, and pushes of call-preserved registers like RBP, RBX, etc. that you want to use in your function) has to be an odd multiple of 8. (Odd because `call` itself pushes an 8-byte return address, so on function entry RSP % 16 == 8 is guaranteed, and RSP % 16 == 0 is required before a call to make that guarantee hold for the callee.) See [glibc scanf Segfaults when called from a function that doesn't align RSP](//stackoverflow.com/q/51070716) (but that's without shadow space). – Peter Cordes Oct 31 '22 at 19:10
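
The sizing rule from the last two comments can be sketched as assemble-time equates. This assumes a function that pushes no registers and has no locals, and a callee with at least four arguments; the names `NUM_ARGS`, `STACK_ARGS`, `RAW_SIZE`, `FRAME_SIZE`, and `CallWithFrame` are hypothetical.

```asm
; Sketch only: assumes no pushes, no locals, and NUM_ARGS >= 4.
NUM_ARGS    equ 7                                  ; CreateFileW takes 7 arguments
STACK_ARGS  equ NUM_ARGS - 4                       ; the first 4 travel in RCX, RDX, R8, R9
RAW_SIZE    equ 20h + 8 * STACK_ARGS               ; shadow space + one 8-byte slot per stack arg
; Smallest size >= RAW_SIZE that is an odd multiple of 8, so that
; 8 (return address) + FRAME_SIZE is a multiple of 16.
FRAME_SIZE  equ ((RAW_SIZE + 7) and (not 15)) + 8

.code
CallWithFrame proc
    sub  rsp, FRAME_SIZE          ; one allocation for the whole function
    ; ... load RCX, RDX, R8, R9 and store stack args at [rsp+20h], [rsp+28h], ...
    add  rsp, FRAME_SIZE
    ret
CallWithFrame endp
end
```

For 7 arguments, `RAW_SIZE` is 38h, which is already an odd multiple of 8, so `FRAME_SIZE` stays 38h. For 8 arguments, `RAW_SIZE` is 40h and gets padded to 48h.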