
On x64, does each PUSH instruction push a multiple of 8 bytes? If not, how much does it push?

Also, how much stack space does each function parameter consume?

IAbstract
Demi
  • The Intel software developer manual has information on all instructions for various modes (e.g. 32 bit vs. 64 bit). – Frank C. Oct 28 '16 at 14:13
  • The 64-bit calling conventions pass the first 4 or 6 args in registers. See the [x86 tag wiki](https://stackoverflow.com/tags/x86/info) for links to ABI / calling-convention docs, and note especially that Windows still reserves stack space for register args (shadow space), so be sure to read carefully; the rules are non-trivial and have special cases. – Peter Cordes Mar 25 '18 at 06:52

2 Answers


PUSH Operand Size in 64-bit mode

The size of the value pushed on the stack and the amount by which the stack pointer is adjusted both depend on the operand size of the PUSH instruction. In 64-bit mode the operand size can only be 16-bit or 64-bit. It's not possible to encode a 32-bit PUSH instruction in 64-bit mode and it's not possible to encode an 8-bit PUSH instruction in any mode.

For example, these are all 64-bit PUSH instructions:

push    rax
push    1              ; 8-bit immediate sign-extended to 64 bits
push    65536          ; 32-bit immediate sign-extended to 64 bits
push    QWORD PTR[0]
push    fs             ; 16-bit segment register zero-extended to 64 bits

The above instructions all subtract 8 from RSP and then write a 64-bit value to the location pointed to by RSP.

These are all 16-bit PUSH instructions:

push    ax
push    WORD PTR[0]

These instructions subtract 2 from RSP and then write a 16-bit value to the location pointed to by RSP. Because they badly misalign the stack, using a 16-bit PUSH in 64-bit mode is pretty much always a mistake. Instead you should load the 16-bit value into a register (if not already there), extend it as necessary, and then use a 64-bit PUSH.
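The stack-pointer arithmetic described above can be modelled in a few lines of C. This is only a sketch: it tracks a simulated RSP value and executes no real PUSH instructions, and the `StackModel` type and `model_*` names are hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: only the RSP value is tracked, not stack memory. */
typedef struct {
    uint64_t rsp; /* simulated stack pointer */
} StackModel;

/* A 64-bit PUSH (e.g. push rax) subtracts 8 from RSP. */
void model_push64(StackModel *s) { s->rsp -= 8; }

/* A 16-bit PUSH (e.g. push ax) subtracts 2 from RSP. */
void model_push16(StackModel *s) { s->rsp -= 2; }

/* Returns 1 when RSP is a multiple of 8, otherwise 0. */
int model_is_qword_aligned(const StackModel *s) {
    return (s->rsp % 8) == 0;
}

/* Usage example: one 64-bit push keeps 8-byte alignment,
   a following 16-bit push breaks it. */
void model_demo(void) {
    StackModel s = { 0x1000 };
    model_push64(&s);
    assert(s.rsp == 0x1000 - 8);
    assert(model_is_qword_aligned(&s));
    model_push16(&s);
    assert(s.rsp == 0x1000 - 10);
    assert(!model_is_qword_aligned(&s));
}
```

Calling `model_demo()` exercises both cases; the second assertion pair shows why a lone 16-bit push leaves RSP off an 8-byte boundary.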

The following instructions are illegal and can't be encoded in 64-bit mode:

push    al
push    eax
push    BYTE PTR[0]
push    DWORD PTR[0]
push    0100000000h    ; 64-bit immediate value isn't supported

Pushing an 8-bit or 32-bit value on the stack requires loading the value into a register, extending it and then using a 64-bit PUSH, just like you should do with 16-bit values.
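To make the "extend it" step concrete, here is a small C sketch of what zero extension and sign extension to 64 bits produce, plus a check for whether a constant can be encoded as the sign-extended 32-bit immediate that PUSH accepts. The function names are hypothetical and the code only models the resulting values; it does not emit any instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension (what MOVZX does): the upper bits become 0. */
uint64_t zero_extend8(uint8_t v)   { return (uint64_t)v; }
uint64_t zero_extend32(uint32_t v) { return (uint64_t)v; }

/* Sign extension (what MOVSX / MOVSXD do): the upper bits copy the sign bit. */
uint64_t sign_extend8(int8_t v)   { return (uint64_t)(int64_t)v; }
uint64_t sign_extend32(int32_t v) { return (uint64_t)(int64_t)v; }

/* PUSH with an immediate takes at most a 32-bit value that is
   sign-extended to 64 bits, so only constants in the int32 range fit. */
int fits_push_imm32(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Usage example with values whose results are easy to verify by hand. */
void extend_demo(void) {
    assert(zero_extend8(0x80) == UINT64_C(0x80));
    assert(sign_extend8(-128) == UINT64_C(0xFFFFFFFFFFFFFF80));
    assert(zero_extend32(0xFFFFFFFFu) == UINT64_C(0xFFFFFFFF));
    assert(sign_extend32(-1) == UINT64_MAX);
    assert(fits_push_imm32(65536));
    assert(!fits_push_imm32(INT64_C(0x100000000)));
}
```

Whether zero or sign extension is the right choice depends on whether the original value is unsigned or signed; the 64-bit PUSH that follows stores all 8 bytes either way.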

Parameter Passing in 64-bit mode

Generally speaking, in 64-bit mode function arguments aren't passed on the stack. Both the Microsoft and Linux 64-bit x86 calling conventions pass most arguments in registers. The stack is only used when there's not enough room in registers to pass the arguments to a function. In that case each argument takes up one or more 8-byte stack slots. Note that compilers won't necessarily use PUSH instructions to place these arguments onto the stack. A common strategy is to allocate enough space on the stack for all of a function's outgoing arguments in the function prologue and then use MOV instructions to put arguments on the stack as necessary.
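As a back-of-the-envelope sketch of the slot accounting, the C code below counts the stack bytes needed for arguments that overflow the registers. It is deliberately simplified: every argument is assumed to be an integer or pointer of at most 8 bytes, floating-point and aggregate arguments are ignored, and so is any padding needed to keep the stack 16-byte aligned at the call. The constant and function names are hypothetical; the register counts (6 for the System V ABI used on Linux, 4 for Windows) and the 32-byte Windows shadow space come from the respective calling conventions.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SLOT_BYTES         = 8,  /* one stack slot */
    SYSV_INT_REG_ARGS  = 6,  /* System V: integer args passed in registers */
    WIN64_REG_ARGS     = 4,  /* Windows x64: args passed in registers */
    WIN64_SHADOW_BYTES = 32  /* Windows x64: space reserved for register args */
};

/* Bytes of stack used by arguments that do not fit in registers. */
size_t stack_arg_bytes(size_t arg_count, size_t reg_arg_count) {
    if (arg_count <= reg_arg_count) {
        return 0;
    }
    return (arg_count - reg_arg_count) * SLOT_BYTES;
}

/* Windows x64: shadow space is reserved even when all args are in registers. */
size_t win64_outgoing_arg_bytes(size_t arg_count) {
    return WIN64_SHADOW_BYTES + stack_arg_bytes(arg_count, WIN64_REG_ARGS);
}

/* Usage example. */
void arg_bytes_demo(void) {
    assert(stack_arg_bytes(3, SYSV_INT_REG_ARGS) == 0);  /* all in registers */
    assert(stack_arg_bytes(8, SYSV_INT_REG_ARGS) == 16); /* two stack slots */
    assert(win64_outgoing_arg_bytes(2) == 32);           /* shadow space only */
    assert(win64_outgoing_arg_bytes(6) == 48);           /* 32 + two slots */
}
```

A compiler that pre-allocates outgoing argument space would reserve at least this many bytes in the prologue and fill the slots with MOV rather than PUSH.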

Ross Ridge

No, but in practice, one always pushes an 8 byte value onto the stack.

Function parameters consume varying amounts of stack space depending on the size of the function parameter and whether it is passed on the stack, in registers, or by reference.

If one passes a function parameter on the stack by pushing, then the fact that there are convenient push instructions that push 8 bytes strongly suggests that you pass the parameter as an 8-byte value. For pointers, int64 and plain doubles, this is obviously easy. For char, bool, short, and other types whose memory size is smaller, what most compilers do is push the value in an 8-byte chunk. Types that take 16 or 32 bytes might be pushed by the compiler with several push instructions. Bigger values tend not to get passed by pushing; often a compiler tries to pass a pointer to a bigger value rather than pass the value itself. [I've built a compiler that can pass arbitrarily big values, but it does so by making space on the stack, and then executing a block move instruction.] Details vary from compiler to compiler, and according to the language semantics of the program being compiled.
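The "8-byte chunk" rule of thumb amounts to rounding each value's size up to a whole number of slots. The C sketch below shows that rounding; the function names are hypothetical, and it models only the by-value pushing described in this answer, not the rules of any particular ABI (which may pass large values by pointer instead).

```c
#include <assert.h>
#include <stddef.h>

/* Number of 8-byte stack slots needed to hold a value of the given size. */
size_t slots_for_size(size_t size_bytes) {
    return (size_bytes + 7) / 8;
}

/* Stack bytes consumed once the size is rounded up to whole slots. */
size_t stack_bytes_for_size(size_t size_bytes) {
    return slots_for_size(size_bytes) * 8;
}

/* Usage example. */
void slots_demo(void) {
    assert(stack_bytes_for_size(sizeof(char)) == 8); /* small types fill one slot */
    assert(slots_for_size(2) == 1);                  /* a 2-byte value: one push */
    assert(slots_for_size(8) == 1);                  /* pointer, int64, double */
    assert(slots_for_size(16) == 2);                 /* two pushes */
    assert(slots_for_size(32) == 4);                 /* four pushes */
}
```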

A really clever compiler might notice that several arguments are small and can be packed into an 8 byte quantity that only requires a single push. I've not seen one actually do that, probably because it takes work to pack such values together into a register, and push instructions are already pretty fast by design and by cache.

It is possible to push smaller values onto the stack. This is legal according to the architecture, but is likely to cause a mis-aligned access performance hit if the set of small values pushed isn't a multiple of 8 bytes. And then one must be careful to pop off the non-multiple correctly to restore stack alignment. Not useful in my experience (see code golf comment by Peter Cordes).

If you pass the value in a register, nothing gets pushed :-}

One might arrange to store parameter values in well-known locations on the stack. Then there isn't any push :-}

Ira Baxter
  • Answer seems wrong - the size of the `push` depends on the size of the pushed register, as Ross explains below. Furthermore the details of how the compiler deals with the stack also seem lacking: often the compiler will simply manipulate the stack pointer once (e.g., `sub rsp, 16`) to make space for all the parameters, and then write to the space directly with `mov`. This is typically faster than several `push` instructions, and allows you to write any size of arguments. In any case, it's not up to the compiler - the size of various arguments on the stack is set by the ABI. – BeeOnRope Nov 03 '16 at 17:06
  • So, yes, you can stupidly use PUSH AX. Your x86-64 program will then very likely not work, so in practice nobody does that. OP asks what pushes do on x86-64; I provided the only practical answer. I also addressed storing parameters into known locations in the stack; the only way they get that way is because the compiler arranges it; I didn't think I had to say that. – Ira Baxter Nov 04 '16 at 06:11
  • A use-case for 16-bit `push`: merging two 16-bit values into a 32-bit value, when optimizing for code-size without caring about performance: [adler32 code-golf in 32 bytes of x86-64 machine code](https://codegolf.stackexchange.com/questions/78896/compute-the-adler-32-checksum/78972#78972). push64 / push16 / pop64, then pop16 to balance the stack, is only 6 bytes. In 32-bit mode: push16/push16 / pop32 is only 5 bytes. – Peter Cordes Mar 25 '18 at 05:59
  • Yep, you can find a use for 16 bit pushes if you are doing code golf and don't care about performance. I've been coding x86 for some 20+ years and never found this to be useful in practice. YMMV. – Ira Baxter Mar 25 '18 at 08:35
  • Totally agreed. That code-golf adler32 is definitely in the realm of "silly computer tricks", not something that's practically useful in real life (i.e. part of a good solution to any real problem). If you care *that* much about code-size over performance, just use a microcontroller instead of an x86-64 chip! But some people do enjoy silly computer tricks; there's a ["demo scene"](https://en.wikipedia.org/wiki/Demoscene) of people who write tiny executables that do cool graphics and sound in 512B or 4k or whatever. Extreme code-size optimization may possibly be useful in an exploit payload? – Peter Cordes Mar 25 '18 at 09:02