Here's my source code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 500

int main(int argc, char** argv)
{
        if (argc != 2)
                exit(1);
        char str[MAX];
        strcpy(str, argv[1]);
        return 0;
}

I disassembled main using gdb and got the following result:

Dump of assembler code for function main:
   0x0000000000001145 <+0>:     push   %rbp
   0x0000000000001146 <+1>:     mov    %rsp,%rbp
   0x0000000000001149 <+4>:     sub    $0x210,%rsp
   .
   .
   .
End of assembler dump.

Here the notable thing is:

0x0000000000001149 <+4>: sub $0x210,%rsp

My question is:
Why does it subtract $0x210 (528 bytes) when I only asked for 500 bytes, which would be $0x1f4?

Boann
Shubham
  • What compiler, and what optimization options? – Nate Eldredge Nov 06 '21 at 15:50
  • @NateEldredge It's `gcc (Debian 10.2.1-6) 10.2.1 20210110` with default settings. – Shubham Nov 06 '21 at 16:48
  • Try with `#define MAX 100` and notice if `sub $0xXXX,%rsp` is 400 lower or not. This provides insight concerning padding and how much _base_ memory was needed even if `MAX` was theoretically zero. – chux - Reinstate Monica Nov 06 '21 at 17:00
  • @chux-ReinstateMonica Yeah, lately I was thinking about the padding (slack bytes). BTW, the output is: `0x0000000000001149 <+4>: add $0xffffffffffffff80,%rsp` – Shubham Nov 06 '21 at 17:08
  • @NateEldredge Defining `MAX 0` gives me this assembly code: `0x0000000000001149 <+4>: sub $0x10,%rsp`, i.e. 16 bytes of buffer, as @chux-ReinstateMonica said. – Shubham Nov 06 '21 at 17:28

1 Answer

I am guessing you are using gcc and compiling without optimizations, like this (godbolt).

There are a couple of things going on here:

First, when compiling without optimizations, the compiler tries to ensure that every local variable has an address in memory, so that it can easily be inspected or modified by a debugger. This includes function parameters, which on x86-64 are otherwise passed in registers. So the compiler needs to allocate additional stack space where the argc and argv parameters can be "spilled". You can see the spilling at lines 5 and 6 of the assembly:

        movl    %edi, -516(%rbp)
        movq    %rsi, -528(%rbp)

If you look carefully, you may note that the compiler wasted 4 bytes by placing argc (from %edi) at address -516(%rbp) when -520(%rbp) was otherwise available. It's not entirely clear why, but after all, it's not optimizing! So that gets us to 516 bytes.

The other issue is that the x86-64 ABI requires 16-byte stack alignment; see "Why does the x86-64 / AMD64 System V ABI mandate a 16 byte stack alignment?". In this case, to make a long story short, it implies that our stack adjustment needs to be a multiple of 16 bytes. (The return address and pushed rbp add a further 16 bytes, which don't disturb this alignment.) So our 516 must be rounded up to the next multiple of 16, which is 528.

If the compiler had been more careful and not wasted those 4 bytes in between argc and argv, we could have got away with only 512 bytes. One benefit of using 528, though, is that the buffer str ends up 16-byte aligned. This isn't required for an array of char, whose minimum alignment is just 1, but it can make it more efficient for string functions like strcpy to use fast SIMD algorithms. I am not sure if the compiler is doing this deliberately or if it's just a coincidence.

Nate Eldredge
  • Okay, tell me if I got it correct: **there's 16-byte stack alignment, so when asked for 500 `char`s, I got 512 bytes of memory, then `argc` got pushed and memory got increased by `(512+16 = 528)`.** Correct? – Shubham Nov 06 '21 at 17:44
  • @Shubham: Argc and argv weren't literally `push`ed, they were stored with `mov` into the space reserved with `sub`. So it's more like 500+8+4 = 512, and then GCC wasted an extra 16 bytes. (See [Why does GCC allocate more space than necessary on the stack, beyond what's needed for alignment?](https://stackoverflow.com/q/63009070)) There would have been room for the array plus spilling the stack args, with RSP aligned by 16, and the array starting at the bottom of that, thus also aligned. Of course, only a debug build would actually store main's incoming register args to memory for this. – Peter Cordes Nov 06 '21 at 18:07
  • @PeterCordes Thanks for your comment. I think I need to dive deeper into assembly to fully understand what's going on. But at least this answer and all these comments are making some sense to me. – Shubham Nov 06 '21 at 18:14