4

When we run a program, does the compiler "detect" the necessary amount of stack memory at compile time, so that each program gets its own "block" of stack memory?

Or is the stack memory of each program defined by the OS?

Who defines the amount of stack memory for each running application?

Or is there no such limit, so that each program can use all the stack memory it wants?

Peter Cordes
  • If the stack is continuous data, and we have tons of applications running at the same time we should have a way to create a "gap" of memory between apps – Vitor de oliveira Oct 18 '21 at 23:53
  • 7
    Each program has its own virtual memory space, which includes code, data, and stack. As far as the program knows, it has all the memory to itself. The stack size is typically a fixed size determined by the OS. – user3386109 Oct 18 '21 at 23:57
  • 2
    The compiler doesn't do anything *after compile*, because its job is finished when the code has been compiled. The compiler just converts text code into object files. The linker takes over at that point. The minimum stack size is allocated by the linker at link time, and it's usually a standard amount based on the OS. – Ken White Oct 19 '21 at 00:23
  • Normal C compilers like GCC, clang, MSVC, etc. don't try to statically analyze (during compile) how much stack space a program might require. It's up to you not to use more than the OS gives you, e.g. don't put huge VLAs in C automatic storage (i.e. on the stack as local vars). – Peter Cordes Oct 19 '21 at 03:14
  • The stack size of the thread created by the process loader is typically a fixed size determined by a field in the executable header. When the app was built, the linker set that field to whatever was specified in linker argument/s or to a default value. – Martin James Oct 19 '21 at 03:21
  • Of course, a process may later create more threads, further increasing the overall stack space. – Martin James Oct 19 '21 at 03:37
  • Embedded systems have their own way to define the stack's location and size. There is no general rule. – the busybee Oct 19 '21 at 09:06

1 Answer

4

On x86-64 Linux, the main thread's stack can grow to 8 MiB by default (the ulimit -s limit). For the memory layout of an x86 Linux process, see Ciro Santilli's answer to: Where is the stack memory allocated from for a Linux process?

For example, you could have something like the following:

Content                       Virtual address
_______________________________________________________________________

----------------------------- 0xFFFF_FFFF_FFFF_FFFF
Kernel
----------------------------- 0xFFFF_8000_0000_0000
Unavailable due to the canonical address requirement (PML4 or PML5 determines size of hole; smaller with 5 level paging)
----------------------------- 0x0000_8000_0000_0000
Stack grows downward from the top here
v v v v v v v v v
Maximum stack size is here
----------------------------
Process
----------------------------- 0x400000

For the unavailable section see Peter Cordes's answer here: Why does QEMU return the wrong addresses when filling the higher half of the PML4?.
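As a rough illustration of the "stack grows downward" part of the diagram, the sketch below compares the address of a local variable in a caller with one in its callee. The function names are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension assumed here so that the callee really gets its own frame. On x86-64 Linux one would expect the returned difference to be positive (the callee's local sits at a lower address), but nothing in the C++ standard promises that.

```cpp
#include <cstdint>

// Hypothetical helper: returns (caller's local address - callee's local address).
// noinline (GCC/Clang extension, an assumption) keeps this in a separate frame.
__attribute__((noinline))
std::intptr_t distance_to_caller(std::uintptr_t caller_addr) {
    volatile int callee_local = 0;  // lives in this function's stack space
    std::uintptr_t callee_addr =
        reinterpret_cast<std::uintptr_t>(&callee_local);
    return static_cast<std::intptr_t>(caller_addr - callee_addr);
}

// Hypothetical helper: a positive result suggests the stack grows toward
// lower addresses on this platform.
std::intptr_t caller_minus_callee() {
    volatile int caller_local = 0;  // lives in the caller's stack space
    return distance_to_caller(
        reinterpret_cast<std::uintptr_t>(&caller_local));
}
```

Calling caller_minus_callee() from main and printing the result is enough to see the direction of growth on a given machine.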

On Linux, the loader does not need to read the executable to find the stack size, because the stack size is not commonly stored in an ELF file. The OS simply assumes that a default stack size is enough for most programs.
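To see the limit the OS applies to the current process, a program can ask for it at run time. This is a minimal sketch, assuming a POSIX system where getrlimit and RLIMIT_STACK are available; the StackLimit struct and the query_stack_limit function are hypothetical names introduced only for this example.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_STACK, RLIM_INFINITY (POSIX)

// Hypothetical result type for this sketch.
struct StackLimit {
    bool ok;                   // false if getrlimit failed
    bool unlimited;            // true if the soft limit is RLIM_INFINITY
    unsigned long long bytes;  // soft limit in bytes (valid if ok && !unlimited)
};

// Queries the soft stack size limit of the calling process.
StackLimit query_stack_limit() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return StackLimit{false, false, 0};
    }
    if (lim.rlim_cur == RLIM_INFINITY) {
        return StackLimit{true, true, 0};
    }
    return StackLimit{true, false,
                      static_cast<unsigned long long>(lim.rlim_cur)};
}
```

The soft limit is the same value that ulimit -s reports in the shell (there in KiB), so the number returned here depends on how the process was started.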

You seem to misunderstand what it means to allocate stack space. The amount of stack space each function needs is determined during compilation, and the space itself is allocated at run time simply by subtracting that amount from RSP. In an unoptimized build that uses RBP as a frame pointer, when a process enters a function (including main) it will:

  1. Push RBP on the stack;

  2. Copy RSP into RBP;

  3. Subtract the size of the function's stack space from RSP.

Step 3 reserves the space the function works in. After those 3 steps, local variables are accessed using a negative offset relative to RBP. I have a recently deleted answer that addresses this question specifically, so I'll copy its text here:

Local variables are allocated on the stack. Memory for objects you create with new is allocated at run time by the allocator, which obtains memory from the OS using system calls. Local variables are accessed using a negative offset relative to RBP, and global variables are accessed using an offset relative to RIP (by default).

I had to study a bit of how that works because I've been in the process of writing an x86-64 OS and I had to understand this stuff in order to continue my development.

This is quite confusing for a beginner, so let's look at a concrete example. Create a main.cpp file and place the following into it:

int global_variable = 3;

void func(){
    int local_variable = 10;
    global_variable = 10;
    local_variable++;
}

int main(){
    int local_variable = 4;
    global_variable = 5;
    local_variable += 4;
    func();
    return 0;
}

Compile with the following:

g++ --entry main -static -ffreestanding -nostdlib main.cpp -omain.elf

Here we set the entry point to the main function with --entry main, ask for everything to be included in the executable with -static, and leave out the standard library with -nostdlib (-ffreestanding tells the compiler not to assume a hosted environment). This simplifies the output of objdump -d main.elf (a disassembly of the executable), which is the following:

user@user-System-Product-Name:~$ objdump -d main.elf

main.elf:     file format elf64-x86-64


Disassembly of section .text:

0000000000401000 <_Z4funcv>:
  401000:   f3 0f 1e fa             endbr64 
  401004:   55                      push   %rbp
  401005:   48 89 e5                mov    %rsp,%rbp
  401008:   c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
  40100f:   c7 05 e7 2f 00 00 0a    movl   $0xa,0x2fe7(%rip)        # 404000 <global_variable>
  401016:   00 00 00 
  401019:   83 45 fc 01             addl   $0x1,-0x4(%rbp)
  40101d:   90                      nop
  40101e:   5d                      pop    %rbp
  40101f:   c3                      retq   

0000000000401020 <main>:
  401020:   f3 0f 1e fa             endbr64 
  401024:   55                      push   %rbp
  401025:   48 89 e5                mov    %rsp,%rbp
  401028:   48 83 ec 10             sub    $0x10,%rsp
  40102c:   c7 45 fc 04 00 00 00    movl   $0x4,-0x4(%rbp)
  401033:   c7 05 c3 2f 00 00 05    movl   $0x5,0x2fc3(%rip)        # 404000 <global_variable>
  40103a:   00 00 00 
  40103d:   83 45 fc 04             addl   $0x4,-0x4(%rbp)
  401041:   e8 ba ff ff ff          callq  401000 <_Z4funcv>
  401046:   b8 00 00 00 00          mov    $0x0,%eax
  40104b:   c9                      leaveq 
  40104c:   c3                      retq

Here we see the main function and the func function, free of any unnecessary overhead to keep the example simple. In this unoptimized build, each function pushes RBP on the stack and copies RSP into RBP; main then subtracts the size of its stack space from RSP (sub $0x10,%rsp). func is a leaf function, so it skips that subtraction and keeps its local in the 128-byte red zone below RSP, which the x86-64 System V ABI guarantees is safe to use. The amount of stack space is known at compile time here because the sizes of these local variables are compile-time constants.

Afterwards, every access is either an offset relative to RIP (for global variables) or a negative offset relative to RBP (for local variables). In particular, the line movl $0x4,-0x4(%rbp) stores 4 into the local variable called local_variable. Then the line movl $0x5,0x2fc3(%rip) sets the global variable called global_variable to 5.

When you allocate an object with new, the memory comes from dynamic storage rather than the stack, and the size of the allocation may not be known until run time. The new expression is compiled to a call to the allocation function (operator new) with the requested size as an argument. The allocator in the standard library then uses system calls, when it needs to, to get memory from the OS.
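One way to observe what a new expression turns into is to replace the global operator new, which the C++ standard allows a program to do. The sketch below counts calls to the replaced function; the counter g_new_calls and the helper calls_for_one_new_expression are hypothetical names for this example, and forwarding to std::malloc is just one simple choice of underlying allocator, not a description of what libstdc++ does internally.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter: how many times the replaced operator new was called.
static std::size_t g_new_calls = 0;

// Replacement for the global allocation function used by new expressions.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (size == 0) size = 1;  // always hand malloc a non-zero request
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper: number of operator new calls made by one new expression.
std::size_t calls_for_one_new_expression() {
    const std::size_t before = g_new_calls;
    int* volatile p = new int(42);  // volatile keeps the allocation from being elided
    delete p;
    return g_new_calls - before;
}
```

The new expression reaches the allocator through an ordinary function call. Whether that call ends in a system call is up to the allocator, which typically hands out memory it already holds and only asks the OS for more when it runs out.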

Most programs link the standard library dynamically. That means the standard library is not included in the executable but is instead linked with it by the dynamic linker when the executable is launched. The functions of the C++ standard library are defined in a shared object (libstdc++) that contains the symbols of the C++ standard functions (including new).

When you call new from C++, the symbol of the function to call for allocating memory dynamically is kept in the final executable. The address of that function is determined by the dynamic loader at launch time, or lazily on the first call. Since libstdc++ is a shared object that can be loaded at any address, the function can end up anywhere in the address space. The dynamic loader finds it by looking the symbol up in the loaded libraries.

Peter Cordes
user123
  • 1
    The default stack growth limit for the main thread (`ulimit -s`) is 8MiB on Linux. That's also the amount pthreads allocates with mmap for new thread stacks, with a non-growing mapping (only lazy allocation by the kernel avoiding wasting physical memory for untouched pages). [How is Stack memory allocated when using 'push' or 'sub' x86 instructions?](https://stackoverflow.com/q/46790666) – Peter Cordes Oct 20 '21 at 03:05
  • It's not just possible, it's what actually happens in Linux, as my linked answer shows with `/proc//smaps` output. Other OSes may differ, although I wouldn't be surprised if they also let the virtual mapping for the stack grow. (I think on Windows, your process can actually crash if you don't touch pages below the previous stack pages one at a time. So that's definitely new virtual allocation, not just allocating physical pages to back it, otherwise it wouldn't matter how you touched it, or in what order, or where ESP/RSP were.) – Peter Cordes Oct 20 '21 at 10:53
  • Reading more of your answer, *In physical memory, the stack has a default allocated size* isn't right in general. (Note that this isn't specifically a [Linux] question.) There is no block of physical memory set aside for the stack, on systems that use virtual memory. As you say it's allocated in response to page faults. I think the word "default" is what sounds weird for the amount of a virtual mapping that's currently backed by physical pages. It wasn't a default, it's just what's currently happening. (On Linux, it was only 8KiB of the 132K initial virtual stack in my linked answer.) – Peter Cordes Oct 20 '21 at 11:00
  • *It is allocated by simple means of offsets relative to RBP.* - Nope, that's how `-O0` debug builds (or with `-fno-omit-frame-pointer`) *access* stack space, but it's only allocated by moving RSP. (Or accessing in the 128-byte red-zone below RSP, on x86-64 System V ABI. That's guaranteed to be safe, even if it's in a different page than the current RSP.) – Peter Cordes Oct 20 '21 at 11:06
  • Thanks for feedback Peter. I think my answer may be wrong now but the general idea is good especially for the second part. I think its easy to get the point and work from there. I do my best to contribute facts but as you point out I have some lacunes myself as anybody would. – user123 Oct 20 '21 at 18:29
  • Yeah, definitely the right idea, just could use a few tweaks, if/when you or I get around to editing it. – Peter Cordes Oct 20 '21 at 22:10
  • *statically allocated* implies static storage class, e.g. `static` or global. Those sizes *are* indeed known at compile time, but space for them is in .data, .bss, or .rodata, not on the stack. Stack space is used for automatic storage in normal C and C++ implementations. In ISO C++ it has to have a compile-time-constant size, too, but GNU C++ and even ISO C99 allow VLAs like `int arr[n]` where `n` is not a `constexpr`. This does require extra code generation, so you should say just for this function, not "always". Also, using RBP as a frame pointer is fully optional, debug-mode default. – Peter Cordes Nov 15 '21 at 15:40
  • Yes, by statically allocated I mean that the size of the allocation is known during compilation. I didn't know how to say that otherwise. The RBP frame pointer will be used in a lot of cases. It is surely not a general fit all answer. There's definitely other cases and it is just meant as a starting point for someone who stumbles upon the question has an idea for what to look into. Thanks for precising though. Appreciated. – user123 Nov 15 '21 at 16:00
  • Feel free to edit anything. If you think the answer needs some rework. Maybe you should post your own answer. That way you could clarify things. – user123 Nov 15 '21 at 16:05