How does the global variable declaration solve the stack overflow in C?

Question

I have some C code. What it does is simple, get some array from io, then sort it.

#include <stdio.h>
#include <stdlib.h>

#define ARRAY_MAX 2000000

int main(void) {
    int my_array[ARRAY_MAX];
    int w[ARRAY_MAX];
    int count = 0;

    while (count < ARRAY_MAX && 1 == scanf("%d", &my_array[count])) {
        count++;
    }

    merge_sort(my_array, w, count);
    return EXIT_SUCCESS;
}

And it works well, but if I really give it a group of number which is 2000000, it cause a stack overflow. Yes, it used up all the stack. One of the solution is to use malloc() to allocate a memory space for these 2 variables, to move them to the heap, so no problem at all.

The other solution is to move the below 2 declaration to the global scope, to make them global variables.

    int my_array[ARRAY_MAX];
    int w[ARRAY_MAX];

My tutor told me that this solution does the same job: to move these 2 variables into the heap.

But I checked some documents online. Global variables, without initialisation, they will reside in the bss segment, right?

I checked online, the size of this section is just few bytes. How could it prevent the stack overflow?

Or, because these 2 types are array, so they are pointers, and global pointers reside in data segment, and it indicates the size of data segment can be dynamically changed as well?

Short answer: your OS and the compiler are smart. If you declare a large global variable but don't assign it a value (outside of a function), the only thing that is stored in the executable file is the size and location of the variable. Once you load the executable, the OS will 'allocate' enough memory to hold the entire variable (it might not actually allocate RAM until you access the variable). — Ethan Reesor, Aug 12 '16 at 04:26
You should probably look at [Why is the bss segment required?](http://stackoverflow.com/questions/9535250/) I'm not sure whether this is a duplicate of that, but they're very closely related. — Jonathan Leffler, Aug 12 '16 at 04:27
@EthanReesor Thanks, but where does it store once I load the executable? As your context, In the heap? — Albert Gao, Aug 12 '16 at 04:48
AFAIK it's OS specific, but generally the BSS will be placed in a read-write section of the process' (virtual) memory space adjacent to the code; **definitely not the heap**. The difference between the BSS section and the data section is negligible if not nonexistent at runtime. The real difference is how the two sections are encoded in the executable. The data section stores the value of variables in the executable whereas the BSS section only stores size and location. If you're interested, tomorrow (PST) I'll make a detailed answer on process memory, including an overview of virtual memory. — Ethan Reesor, Aug 12 '16 at 05:25
@EthanReesor Thanks! Thanks for the clarification! Now I think I get the answer now :) But if you could provide more details, it would be very nice — Albert Gao, Aug 12 '16 at 05:53
Details have been provided. If anything is confusing, please comment. — Ethan Reesor, Aug 12 '16 at 18:28

score 2 · Accepted Answer · answered Aug 12 '16 at 04:20

2

The bss (block started by symbol) section is tiny in the object file (4 or 8 bytes) but the value stored is the number of bytes of zeroed memory to allocate after the initialized data.

It avoids the stack overflow by allocating the storage 'not on the stack'. It is normally in the data segment, after the text segment and before the start of the heap segment — but that simple memory picture can be more complicated these days.

Officially, there should be caveats about 'the standard doesn't say that there must be a stack' and various other minor bits'n'pieces, but that doesn't alter the substance of the answer. The bss section is small because it is a single number — but the number can represent an awful lot of memory.

answered Aug 12 '16 at 04:20

Jonathan Leffler

730,956
141
904
1,278

Thanks, I get you, your last sentence explains it. But the question is, how can it store some bytes in the bss segment first? In this phase, the program only compiled, but not run. There is no way it knows what exactly the size of these 2 global variables should be, right? – Albert Gao Aug 12 '16 at 04:53
All global variables in a C program have a size that's fixed at compile time, so the compiler (or, at least, the linker invoked by the compiler) _does_ know how big those global variables must be. – Jonathan Leffler Aug 12 '16 at 04:54
Aha, there it is, a prerequisite. Thanks, a new question, does it mean, these 2 global variables are reside in the bss first, then in the runtime, when I use them, they will get their space in the data segment. And the data segment can be dynamically changed, so it solves this problem? – Albert Gao Aug 12 '16 at 04:57
See the question I cross-referenced in another [comment](http://stackoverflow.com/questions/38909811/how-does-the-global-variable-declaration-solve-the-stack-overflow-in-c/38909853?noredirect=1#comment65177629_38909811). Once the program is loaded, there's no significant difference between the data and bss segments; it is effectively all data segment. The only difference is that the 'data' part in the object file has some non-zero initialization (often with a lot of zero bytes around too), but the bss is all zero bytes. Once that's sorted out, there's no difference. – Jonathan Leffler Aug 12 '16 at 04:59
Thanks, read it thoroughly. Could you check if I am right know? 1. all the 2 global variables are stored in the bss segment before the runtime. 2. When the program runs, the system will initilze the variables in bss even it is not assigned by coder. And it will assign 0 to it. 3. Since we declares an array which size is 20000000, The system will assign so many 0 for it. 4. And by this way, it solves the problem obviously. 5. And this means the size of bss gets bigger since at we've implicitly assigned bunches of 0 into it. This time, i am right, right? – Albert Gao Aug 12 '16 at 05:30
Basically correct, yes. It depends on precisely what you mean in (5). If you mean 'the number stored in the bss slot in the executable will be larger by the size of the two global arrays' — then yes; if you mean 'the space in the executable used by the bss will be larger' — then no. I'll assume you mean the first. In (2) you use 'even' which should perhaps be 'only' — as in, the system initializes variables in bss either if the programmer did not initialize the data to non-zero values, or if the programmer explicitly initialized them all to zero. – Jonathan Leffler Aug 12 '16 at 05:34
Thanks! All of this forms the answer! Thanks so much for your time! In the second paragraph of your answer, you mentioned it is normally in the data segment. Shouldn't it be in the bss segment as we discussed before? – Albert Gao Aug 12 '16 at 05:51
I've also seen ("*Block Storage by Symbol*"). I can't find the link, but there is a historical thread that attempts to trace the exact origin of the `bss` acronym. It too may be lost to history `:)` – David C. Rankin Aug 12 '16 at 18:49
@DavidC.Rankin: AcronymFinder list [BSS](http://www.acronymfinder.com/BSS.html) with both, though 'block started by symbol' is given 3 stars and 'block storage by symbol' is given 1 (out of 5 possible). Wikipedia mentions IBM 704 assembler when describing [`.bss`](https://en.wikipedia.org/wiki/.bss) — which is what I was told or read was the case at some point in the dim distant past. I can't find a definition for it in the Unix 7th Edition manuals (bss appears a fair amount; no definition of what it stands for, though). – Jonathan Leffler Aug 12 '16 at 18:58
@AlbertGao: Missed the question in your last comment. Once the program is loaded in memory, there is no meaningful difference between the data segment and the bss segment, and it is all typically treated as a single extent of memory. Effectively, the bss segment is an artefact of the the executable format that describes a chunk of the data segment with a single number. If you prefer to distinguish the data and bss segments after the program is loaded, there's no particular harm. (If you find the 7th Edition Unix manuals online, the END(3) entry describes `end`, `etext` and `edata`.) – Jonathan Leffler Aug 14 '16 at 04:00

Ethan Reesor · Answer 2 · 2016-08-12T18:39:17.930

Disclaimer: This is not a guide, it is an overview. It is based on how Linux does things, though I may have gotten some details wrong. Most (desktop) operating systems use a very similar model, with different details. Additionally, this only applies to userspace programs. Which is what you're writing unless you're developing for the kernel or working on modules (linux), drivers (windows), kernel extensions (osx).

Virtual Memory: I'll go into more detail below, but the gist is that each process gets an exclusive 32-/64-bit address space. And obviously a process' entire address space does not always map to real memory. This means A) one process' addresses mean nothing to another process and B) the OS decides which parts of a process' address space are loaded into real memory and which parts can stay on disk, at any given point in time.

Executable File Format

Executable files have a number of different sections. The ones we care about here are .text, .data, .bss, and .rodata. The .text section is your code. The .data and .bss sections are global variables. The .rodata section is constant-value 'variables' (aka consts). Consts are things like error strings and other messages, or perhaps magic numbers. Values that your program needs to refer to but never change. The .data section stores global variables that have an initial value. This includes variables defined as <type> <varname> = <value>;. E.g. a data structure containing state variables, with initial values, that your program uses to keep track of itself. The .bss section records global variables that do not have an initial value, or that have an initial value of zero. This includes variables defined as <type> <varname>; and <type> <varname> = 0;. Since the compiler and the OS both know that variables in the .bss section should be initialized to zero, there's no reason to actually store all of those zeros. So the executable file only stores variable metadata, including the amount of memory that should be allocated for the variable.

Process Memory Layout

When the OS loads your executable, it creates six memory segments. The bss, data, and text segments are all located together. The data and text segments are loaded (not really, see virtual memory) from the file. The bss section is allocated to the size of all of your uninitialized/zero-initialized variables (see VM). The memory mapping segment is similar to the data and text segments in that it consists of blocks of memory that are loaded (see VM) from files. This is where dynamic libraries are loaded.

The bss, data, and text segments are fixed-size. The memory mapping segment is effectively fixed-size, but it will grow when your program loads a new dynamic library or uses another memory mapping function. However, this does not happen often and the size increase is always the size of the library or file (or shared memory) being mapped.

The Stack

The stack is a bit more complicated. A zone of memory, the size of which is determined by the program, is reserved for the stack. The top of the stack (low memory address) is initialized with the main function's variables. During execution, more variables may be added to or removed from the bottom of the stack. Pushing data onto the stack 'grows' it down (higher memory address), increasing stack pointer (which maintains the address of the bottom of the stack). Popping data off the stack shrinks it up, reducing the stack pointer. When a function is called, the address of the next instruction in the calling function (the return address, within the text segment) is pushed onto the stack. When a function returns, it restores the stack to the state it was in before the function was called (everything it pushed onto the stack is popped off) and jumps to the return address.

If the stack grows too large, the result is dependent on many factors. Sometimes you get a stack overflow. Sometimes the run-time (in your case, the C runtime) tries to allocate more memory for the stack. This topic is beyond the scope of this answer.

The Heap

The heap is used for dynamic memory allocation. Memory allocated with one of the alloc functions lives on the heap. All other memory allocations are not on the heap. The heap starts as a large block of unused memory. When you allocate memory on the heap, the OS tries to find space within the heap for your allocation. I'm not going to go over how the actual allocation process works.

Virtual Memory

The OS makes your process think that it has the entire 32-/64-bit memory space to play in. Obviously, this is impossible; often this would mean your process had access to more memory than your computer physically has; on a 32-bit processor with 4GB of memory, this would mean your process had access to every bit of memory, with no room left for other processes.

The addresses that your process uses are fake. They do not map to actual memory. Additionally, most of the memory in your process' address space is inaccessible, because it refers to nothing (on a 32-bit processor it may not be most). The ranges of usable/valid addresses are partitioned into pages. The kernel maintains a page table for each process.

When your executable is loaded and when your process loads a file, in reality, it is mapped to one or more pages. The OS does not necessarily actually load that file into memory. What it does is create enough entries in the page table to cover the entire file while notating that those pages are backed by a file. Entries in the page table have two flags and an address. The first flag (valid/invalid) indicates whether or not the page is loaded in real memory. If the page is not loaded, the other flag and the address are meaningless. If the page is loaded, the second flag indicates whether or not the page's real memory has been modified since it was loaded and the address maps the page to real memory.

The stack, heap, and bss work similarly, except they are not backed by a 'real' file. If the OS decides that one of your process' pages isn't being used, it will unload that page. Before it unloads the page, if the modified flag is set in the page table for that page, it will save the page to disk somewhere. This means that if a page in the stack or heap is unloaded, a 'file' will be created that now maps to that page.

When your process tries to access a (virtual) memory address, the kernel/memory management hardware uses the page table to translate that virtual address to a real memory address. If the valid/invalid flag is invalid, a page fault is triggered. The kernel pauses your process, locates or makes a free page, loads part of the mapped file (or fake file for the stack or heap) into that page, sets the valid/invalid flag to valid, updates the address, then reruns the original instruction that triggered the page fault.

AFAIK, the bss section is a special page or pages. When a page in this section is first accessed (and triggers a page fault), the page is zeroed before the kernel returns control to your process. This means that the kernel doesn't pre-zero the entire bss section when your process is loaded.