8

I am curious whether it is possible to determine the maximum size that an array can have in C++.

    #include <iostream>

    using namespace std;

    #define MAX 2000000

    int main()
    {
      long array[MAX];
      cout << "Message" << endl;
      return 0;
    }

This compiles just fine, but then segfaults as soon as I run it (even though array isn't actually referenced). I know it's the array size too because if I change it to 1000000 it runs just fine.

So, is there some define somewhere or some way of having #define MAX MAX_ALLOWED_ARRAY_SIZE_FOR_MY_MACHINE_DEFINED_SOMEWHERE_FOR_ME?

I don't actually need this for anything, this question is for curiosity's sake.

Deanie
  • 2,316
  • 2
  • 19
  • 35
SirGuy
  • 10,660
  • 2
  • 36
  • 66
  • you just want to understand max array size in stack? maybe recursion works for you, then? not in a single array but in combined way? – tartar Mar 31 '12 at 05:29
  • 2
    @tar What does recursion have to do with this? – Cody Gray - on strike Mar 31 '12 at 05:29
  • This post doesn't answer your question, but should be an interesting read because it talks about the same seg fault problem: http://stackoverflow.com/questions/851122/large-2d-array-gives-segmentation-fault – K Mehta Mar 31 '12 at 05:29
  • if a recursive function without a base case creates a fixed size array again and again while printing level, then maybe we can say something about the approximate size of current stack? I am aware that recursive function calls occupy some space but still maybe as an approximation. – tartar Mar 31 '12 at 05:32
  • 3
    This array is NOT static. It's automatic. – R.. GitHub STOP HELPING ICE Mar 31 '12 at 05:35
  • @tar the array size is determined at compile time, how would I use recursion to figure it out? I could use templates but then it would just compile forever. Even if it did stop the program would just segfault right away anyways and I wouldn't know which array size caused the problem. – SirGuy Mar 31 '12 at 05:45
  • check my answer. it is not definite but still gives you a lower-bound – tartar Mar 31 '12 at 05:46
  • @CodyGray Just check the wiki page for the relation. local data and the function calls are all happening at stack. http://en.wikipedia.org/wiki/Call_stack – tartar Mar 31 '12 at 06:21
  • 1
    @tar I'm quite aware of what recursion *is*, I just don't understand what it has to do with this problem. Just because local variables and function calls all have information stored on the stack doesn't mean that you solve the solution with recursion. How do you recursively define objects to hold data? – Cody Gray - on strike Mar 31 '12 at 06:23
  • @CodyGray Just check my answer below. If recursive function also creates an array, then this local state will be held at the stack as well. Since the recursive call will not be returned due to missing base case, the program only terminates when stack space is consumed by recursive calls and their local data. http://stackoverflow.com/a/9953275/1273037 – tartar Mar 31 '12 at 06:27
  • @R.: The array is both statically sized (known at compile time) and automatic (scoped to a block and allocated/deallocated as part of the block). It is not `static` as in the C keyword, but that is not the only meaning of "static". –  Mar 31 '12 at 10:20
  • That's not a meaning of static used in the C language. The closest thing to that sense of the word "static" in C is "non-variably-modified". – R.. GitHub STOP HELPING ICE Mar 31 '12 at 19:36

4 Answers4

10

There isn't a way to determine this statically, because the actual limit depends on how much stack space your thread has been given. You could create a new thread, give it 10 megabytes of stack, and you would be able to allocate a correspondingly larger local array.

The amount you can allocate on the stack also depends on how much has already been used so far.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • That makes sense, I guess I don't understand too well the difference between how static arrays and dynamic arrays are handled in terms of memory allocation at runtime – SirGuy Mar 31 '12 at 05:29
  • 3
    When you declare a local variable, the space for the whole array is allocated from the stack space (this is unlike some other languages such as Java). When you declare something global, outside a function, the linker sets up the memory addresses so the space exists even before your program starts running. The stack has a limited size (usually 1 megabyte on a modern desktop OS, but can vary widely), while the static available space is the whole application address space of the machine (2 GB or so on a 32-bit system). – Greg Hewgill Mar 31 '12 at 05:34
  • Are there any performance differences between these two memory types? Also, _all_ programs use the same static space, even when they're not running? Is there some sharing that occurs in the static space or if you install a zillion programs you could just run out? I'm guessing issues with this don't come up very often then. – SirGuy Mar 31 '12 at 05:42
  • The CPU sees no difference between the two "kinds" of memory. They're just addresses with data in them. Address space is *not* shared between applications unless they explicitly cooperate to do so. Static data space is allocated at *run* time, not at install time. – Greg Hewgill Mar 31 '12 at 05:46
  • Okay, I was confused by the wording. – SirGuy Mar 31 '12 at 05:54
4
void foo(int level){
 int dummy[100];
 cerr<<level<<endl;
 foo(level+1); 
}

Then, maybe you can multiply the last printed level with 400 bytes. Recursive calls occupy most of the stack space I guess but you can get a lower-bound. I may miss some understanding of memory management here, so open to corrections.

So this is what I got on my machine with varying dummy array size.

level   array size    stack total
24257     100      9702800
 2597    1000     10388000
  260   10000     10400000
  129   20000     10320000
   64   40000     10240000
   25  100000     10000000
   12  200000      9600000
tartar
  • 688
  • 4
  • 16
  • I played with that solution for increasing size of integer array. Just to eliminate additional function call burden at each step. The max I got is when size of array is 10000 and the max stack size on my machine (ubuntu 10.04) is 10400000 bytes – tartar Mar 31 '12 at 05:57
  • Beware of tail call optimization here. –  Mar 31 '12 at 07:52
  • @daknok_t what's tail call optimization? – SirGuy Apr 02 '12 at 02:26
  • If the function returns directly after a function call, the called function can use the same stack frame. –  Apr 02 '12 at 11:24
3

Most variables declared inside functions are allocated on the stack, which is basically a block of memory of fixed size. Trying to allocate a stack variable larger than the size of the stack will cause a stack overflow, which is what the segfault is caused by.

Often the size of the stack is 8MB, so, on a 64-bit machine, long max[1000000] has size 8*1000000 < 8MB (the stack is "safe"), but long max[2000000] has size 8*2000000 > 8MB, so the stack overflows and the program segfaults.

On the other hand, dynamically allocating the array with malloc puts the memory into the heap, which is basically unbounded (4GB on a 32-bit machine, 17,179,869,184GB on a 64-bit machine).

huon
  • 94,605
  • 21
  • 231
  • 225
1

Please read this to understand the limitations that are set by hardware and compiler. I don't think that you can have a MAX defined for you to use it right away.

Community
  • 1
  • 1
Tejas Patil
  • 6,149
  • 1
  • 23
  • 38