19

For the following code:

foo(int n){
    int array[n];
}

I understand that this is invalid syntax and that it is invalid because the c++ standard requires array size to be set at compile time (although some compilers support the following syntax).

However I also understand the the following is valid syntax:

bar(int n){
    int *array = new int[n];
}

I don't understand why this is allowed, isn't it the same as creating an array where the size is determined at runtime? Is it good practice to do this or should I be using a vector if I need to do this instead?

Francis
  • 1,151
  • 1
  • 13
  • 29
  • 3
    So called variable length arrays. Here are some answers why they are not allowed in C++ -> https://groups.google.com/forum/#!topic/comp.std.c++/K_4lgA1JYeg They are valid in C BTW. – mip Dec 16 '14 at 00:55
  • 7
    The second case is heap-allocated, whereas the first (which is valid in C99) would be stack-allocated. – Thilo Dec 16 '14 at 00:55
  • 1
    In a version of C that supports VLAs, `int array[n];` creates an array object whose type is actually `int[n]`. `new int[n]` doesn't create an array type; it just yields an `int*` pointer that points to the first element of the allocated array. – Keith Thompson Dec 16 '14 at 01:28
  • @Thilo avoid predicting what will be on the heap and what on the stack. You can't know. I have a post discussion this here: http://stackoverflow.com/q/27298414/2003898 – dhein Dec 16 '14 at 10:20
  • @Zaibis so it is not stack/heap? The answer of the discussion suggests that it is the case for as good as any compiler. – MatthiasB Dec 16 '14 at 10:28
  • @MatthiasB I don't say this is wrong. But the point is: "heap" and "stack" are unequal to "storage types" where he says "The second case is heap-allocated" he had to say "The second case is allocated storage which is in [enviroment circumstances] allocated on the heap." But you cant predict what is on the heap and what will be on the stack just by the language. As it is decision of the enviroment how to treat stack/heap or even support this model. I don't say this won't be on the stack. I just want to clarify: You can't predict it without knowing OP's enviroment. – dhein Dec 16 '14 at 10:41
  • 1
    @Thilo Not a duplicate. The linked question asks why `int array[n]` is invalid. This question asks why `new int[n]` is *valid*. This means it's as far from being duplicate as possible. – milleniumbug Dec 16 '14 at 10:59

9 Answers9

27

That's because the former is allocated on the stack and the latter on the heap.

When you allocate something on the stack, knowing the size of the object is essential for correctly building it. C99 allows the size to be specified at run time, and this introduces some complications in building and dismantling the aforementioned stack, since you cannot calculate its size at compile time. Machine code must be emitted in order to perform said calculation during the execution of the program. This is probably the main reason why this feature wasn't included in the C++ standard.²

On the contrary, the heap has no fixed structure, as the name implies. Blocks of any size can be allocated with no particular order, as long as they do not overlap and you have enough (virtual) memory¹. In this case, knowing the size at compile time is not that relevant.

Also, remember that the stack has a limited size, mostly to detect infinite recursions before they consume all the available memory. Usually the limit is fixed around 1MB, and you rarely reach that. Unless you allocate large objects, which should be placed in the heap.

As of what you should use, probably a std::vector<int>. But it really depends on what you are trying to do.

Also note that C++11 has a std::array class, whose size must be known at compile time. C++14 should have introduced std::dynarray, but it was postponed because there is still much work to do concerning compile-time unknown size stack allocation.


¹ blocks are usually allocated sequentially for performance reasons, but that's not required.

² as pointed out, knowing the size at compile time is not a hard requirement, but it makes things simpler.

Stefano Sanfilippo
  • 32,265
  • 7
  • 79
  • 80
  • I would welcome `std::dynarray` and it could be allocated like `std::vector` on heap, because `std::vector` is not always what you need (it can grow, requires movable/copyable types, default constructors etc). – mip Dec 16 '14 at 01:18
  • @doc: Sounds like you want `std::unique_ptr`. Can be created with runtime size, but size can't change after creation, contents won't be moved. Doesn't help with default constructors, though. – Ben Voigt Dec 16 '14 at 01:31
  • @Stefano Sanfilippo: "That's because the former is allocated on the stack and the latter on the heap." This is simply wrong. The plain c standard doesn't even use the words "stack" or "heap" in it. and the c++ just drops them for his so called standard implemantation functions. But there COULD be a compiler that would do it vise versa and it would still be completly respecting the standards. Whats even more plausible: C code can be run in enviroments that doesn't even support stack/heap. So there can't it even be respected. I disscussed it here: http://stackoverflow.com/q/27298414/2003898 – dhein Dec 16 '14 at 10:32
  • @Zaibis pretty much what the answer to your question says. Although it's certainly possible to work without a stack and a heap, that was the reference model for elaborating the language specifications and the reason why VLAs are not in the C++ standard, IMHO. Also, I am not aware of any modern C/C++ compiler which does not follow this model, so the statement is probably true when considering this specific question. – Stefano Sanfilippo Dec 16 '14 at 13:07
8

In the first case you are allocating the memory space statically to hold the integers. This is done when the program is compiled and so the amount of storage is inflexible.

In the latter case you are dynamically allocating a memory space to hold the integers. This is done when the program is run, and so the amount of storage required can be flexible.

The second call is actually a function that talks to the operating system to go and find a place in memory to use. That same process does not happen in the first case.

fbrereto
  • 35,429
  • 19
  • 126
  • 178
5

int array[n] allocates a fixed-length array on the call stack at compile-time, and thus n needs to be known at compile-time (unless a compiler-specific extension is used to allow the allocation at runtime, but the array is still on the stack).

int *array = new int[n] allocates a dynamic-length array on the heap at run-time, so n does not need to be known at compile-time.

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
5

The one and only valid answer to your question is, because the standard says so.

In contrast to C99, C++ never bothered to specify variable length arrays (VLAs), so the only way to get variably sized arrays is using dynamic allocation, with malloc, new or some other memory-manager.

In fairness to C++, having runtime-sized stack-allocations slightly complicates stack-unwinding, which would also make exception-handling for the functions using the feature consequently more bothersome.

Anyway, even if your compiler provides that C99-feature as an extension, it's a good idea to always keep a really tight rein on your stack-usage:
There is no way to recover from blowing the stack-limit, and the error-case is simply left Undefined Behavior for a reason.

The easiest way to simulate VLAs in C++, though without the performance-benefit of avoiding dynamic allocation (and the danger of blowing the limit):

unique_ptr<T[]> array{new T[n]};
T.C.
  • 133,968
  • 17
  • 288
  • 421
Deduplicator
  • 44,692
  • 7
  • 66
  • 118
  • 1
    It's not so much a matter of "didn't bother" as that VLAs are actively harmful in some respects. – Ben Voigt Dec 16 '14 at 01:13
  • Yes, it makes stack-unwinding (and thus also exception-handling) for those functions featuring VLAs a bit more involved. That's not too bad though. – Deduplicator Dec 16 '14 at 01:18
  • That and as you already mentioned, there's no way to indicate allocation failure. – Ben Voigt Dec 16 '14 at 01:19
  • @Deduplicator Interesting snippet with `unique_ptr`. However, would there be an instance where it is preferable to use this instead of just a `vector`? – Pradhan Dec 16 '14 at 01:48
  • @Pradhan: This is slightly more lightweight. Does it matter? In most cases, no. – Deduplicator Dec 16 '14 at 05:18
5

In the expression

new int[n]

int[n] is not the type. C++ treats "new with arrays" and "new with non-arrays" differently. The N3337 standard draft has this to say about new:

When the allocated object is an array (that is, the noptr-new-declarator syntax is used or the new-type-id or type-id denotes an array type), the new-expression yields a pointer to the initial element (if any) of the array.

The noptr-new-declarator refers to this special case (evaluate n and create the array of this size), see:

noptr-new-declarator:

    [ expression ] attribute-specifier-seqopt

    noptr-new-declarator [ constant-expression ] attribute-specifier-seqopt

However you can't use this in the "usual" declarations like

int array[n];

or in the typedef

typedef int variable_array[n];

This is different with C99 VLAs, where both are allowed.

Should I be using vectors instead?

Yes, you should. You should use vectors all the time, unless you have a very strong reason to do otherwise (there was one time during the last 7 years when I used new - when I was implementing vector for a school assignment).

Community
  • 1
  • 1
milleniumbug
  • 15,379
  • 3
  • 47
  • 71
4

This is because the C++ language does not have the C feature introduced in C99 known as "variable length arrays" (VLA).

C++ is lagging in adopting this C feature because the std::vector type from its library fulfills most of the requirements.

Furthermore, the 2011 standard of C backpedaled and made VLA's an optional feature.

VLA's, in a nutshell, allow you to use a run-time value to determine the size of a local array that is allocated in automatic storage:

int func(int variable)
{
   long array[variable]; // VLA feature

   // loop over array

   for (size_t i = 0; i < sizeof array / sizeof array[0]; i++) {
     // the above sizeof is also a VLA feature: it yields the dynamic
     // size of the array, and so is not a compile-time constant,
     // unlike other uses of sizeof!
   } 
}

VLA's existed in the GNU C dialect long before C99. In dialects of C without VLA's, array dimensions in a declaration must be constant expressions.

Even in dialects of C with VLA's, only certain arrays can be VLA's. For instance static arrays cannot be, and neither can dynamic arrays (for instance arrays inside a structure, even if instances of that structure are allocated dynamically).

In any case, since you're coding in C++, this is moot!

Note that storage allocated with operator new are not the VLA feature. This is a special C++ syntax for dynamic allocation, which returns a pointer type, as you know:

int *p = new int[variable];

Unlike a VLA's, this object will persist until it is explicitly destroyed with delete [], and can be returned from the surrounding scope.

Kaz
  • 55,781
  • 9
  • 100
  • 149
  • In my experience, `new T[n]` is pretty much the definition of a "dynamic array". You say that dynamic arrays cannot be VLAs, but the way I see it, all dynamic arrays are VLAs. The sentence also implies that arrays inside a structure are dynamic arrays, but that doesn't match the definition I'm familiar with. An array inside a struct may sometimes have a dynamic scope sometimes, but even that's pushing it. – Mooing Duck Dec 16 '14 at 01:14
  • @MooingDuck It is pretty obvious that my comments are about the ISO C 1999 variable-length array feature, and not about any arrays in any language whose length is somehow variable. – Kaz Dec 16 '14 at 01:18
  • 1
    Having left and come back, I realize you're associating the dynamic with "dynamic length", wheras I was associating the dynamic with "dynamic scope". In this light, your post is not wrong, so I removed my first comment. – Mooing Duck Dec 16 '14 at 01:22
  • @MooingDuck The "dynamic scope" that I know is in Lisp, and related languages. "Dynamic", in some languages, also refers to that storage which C calls automatic. Lisp is one, and, I seem to recall, (ironically!) BCPL, a predecessor to C. Ritchie decided to rename BCPL's dynamic storage as "automatic". :) – Kaz Dec 16 '14 at 01:24
  • I checked, I used the wrong word. I was thinking of "dynamic storage duration" in §3.7.4. For completeness sake, other uses of "dynamic" in C++ are "dynamic initialization" (globals initialized as part of startup), "dynamic types" (more derived types and dynamic_cast), "dynamic object" (a variable???), "dynamic binding" (polymorphic classes), "dynamic exception specification" (variadic template magic), exceptions are caught by "dynamically" surrounding try blocks... the list is longer than I expected. – Mooing Duck Dec 16 '14 at 01:50
  • 1
    @MoongDuck: Aha, that "exceptions caught by dynamically surrounding blocks" is evidence of the terminology from other languages creeping back in. That sense of "dynamic" coincides with Lisp's "dynamic extent", and BCPL's "dynamic storage": having to do with procedure or block activations, and the stack. – Kaz Dec 16 '14 at 01:54
4

No, the second is not declaring an array. It's using the array form of operator new, and that specifically permits the first dimension to be variable.

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
4

Because it has different semantics:

If n is a compile-time constant (unlike in your example):

int array[n]; //valid iff n is compile-time constant, space known at compile-time

But consider when n is a runtime value:

int array[n]; //Cannot have a static array with a runtime value in C++
int * array = new int[n]; //This works because it happens at run-time, 
                          // not at compile-time! Different semantics, similar syntax.

In C99 you can have a runtime n for an array and space will be made in the stack at runtime. There are some proposals for similar extensions in C++, but none of them is into the standard yet.

Germán Diago
  • 7,473
  • 1
  • 36
  • 59
3

You can allocate memory statically on the stack or dynamically on the heap.

In your first case, your function contains a declaration of an array with a possible variable length, but this is not possible, since raw arrays must have fixed size at compile time, because they are allocated on the stack. For this reason their size must be specified as a constant, for example 5. You could have something like this:

foo(){
    int array[5]; // raw array with fixed size 5
}

Using pointers you can specify a variable size for the memory that will be pointed, since this memory will be allocated dynamically on the heap. In your second case, you are using the parameter n to specify the space of memory that will be allocated.

Concluding, we can say that pointers are not arrays: the memory allocated using a pointer is allocated on the heap, whereas the memory allocated for a raw array is allocated on the stack.

There are good alternatives to raw arrays, for example the standard container vector, which is basically a container with variable length size.

Make sure you understand well the difference between dynamic and static memory allocation, the difference between memory allocated on the stack and memory allocated on the heap.

Community
  • 1
  • 1
nbro
  • 15,395
  • 32
  • 113
  • 196
  • Ideally `int arr[n]` is wrong and should not be practiced. But, still many new CPP learners do that and it still works. Why does that happen? – asn Jul 19 '19 at 09:46