Memory considerations for an array in a function with a variable size

Question

I have the size of the array as a variable instead of as an actual number. For my program I call the function diagonalize three times with different values of array_size -- would the array be allocated and deallocated for each value of array_size, or would only one array be used and overwritten during the program? The code is below. Would it just be better to make three separate diagonalize functions which each internally declare an array with the size given by a unique global constant? Unfortunately I have to use arrays instead of vectors.

#include <mkl.h>
#include "mkl_lapacke.h"

void diagonalize(unsigned long long int array_size) {
    lapack_complex_double z[array_size];
}

Details:

I am compiling with icpc -std=c++11 -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -ldl 10_site_main.cpp

icpc version 19.0.4.243 (gcc version 4.8.5 compatibility)

The size of an array doesn't need to be a literal. But it does need to be a constant expression. For example `constexpr int x = 10; int my_array[x];` is okay. — François Andrieux, Sep 17 '20 at 17:06
No, `constexpr` is a `const` but the reverse is not true. Not all `const` are constant expressions while all `constexpr` are. — François Andrieux, Sep 17 '20 at 17:09
So will this code run without issue if I change the function to `void diagonalize(const unsigned long long int array_size)` ? I just added the word `const`. — Christina Daniel, Sep 17 '20 at 17:14
@ChristinaDaniel No, because `array_size` is not a constant expression. A constant expression is a value that is determined at compile time, in a way that lets the compiler know that value when it needs it. — François Andrieux, Sep 17 '20 at 17:15
Here is a link to the [GCC documentation for their VLA extension](https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html). It should answer most of the question. Or maybe not. It doesn't come right out and say it, but the comparison to `alloca` implies that the array is allocated on the stack. Stack allocations are extremely cheap, but stacks are small, so throwing a large, arbitrary amount of data on the stack can easily cause a stack overflow. — user4581301, Sep 17 '20 at 17:30
As stated by the current duplicate, VLAs are non-Standard. To properly answer this question we need to know what compiler and version is being used so that we can reference the correct behavior in the compiler's documentation. — user4581301, Sep 17 '20 at 17:37
I am compiling with `icpc -std=c++11 -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread -ldl 10_site_main.cpp` — Christina Daniel, Sep 17 '20 at 17:51
Add that, along with icpc's version number, to the question where potential answerers can easily find it. I don't know the Intel compiler worth beans so I'm of no more help. — user4581301, Sep 17 '20 at 17:58

Maxim Egorushkin · Accepted Answer · 2020-09-17T21:11:19.480

In

void diagonalize(unsigned long long int array_size) {
    lapack_complex_double z[array_size];
}

`z` is a variable-length array (VLA), a feature from C99. The C++ standard doesn't have VLAs, but some C++ compilers support them as an extension for built-in types.

The default stack size on Linux is around 8 MB, so `unsigned long long int` is overkill for `array_size`; `unsigned` would suffice.

Notably, the Linux kernel got rid of all its VLAs in 2018, because a VLA can underflow/overflow the stack and corrupt it, which provides an attack vector for kernel exploits.

Whether a VLA underflows the stack depends on how much stack space is available when the function is called, and that depends on the current call chain. One call chain may leave a lot of stack space, another not so much, which makes it hard to guarantee or prove that the function always has enough stack space for the VLA. Not using VLAs eliminates these opportunities to underflow/overflow the stack and corrupt it, which are among the most popular and easiest avenues for exploits.
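As a rough illustration of what "not using VLAs" could look like while still working with a plain array rather than a `std::vector`, here is a minimal sketch that puts the array on the heap. The names `diagonalize_heap` and `complex_t` are hypothetical, and `std::complex<double>` is only a stand-in so the sketch compiles without MKL; in the real program the element type would be `lapack_complex_double`. It sticks to C++11, matching the `-std=c++11` flag in the question.

```cpp
#include <complex>
#include <cstddef>
#include <memory>

// Stand-in for lapack_complex_double (assumption, so this compiles without MKL).
using complex_t = std::complex<double>;

// Hypothetical heap-based variant of diagonalize. It returns the sum of the
// real parts only so that the sketch has something observable to check.
double diagonalize_heap(std::size_t array_size) {
    // The array lives on the heap, so its size is not limited by the stack.
    // The trailing () value-initializes every element to zero.
    std::unique_ptr<complex_t[]> z(new complex_t[array_size]());

    for (std::size_t i = 0; i < array_size; ++i) {
        z[i] = complex_t(1.0, 0.0);
    }

    // z.get() yields the raw pointer that a C-style API would take.
    const complex_t* raw = z.get();
    double sum = 0.0;
    for (std::size_t i = 0; i < array_size; ++i) {
        sum += raw[i].real();
    }
    return sum;
}   // unique_ptr calls delete[] here, on every return path
```

Like the VLA, this array is allocated and released on every call. The difference is that a heap allocation is slower than a stack adjustment but is not constrained by the remaining stack space of the current call chain.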

> would the array be allocated and deallocated for each value of array_size, or would only one array be used and overwritten during the program?

It is an automatic function-local variable, allocated anew on the stack on each call. The allocation only reserves stack space and normally takes a single CPU instruction, such as `sub rsp, vla-size-in-bytes` on x86-64. When the function returns, all its automatic variables cease to exist.

See the "automatic storage duration" section of https://en.cppreference.com/w/cpp/language/storage_duration for full details.
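The per-call lifetime described above can be observed with a small sketch. The `Tracker` type and `live_during_call` function are hypothetical names made up for this illustration, and a fixed-size array is used instead of a VLA because VLAs are not standard C++; the lifetime rule for automatic variables is the same either way.

```cpp
// Hypothetical helper that counts how many instances exist and were created.
struct Tracker {
    static int live;     // objects currently alive
    static int created;  // objects constructed so far
    Tracker() { ++live; ++created; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;
int Tracker::created = 0;

// Returns how many Tracker objects are alive while the function is running.
int live_during_call() {
    Tracker scratch[4];    // automatic array: elements constructed on entry
    (void)scratch;         // silence unused-variable warnings
    return Tracker::live;  // read before scratch is destroyed on return
}
```

Each call to `live_during_call()` constructs four new elements and destroys them on return, so nothing is carried over or overwritten between calls: `Tracker::live` is back to zero after every call, while `Tracker::created` grows by four each time.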

> Would it just be better to make three separate diagonalize functions which each internally declare an array with the size given by a unique global constant?

It would make no difference because all automatic variables are destroyed and cease to exist when a function returns.
