
I allocated a large array on the stack after raising the stack size limit with setrlimit. When the array is declared in main() and then passed as an argument to a function, I get a segmentation fault. When the array is declared as a local variable inside a function, the code runs without any segmentation fault. I am running the code on an AMD x86-64 Linux box with 8 GB of RAM.

#include <cstdio>
#include <iostream>
#include <sys/resource.h>

using usll = unsigned long long;
void sumArray0();
double sumArray1(double c[], usll dim); 

int main(int argc, char* argv[])
{
    const rlim_t stackSize = 3 * 1024UL * 1024UL * 1024UL;
    struct rlimit rl;
    int result;

    printf("The required value of stackSize is %lu\n", stackSize);  
    result = getrlimit(RLIMIT_STACK, &rl);

    if (result == 0)
    {
        if (rl.rlim_cur < stackSize)
        {
            rl.rlim_cur = stackSize;
            result = setrlimit(RLIMIT_STACK, &rl);

            if (result != 0)
            {
                fprintf(stderr, "setrlimit returned result = %d\n", result);
            }
            else
            {
                printf("The new value of stackSize is %lu\n", rl.rlim_cur);
            }
        }
    }

    // // This seg faults
    // const usll DIM = 20000UL * 18750UL;  
    // double c[DIM];

    // for (usll i{}; i<DIM; ++i)
    // {
    //     c[i] = 5.0e-6;
    // }
    // double total = sumArray1(c, DIM); // Seg fault occurs here

    sumArray0(); // This works

    std::cout << "Press enter to continue";
    std::cin.get();

    return 0;
}

void sumArray0()
{
    double total{};
    const usll DIM = 20000UL * 18750UL; 
    double c[DIM];

    for (usll i{}; i<DIM; ++i)
    {
        c[i] = 5.0e-6;
    }

    for (usll i{}; i<DIM; ++i)
    {
        total += c[i];
    }

    std::cout << "Sum of the elements of the vector is " << total << std::endl;
}

double sumArray1(double c[], usll dim)
{
    double total{};

    for (usll i{}; i<dim; ++i)
    {
        total += c[i];
    }

    return total;
}

My questions are:
Why am I getting a stack overflow in the first case?
Is it because a new chunk of memory is requested in the call to the method sumArray1()?
Isn't the array accessed via a pointer when passed as argument to the method?

As recommended in "Giant arrays causing stack overflows", I always use std::vector and never allocate large arrays on the stack, to prevent issues like the one above. I would greatly appreciate any tweaks, tricks, or workarounds that make the call to sumArray1() work.
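
For comparison, here is a minimal sketch of the std::vector approach mentioned above, assuming the only requirement is a contiguous buffer of doubles. The names `sumVector` and `fillAndSum` are hypothetical and not part of the code in the question.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sums a buffer whose elements live on the heap.
// The vector object itself is small, so the caller's stack frame stays
// small no matter how many elements are stored.
double sumVector(const std::vector<double>& values)
{
    return std::accumulate(values.begin(), values.end(), 0.0);
}

// Hypothetical helper: builds the buffer and sums it. To mirror the
// question, count would be 20000UL * 18750UL and value 5.0e-6.
double fillAndSum(std::size_t count, double value)
{
    std::vector<double> c(count, value);
    return sumVector(c);
}
```

Because the elements are allocated from the free store, this version does not depend on the stack size limit at all.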

unbound37
  • *However, it is also true that the performance boost can be quite phenomenal when coding against the stack* Do you have any proof of that? A vector should be just as fast as an array as long as you turn on optimizations. The "stack" and "heap" are both in RAM, so one isn't intrinsically faster than the other. – NathanOliver Dec 10 '19 at 18:46
  • "*However, it is also true that the performance boost can be quite phenomenal when coding against the stack.*" No, what gives the performance boost is avoiding allocations, not the fact that it is the stack. – Acorn Dec 10 '19 at 18:47
  • Your code does not compile because you use `usll` before declaring it. Please don't shorten built-in names like that. It is hard to understand. You can use `std::uintmax_t` to get the largest possible type, but sizes of arrays are limited to `std::size_t` anyway. So use that. Also don't put `struct` in front of the type in a variable declaration in C++. It is unnecessary and potentially causes other problems. – walnut Dec 10 '19 at 18:47
  • *However, it is also true that the performance boost can be quite phenomenal when coding against the stack.* -- Most applications allocate once, enough memory, at startup or on "first use". After that, there is no "performance boost" you're getting. – PaulMcKenzie Dec 10 '19 at 18:58
  • Please also verify that the rest of the code is *exactly* what you ran to test this. In particular make sure that you did not add or remove any `cout`. – walnut Dec 10 '19 at 19:05
  • @walnut Sorry, that was an oversight. Edited and moved the using declaration to the top – unbound37 Dec 10 '19 at 19:06
  • @walnut Yes, this is exactly what I ran, no cout in the method sumArray1(), and I keep getting "Segmentation fault (core dumped)" – unbound37 Dec 10 '19 at 19:35
  • You can create arrays dynamically and access them with pointer arithmetic, like so: x[ 3 ] = 120; This sets the element at index 3 to the value 120. The bracket operator is just syntactic sugar (special, simplified syntax) for pointer arithmetic. You can perform the same operation by writing: *( x + 3 ) = 120; If that's what you are looking for, I can put together a working example in an answer. – Ingo Mi Dec 10 '19 at 19:38
  • @NathanOliver-ReinstateMonica I hope this is not going to be one of those claims that a vector (memory allocated at run time) could be just as fast as one allocated at compile time. If you know of any optimization switches to make this possible, could you kindly let us know? – unbound37 Dec 10 '19 at 19:45
  • @user11601099 No memory is allocated at compile time. When your program launches it allocates the space it needs for the "stack". If you had a vector, it would do the same thing. A raw array isn't going to be any faster than a `std::vector` and will just cause you problems like the one you have once it gets big enough. – NathanOliver Dec 10 '19 at 19:47
  • @user11601099 Then I am wondering how you got a segmentation fault at all. Both GCC and Clang optimize-out the array in `main` completely if any optimization flag of at least `-O1` is given, because the result of the calculation is never used anywhere. This does not happen with `sumArray0`, because that is actually printing the result of the calculation. – walnut Dec 10 '19 at 20:38
  • @NathanOliver-ReinstateMonica I must admit you are right as I just noticed that using vector, reserve and emplace_back make it run just as fast. I have edited my original post accordingly. – unbound37 Dec 10 '19 at 20:39
  • Okay, so why not just use a `std::vector` since you know that'll work? – NathanOliver Dec 10 '19 at 20:45
  • @NathanOliver-ReinstateMonica I am not all that sure why that's happening. I built with g++ -Ofast -march=native -std=c++17 -Wall -W -fPIC. I should also point out that using 'sumArray1(std::vector)' on the same array reserved as vector did not cause segmentation. – unbound37 Dec 10 '19 at 21:00

1 Answer


It seems to me that you are compiling without optimizations enabled, because otherwise the array in main would be optimized out completely: the result computed from it is never used in any observable way. (Compiling without optimizations is rarely a good idea. Even for debugging purposes you should probably use -Og.)

In the comments you mention that you are using the -fPIC flag for compilation. This prevents gcc from inlining the sumArray1 call, so that the array cannot be optimized-out, no matter the optimization flags. You should probably use -fpie (which may already be the default) for an executable instead of -fPIC, which is meant for shared libraries and comes with these performance penalties, see also here and here.

If that is the case, then to answer your question: The problem is not the passing to a function. The problem is that stack space is allocated when the function is entered, so before the limits are set.
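
That passing the array is not the issue can be checked with a small sketch: a parameter written as `double c[]` is adjusted to `double*`, so the call copies only an address and a length, never the array itself. The names `sumArray` and `receivedAddress` below are hypothetical stand-ins, not the functions from the question.

```cpp
#include <cstddef>
#include <type_traits>

// A parameter declared as `double c[]` is adjusted to `double* c`,
// so this declaration has the type double(double*, std::size_t).
double sumArray(double c[], std::size_t dim);

static_assert(std::is_same<decltype(sumArray), double(double*, std::size_t)>::value,
              "array parameters are adjusted to pointer parameters");

double sumArray(double c[], std::size_t dim)
{
    double total{};
    for (std::size_t i{}; i < dim; ++i)
    {
        total += c[i];
    }
    return total;
}

// Hypothetical helper: returns the address the callee actually received,
// which is the address of the caller's first element.
const double* receivedAddress(const double c[])
{
    return c;
}
```

The callee's own stack frame therefore stays small regardless of how large the caller's array is.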

Now "allocating" here just means adjusting the stack pointer, but it is quite possible that main accesses some location in this large stack frame before the limit is set. In particular, the compiler can order the variables in the stack frame however it wants, or it may add a stack guard, depending on your compiler settings, etc.

Any such access before the limit is set would cause a segmentation fault.
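
A minimal sketch of the resulting ordering, assuming Linux and GCC or Clang: raise the limit from a function with a small frame, and keep the large array in a separate function that is never inlined. The names `raiseStackLimit`, `sumLargeLocalArray`, and `kElems` are hypothetical, and the 32 MiB array size is an arbitrary choice for illustration.

```cpp
#include <cstddef>
#include <sys/resource.h>

// Hypothetical helper: raises the soft stack limit to at least `bytes`.
// Returns false if the limit could not be read or raised.
bool raiseStackLimit(rlim_t bytes)
{
    rlimit rl{};
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
    {
        return false;
    }
    if (rl.rlim_cur >= bytes)
    {
        return true; // already large enough
    }
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_STACK, &rl) == 0;
}

constexpr std::size_t kElems = 4UL * 1024UL * 1024UL; // 32 MiB of doubles (assumed size)

// The large array belongs to this function's frame, which is only set up
// when the function is entered. The GCC/Clang-specific noinline attribute
// keeps the frame from being merged into the caller's frame.
__attribute__((noinline)) double sumLargeLocalArray()
{
    double c[kElems];
    for (std::size_t i{}; i < kElems; ++i)
    {
        c[i] = 5.0e-6;
    }
    double total{};
    for (std::size_t i{}; i < kElems; ++i)
    {
        total += c[i];
    }
    return total;
}
```

In this sketch, main would call `raiseStackLimit` first and call `sumLargeLocalArray` only if it returned true, while declaring no large local variables of its own.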

Note that the compiler is also free to inline functions. So, even doing things in sumArray0 may cause you trouble if the compiler decides to inline that function call, because then the array will become part of main's stack frame with the same issues as discussed above applying.

The compiler may recognize that inlining a function with a large stack frame is potentially dangerous and decline to do so, but that is something you would need to check in the compiler documentation.

In any case, compilers and operating systems do not expect programs to use large stack frames. That is not their purpose. The heap/free store is specifically there to handle memory allocations that are larger than usual stack frames. It is usually good practice to enable warnings for large stack frames and heed them.
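
As a sketch of what such a warning can catch, assuming GCC and its `-Wframe-larger-than=` option: the function below keeps a 512 KiB array in its frame. The function name, the file name `example.cpp`, and the 65536-byte threshold are arbitrary choices for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kBufferElems = 64UL * 1024UL; // 512 KiB of doubles (assumed size)

// Compiling with, for example,
//   g++ -Wframe-larger-than=65536 example.cpp
// should make GCC warn about this function, because its frame has to hold
// the whole 512 KiB array.
double sumLocalBuffer(double value)
{
    double buffer[kBufferElems];
    for (std::size_t i{}; i < kBufferElems; ++i)
    {
        buffer[i] = value;
    }
    double total{};
    for (std::size_t i{}; i < kBufferElems; ++i)
    {
        total += buffer[i];
    }
    return total;
}
```

Moving the buffer into a `std::vector` would shrink the frame to a few words and silence the warning.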

walnut
  • As suggested by this answer, the issue was resolved when the -fPIC switch was replaced by -fpie. I still wish though that there was a solution that includes the -fPIC option since I'd need to link the code with other dso libraries. – unbound37 Dec 10 '19 at 21:55
  • @user11601099 As I mentioned in my answer, the problem is merely that you are setting the limit inside the same function using the large stack allocation. If you do the allocation in another function and make sure that it is not inlined (e.g. by using some compiler-specific attribute, or putting it in another shared object file or using `-fPIC`) then there will be no problem. However, usually the user is supposed to set the stack limit before calling the program. I would not consider it to be the job of the program to change that limit. It can test it and report to the user, who can then act. – walnut Dec 10 '19 at 22:09
  • Yes, I explored some of those options, including allocating the memory inside a constructor function, `__attribute__((constructor(101)))`, both atop main() and inside a shared library, but still got a seg fault. What is the switch for turning off function inlining? – unbound37 Dec 10 '19 at 22:20
  • @user11601099 [How can I tell gcc not to inline a function?](https://stackoverflow.com/questions/1474030/how-can-i-tell-gcc-not-to-inline-a-function) You want to do the memory allocation *after* running the `setrlimit`. Just use another function as you did with `sumArray1` and give it the noinline attribute. If you are compiling with `-fPIC` it doesn't matter anyway, because that prohibits inlining by default. – walnut Dec 10 '19 at 22:22
  • Thanks for the link – unbound37 Dec 10 '19 at 22:30
  • This is probably what happens, but then it's a bug in GCC: "The storage is allocated at the point of declaration" quote from https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html – Daniel Jour Dec 10 '19 at 22:39
  • (And thus **not** "when the function is entered", as you claim) – Daniel Jour Dec 10 '19 at 22:40
  • @DanielJour That seems to refer specifically to variable-length arrays, which this is not (the size is declared `const` integral type and initialized by a constant expression). If you look at the generated code, you see that the stack pointer is decremented before anything from the body is executed (although as I said, this in itself does not imply the segmentation fault). – walnut Dec 10 '19 at 23:34
  • @walnut These **are** variable-length arrays. `const` means "protect against runtime modifications", not that this is a "constant expression" (that's `constexpr` in more recent standards). Consider: `const int n = ReadFromUser(); char buffer[n];` ... the size here is not a constant expression! – Daniel Jour Dec 11 '19 at 01:09
  • @DanielJour There is an explicit exception for `const` variables of integral type initialized by constant expressions, making their use constant expressions and therefore valid compile-time size arguments for arrays, see [\[expr.const\]/2.7.1](https://timsong-cpp.github.io/cppwp/n4659/expr.const#2.7.1). This exception predates the introduction of `constexpr`. You can also see that by adding the `-pedantic-errors` flag to g++, which would make it issue an error for variable-length arrays. – walnut Dec 11 '19 at 01:15
  • @DanielJour Your example is a variable-length array if and only if `ReadFromUser();` is not a constant expression (which given the naming it probably indeed isn't), but `20000UL * 18750UL` is a constant expression, making OP's use not a variable-length array and standard-compliant. – walnut Dec 11 '19 at 01:18
  • @walnut Oh, thanks for pointing that out. I didn't know about that "special case" :) – Daniel Jour Dec 11 '19 at 13:04
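
The rule discussed in the last few comments can be illustrated with a short sketch. It uses a small stand-in bound instead of `20000UL * 18750UL`, and `fixedArrayBytes` is a hypothetical name.

```cpp
#include <cstddef>

// A const integral variable initialized by a constant expression is itself
// usable in constant expressions, so it is a valid compile-time array bound.
const unsigned long DIM = 20UL * 18UL; // small stand-in for 20000UL * 18750UL

static_assert(DIM == 360UL, "DIM is usable in a constant expression");

// Hypothetical helper: the array below has a fixed size known at compile
// time, so it is an ordinary array and not a variable-length array.
std::size_t fixedArrayBytes()
{
    double c[DIM];
    static_assert(sizeof(c) == DIM * sizeof(double),
                  "the size of c is known at compile time");
    return sizeof(c);
}
```

If `DIM` were initialized from a run-time value instead, the `static_assert` lines would no longer compile, and `double c[DIM]` would rely on the variable-length array extension that `-pedantic-errors` rejects.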