
Why do I get "Killed" when I run this program?

#include <stddef.h>

int main() {
    constexpr size_t N{5'000'000'000};
    unsigned int *bigdata;
    bigdata = new unsigned int[N];
    for (size_t i=0; i<N; i++) 
        bigdata[i] = i;
    return 0;
}

Setting N to a much larger value, such as 50'000'000'000, results in an std::bad_alloc. With a smaller value, such as 1'000'000'000, the program works fine. But with 5'000'000'000, it fails quite unpredictably while writing to the array, abruptly printing "Killed" on the console and exiting with code 137. Is there any way to know what size of arrays it is safe to use on a given system?

Additional info

Someone marked this question as redundant, linking to How to get available memory C++/g++?. It is not unthinkable that this question has been answered before, but the linked page is not it. Using the code example provided in the top-rated answer still generates the same error. Here is the full program:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

unsigned long long getTotalSystemMemory() {
  long pages = sysconf(_SC_PHYS_PAGES);
  long page_size = sysconf(_SC_PAGE_SIZE);
  return pages * page_size;
}

int main() {

  size_t N = getTotalSystemMemory();
  printf("Size: %zu \n", N);

  unsigned int *bigdata = (unsigned int *)malloc(N);
  for (size_t i = 0; i < N / sizeof(unsigned int) - 1; i++)
    bigdata[i] = i;
  return 0;
}

The output:

Size: 16'439'369'728       // Separators added manually 
Killed
user23952
    Some systems *overcommit* memory allocations: they allow you to allocate more than can actually be mapped to your process. That's because the memory isn't actually allocated when you ask for it; instead, the pages of virtual memory will be mapped to your process when you actually use them, which you do in the loop. Once the OS notices that it's no longer possible to map pages to your process, it will kill your process. – Some programmer dude Aug 01 '23 at 10:55
  • 1
    As for how to handle such large arrays, do you ***really*** need it? What kind of problem is that supposed to solve? There are many other data-structures that might be more suitable for that problem. – Some programmer dude Aug 01 '23 at 10:56
  • 1
    When you create your array of size N here, you are asking your system to allocate 5 Gigabytes of contiguous memory. This is absolutely enormous, and can fail in the two main cases: either you don't have enough RAM available, or there is no place in RAM where the 5GB you request can be allocated contiguously. In both of these cases, it will kill your process (or to be more precise, your process will crash). As pointed out by another commenter, do you **really** need an array this big? Most systems simply won't be able to run your code without crashing. – RedStoneMatt Aug 01 '23 at 11:13
  • With this amount of data is probably better to use a memory mapped file. – Pepijn Kramer Aug 01 '23 at 11:21
  • @Someprogrammerdude - should that be 20 Gigabytes? Assuming unsigned int is 4 bytes. – Mick Waites Aug 01 '23 at 11:53
  • *Is there any way to know what size of arrays it is safe to use on a given system?* Change the code to `bigdata = new unsigned int[N]{};`. If it fails, it's too big. – Eljay Aug 01 '23 at 12:05
  • @RedStoneMatt: The allocated 20 GiB need only be contiguous in *virtual* memory. With current 64-bit systems, that is not a lot. And even if you consider that current CPUs typically only use 48 bits of the virtual address, finding a free range of 20 GiB is nearly guaranteed. The problem is more how much the OS can afford to allocate, given that physical memory is much more limited than virtual addresses. – prapin Aug 01 '23 at 12:07
  • My bad, I have written 5GB in my comment but it is indeed 20GB because an int is four bytes. This makes it even worse. – RedStoneMatt Aug 01 '23 at 12:22
  • @prapin "With current 64-bit systems, that is not a lot" WHAT?! Most computers around have between 8 and 16 GB of RAM. Computers who have more are usually high-end professional or gaming computers. However yes, virtual memory indeed, therefore making it more doable, but "finding a free range of 20 GB is nearly guaranteed" absolutely not, very few people have this much free RAM. – RedStoneMatt Aug 01 '23 at 12:24
  • So is there any way to find out how much contiguous memory can be safely allocated on a given system? Adding curly brackets per Eljay didn't work, the process is still killed (a sketch of that attempt follows these comments). @Someprogrammerdude, I don't *really* need it, and of course I can avoid it by structuring data differently. It's a question of learning what may or may not be done on a given system. – user23952 Aug 01 '23 at 12:47
  • 1
    @RedStoneMatt: The point is that it's not necessary to have 20 GB of free RAM, only 20 GB contiguous virtual address space, and virtual memory (which includes swapfile) to back it. No particular contiguity requirement applies to the backing memory. – Ben Voigt Aug 01 '23 at 19:56
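
As a rough illustration of the points discussed in these comments (overcommit, and why the value-initialization idea cannot turn the failure into a catchable error), here is a minimal sketch. The size N is taken from the question; the behavior described in the comments of the code is what one would typically observe on Linux with overcommit enabled, not guaranteed output.

#include <cstddef>
#include <cstdio>
#include <new>

int main() {
    constexpr std::size_t N{5'000'000'000};

    // nothrow new returns nullptr instead of throwing on failure.
    // With overcommit enabled, this usually "succeeds" even though
    // the ~20 GB cannot actually be backed by RAM + swap.
    unsigned int *bigdata = new (std::nothrow) unsigned int[N];
    if (!bigdata) {
        std::puts("allocation failed up front");
        return 1;
    }

    // Touching the pages (which value-initialization with {} also does)
    // is what forces them to be mapped; on an overcommitted system this
    // loop is typically where the SIGKILL arrives.
    for (std::size_t i = 0; i < N; ++i)
        bigdata[i] = static_cast<unsigned int>(i);

    delete[] bigdata;
    return 0;
}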

1 Answer


Both Killed and exit code 137 (== 128 + 9 == 128 + SIGKILL) correspond to the SIGKILL signal, which is always fatal (the process cannot catch, block, or ignore it).
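
For illustration only, here is a minimal POSIX sketch of the decoding a shell performs when it reports 137: the child is reaped with waitpid, and a death by signal is reported as 128 plus the signal number. The binary name ./bigalloc is a hypothetical stand-in for the program from the question.

#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    pid_t pid = fork();
    if (pid == 0) {
        // Child: run the hypothetical test program from the question.
        execl("./bigalloc", "./bigalloc", (char *)nullptr);
        _exit(127);  // only reached if exec failed
    }

    int status = 0;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))
        // For SIGKILL this prints 9; a shell would report 128 + 9 = 137.
        std::printf("killed by signal %d\n", WTERMSIG(status));
    else if (WIFEXITED(status))
        std::printf("exited with code %d\n", WEXITSTATUS(status));
    return 0;
}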

One reason for the Linux kernel to send a process SIGKILL is that the process uses too much memory. More specifically, these are some possible cases:

  • The process tries to write to a previously unmapped virtual memory page, and there are no free pages in the system. The write is only possible after the virtual page becomes mapped to a physical memory page, but no such physical page is free at the time. This situation is possible because of overcommit (see also the comments below the question). When memory is allocated for global variables, with mmap, or with malloc (with a huge size), the corresponding data pages are not mapped yet. It's possible to read the data and get zeros back, but to write the data, the pages have to be mapped. This mapping happens upon the first write. If there are no physical pages free, the kernel sends the process a SIGKILL signal, which terminates the process.

    Interestingly, the man page mmap(2) documents SIGSEGV instead of SIGKILL.

    Search for /proc/sys/vm/overcommit_memory in man proc(5) for documentation on how to disable or configure overcommit on Linux (a small sketch for inspecting the current setting follows this list).

  • Linux also has the OOM killer which sends SIGKILL to some processes if the system runs out of memory. How quickly it reacts and whether it fits your observations well, I don't know.

    To test it, run your process, and quickly run dmesg after your process has received a SIGKILL. Some OOM killer messages should show up in the dmesg kernel log output. To test it further, run sudo sysctl -w vm.oom-kill=0 to disable the OOM killer, run your process again, and if SIGKILL doesn't happen, then it was the OOM killer (before it was disabled).
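
As a small companion sketch (not part of the original answer), this reads the current overcommit mode from /proc/sys/vm/overcommit_memory, the file referenced above; the meaning of each value is taken from proc(5).

// Minimal sketch: print the current Linux overcommit mode.
// Per proc(5): 0 = heuristic overcommit (the default),
//              1 = always overcommit, 2 = don't overcommit.
#include <fstream>
#include <iostream>

int main() {
    std::ifstream f("/proc/sys/vm/overcommit_memory");
    int mode = -1;
    if (f >> mode)
        std::cout << "vm.overcommit_memory = " << mode << '\n';
    else
        std::cerr << "could not read /proc/sys/vm/overcommit_memory\n";
    return 0;
}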

pts
  • This behavior is perplexing. Engineers are using Linux to run airplanes and nuclear power plants. Some of the algorithms they use operate on large amounts of data. From what I read the kernel can unpredictably SIGKILL an application that didn't do anything wrong after letting it run perhaps for decades. Not the answer I was hoping for, but this is the correct answer. – user23952 Aug 01 '23 at 18:37
  • 2
    @user23952: People shouldn't choose Linux to run critical systems unless they know how to correctly configure it (e.g. disabling swap and vm overcommit) – Ben Voigt Aug 01 '23 at 19:57
  • @BenVoigt well stated, but I will bet you real money that there are plenty of engineers out there who operate critical systems who don’t know that they don’t know how to correctly configure those things you mention. Maybe at least they hang around at SO. – user23952 Aug 01 '23 at 20:12
  • 2
    @user23952: Please don't refer to those people as engineers. They would either be practicing engineering without a license, or if they do hold a license, in breach by practicing beyond their competency. Of course it's not required that every engineer know how to configure Linux; they only need to know that if they use it as the foundation of their system. – Ben Voigt Aug 01 '23 at 20:41