152

I want to create a program that will simulate an out-of-memory (OOM) situation on a Unix server. I created this super-simple memory eater:

#include <stdio.h>
#include <stdlib.h>

unsigned long long memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
void *memory = NULL;

int eat_kilobyte()
{
    memory = realloc(memory, (eaten_memory * 1024) + 1024);
    if (memory == NULL)
    {
        // realloc failed here - we probably can't allocate more memory for whatever reason
        return 1;
    }
    else
    {
        eaten_memory++;
        return 0;
    }
}

int main(int argc, char **argv)
{
    printf("I will try to eat %i kb of ram\n", memory_to_eat);
    int megabyte = 0;
    while (memory_to_eat > 0)
    {
        memory_to_eat--;
        if (eat_kilobyte())
        {
            printf("Failed to allocate more memory! Stucked at %i kb :(\n", eaten_memory);
            return 200;
        }
        if (megabyte++ >= 1024)
        {
            printf("Eaten 1 MB of ram\n");
            megabyte = 0;
        }
    }
    printf("Successfully eaten requested memory!\n");
    free(memory);
    return 0;
}

It eats as much memory as defined in memory_to_eat, which is currently roughly 50 GB. It allocates memory 1 kB at a time and prints exactly the point where it fails to allocate more, so that I know the maximum amount it managed to eat.

The problem is that it works. Even on a system with 1 GB of physical memory.

When I check top, I see that the process uses 50 GB of virtual memory but less than 1 MB of resident memory. Is there a way to create a memory eater that really does consume it?

System specifications: Linux kernel 3.16 (Debian), most likely with overcommit enabled (not sure how to check that), no swap, and virtualized.

Petr
  • maybe you have to actually use this memory (i.e. write to it)? – m.s. Oct 20 '15 at 10:26
  • Maybe you actually need to use the memory or your compiler will just optimise that away? – Magisch Oct 20 '15 at 10:27
  • I don't think the compiler optimizes it; if that were true, it wouldn't allocate 50 GB of virtual memory. – Petr Oct 20 '15 at 10:28
  • @Magisch I don't think it's the compiler but the OS like copy-on-write. – cadaniluk Oct 20 '15 at 10:28
  • @Petr In any case, try writing something to it while eating. – Magisch Oct 20 '15 at 10:30
  • You are right, I tried to write to it and I just nuked my virtual box... – Petr Oct 20 '15 at 10:34
  • The original program will behave as you expected if you do `sysctl -w vm.overcommit_memory=2` as root; see http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting . Note that this may have other consequences; in particular, very large programs (e.g. your web browser) may fail to spawn helper programs (e.g. the PDF reader). – zwol Oct 20 '15 at 12:44
  • This answer http://stackoverflow.com/questions/25000496/python-script-terminated-by-sigkill-rather-than-throwing-memoryerror also has some useful information on the oom-killer and how to test with it. – rkh Oct 20 '15 at 16:38
  • You realloc memory to `(eaten_memory * 1024) + 1024`, which is `(0*1024) + 1024 -> 1024`. So you only get 1024 bytes in total. Didn't you mean `memory_to_eat` instead of the variable you use? – Luis Colorado Oct 21 '15 at 18:15
  • @LuisColorado I don't understand what you mean, realloc with `eaten_memory` of 0 is never called. It's called when it's 1 or more. This variable is incremented every time the function is called. – Petr Oct 22 '15 at 08:59
  • @Petr: by the way, you can pass a null pointer to `realloc`, in which case it behaves the same as a `malloc`. – Steve Jessop Oct 22 '15 at 11:04
  • @SteveJessop OK, I am primarily a C++ programmer, so I am not very familiar with these C functions, I updated the code so that it's more readable. Thanks! – Petr Oct 23 '15 at 10:06

5 Answers

224

When your malloc() implementation requests memory from the system kernel (via an sbrk() or mmap() system call), the kernel only makes a note that you have requested the memory and where it is to be placed within your address space. It does not actually map those pages yet.

When the process subsequently accesses memory within the new region, the hardware recognizes a segmentation fault and alerts the kernel to the condition. The kernel then looks up the page in its own data structures, finds that you should have a zero page there, maps in a zero page (possibly first evicting a page from the page cache), and returns from the interrupt. Your process does not realize that any of this happened; the kernel's operation is perfectly transparent (except for the short delay while the kernel does its work).

This optimization allows the system call to return very quickly, and, most importantly, it avoids committing any resources to your process when the mapping is made. This allows processes to reserve rather large buffers that they never need under normal circumstances, without fear of gobbling up too much memory.


So, if you want to program a memory eater, you absolutely have to actually do something with the memory you allocate. For this, you only need to add a single line to your code:

int eat_kilobyte()
{
    if (memory == NULL)
        memory = malloc(1024);
    else
        memory = realloc(memory, (eaten_memory * 1024) + 1024);
    if (memory == NULL)
    {
        return 1;
    }
    else
    {
        //Force the kernel to map the containing memory page.
        ((char*)memory)[1024*eaten_memory] = 42;

        eaten_memory++;
        return 0;
    }
}

Note that it is perfectly sufficient to write to a single byte within each page (which holds 4096 bytes on x86). That's because all memory allocation from the kernel to a process is done at page granularity, which is, in turn, because the hardware does not allow paging at smaller granularities.
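For reference, a minimal sketch of that idea (added for illustration, not part of the original answer): touch one byte per page, asking the system for the page size with sysconf(_SC_PAGESIZE) instead of hard-coding 4096. The touch_pages helper name is just illustrative.

#include <stddef.h>
#include <unistd.h>

/* Touch one byte in every page of the buffer so the kernel has to
   back each page with a real physical page. */
static void touch_pages(char *buf, size_t len)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);   /* typically 4096 */

    for (size_t off = 0; off < len; off += page)
        buf[off] = 42;
}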

cmaster - reinstate monica
  • It's also possible to commit memory with `mmap` and `MAP_POPULATE` (though note that the man page says "_MAP_POPULATE is supported for private mappings only since Linux 2.6.23_"). – Toby Speight Oct 20 '15 at 18:07
  • That's basically right, but I think the pages are all copy-on-write mapped to a zeroed page, rather than not present at all in the page tables. This is why you have to write, not just read, every page. Also, another way to use up physical memory is to lock the pages, e.g. call `mlockall(MCL_FUTURE)`. (This requires root, because `ulimit -l` is only 64 kiB for user accounts on a default install of Debian/Ubuntu.) I just tried it on Linux 3.19 with the default sysctl `vm/overcommit_memory = 0`, and locked pages use up swap / physical RAM. – Peter Cordes Oct 21 '15 at 04:00
  • @cmaster What about 4MB pages over PSE? And 1GB with x86-64? OK, the program will work fine anyway but some bytes are unnecessarily written to the buffer. If you assess this as true, please mention these other page sizes in your answer too. I could edit it myself, I guess, but I'd like someone to review that before. – cadaniluk Oct 21 '15 at 07:12
  • @cad While x86-64 supports two larger page sizes (2 MiB and 1 GiB), they are still treated quite specially by the Linux kernel. For instance, they are only used on explicit request, and only if the system has been configured to allow them. Also, the 4 kiB page still remains the granularity at which memory may be mapped. That's why I don't think that mentioning huge pages adds anything to the answer. – cmaster - reinstate monica Oct 21 '15 at 07:19
  • Doesn't this break the whole "if you don't get a nullpointer back it means, I, the OS have committed that memory unto you"? – Alec Teal Oct 21 '15 at 20:17
  • @AlecTeal Yes, it does. That's why, at least on Linux, it's more likely that a process that consumes too much memory is shot by the out-of-memory killer than that one of its `malloc()` calls returns `null`. That's clearly the downside of this approach to memory management. However, it is already the existence of copy-on-write mappings (think dynamic libraries and `fork()`) that makes it impossible for the kernel to know how much memory will actually be needed. So, if it didn't overcommit memory, you would run out of mappable memory long before you were actually using all the physical memory. – cmaster - reinstate monica Oct 21 '15 at 23:39
  • @cmaster I always thought it'd never commit beyond what it could do if you added in swap space. Which made sense. However I imagine that this was done with good reason, such a method wouldn't get any support unless there was significant evidence for it. – Alec Teal Oct 21 '15 at 23:46
  • Isn't it a page fault, not a seg fault? – Bill Barth Oct 22 '15 at 00:02
  • @BillBarth To the hardware there is no difference between what you would call a page fault and a segfault. The hardware only sees an access that violates the access restrictions laid down in the page tables, and signals that condition to the kernel via a segmentation fault. It's only the software side that then decides whether the segmentation fault should be handled by supplying a page (updating the page tables), or whether a `SIGSEGV` signal should be delivered to the process. – cmaster - reinstate monica Oct 22 '15 at 01:53
29

All the virtual pages start out copy-on-write mapped to the same zeroed physical page. To use up physical pages, you can dirty them by writing something to each virtual page.

If running as root, you can use mlock(2) or mlockall(2) to have the kernel wire up the pages when they're allocated, without having to dirty them. (normal non-root users have a ulimit -l of only 64kiB.)
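For illustration, a minimal sketch of the mlockall approach (added here, not part of the original answer; it assumes Linux and needs root or a raised memlock limit to lock any significant amount of memory):

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
    /* Lock current and all future mappings into RAM: every page the
       allocator obtains from the kernel is wired immediately, so no
       explicit write is needed to consume physical memory. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
    {
        perror("mlockall");   /* typically fails without root */
        return 1;
    }

    for (size_t kib = 0; ; kib++)
    {
        /* Note: the OOM killer may terminate the process before
           malloc() ever returns NULL. */
        if (malloc(1024) == NULL)
        {
            printf("Allocation failed after %zu kiB\n", kib);
            return 0;
        }
    }
}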

As many others suggested, it seems that the Linux kernel doesn't really allocate the memory unless you write to it.

Below is an improved version of the code, which does what the OP wanted. It also fixes the printf format string mismatches with the types of memory_to_eat and eaten_memory, using %zu to print size_t values. The memory size to eat, in kiB, can optionally be specified as a command line argument.

The messy design using global variables, and growing by 1 kiB instead of 4 kiB pages, is unchanged.

#include <stdio.h>
#include <stdlib.h>

size_t memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
char *memory = NULL;

void write_kilobyte(char *pointer, size_t offset)
{
    int size = 0;
    while (size < 1024)
    {   // writing one byte per page is enough, this is overkill
        pointer[offset + (size_t) size++] = 1;
    }
}

int eat_kilobyte()
{
    if (memory == NULL)
    {
        memory = malloc(1024);
    } else
    {
        memory = realloc(memory, (eaten_memory * 1024) + 1024);
    }
    if (memory == NULL)
    {
        return 1;
    }
    else
    {
        write_kilobyte(memory, eaten_memory * 1024);
        eaten_memory++;
        return 0;
    }
}

int main(int argc, char **argv)
{
    if (argc >= 2)
        memory_to_eat = atoll(argv[1]);

    printf("I will try to eat %zi kb of ram\n", memory_to_eat);
    int megabyte = 0;
    int megabytes = 0;
    while (memory_to_eat-- > 0)
    {
        if (eat_kilobyte())
        {
            printf("Failed to allocate more memory at %zi kb :(\n", eaten_memory);
            return 200;
        }
        if (megabyte++ >= 1024)
        {
            megabytes++;
            printf("Eaten %i  MB of ram\n", megabytes);
            megabyte = 0;
        }
    }
    printf("Successfully eaten requested memory!\n");
    free(memory);
    return 0;
}
Peter Cordes
Magisch
  • Yes, you are right, that was the reason; I'm not sure about the technical background, but it makes sense. It's weird, though, that it allows me to allocate more memory than I can actually use. – Petr Oct 20 '15 at 10:44
  • I do think at the OS level the memory is only really used when you write into it, which makes sense considering the OS doesn't keep tabs on all the memory you theoretically have, but only on that which you actually use. – Magisch Oct 20 '15 at 10:45
  • @Petr mind If I mark my answer as community wiki and you edit in your code for future user readability? – Magisch Oct 20 '15 at 10:46
  • @Petr It's not weird at all. That's how memory management on today's OSes works. A major trait of processes is that they have distinct address spaces, which is accomplished by providing each of them a virtual address space. x86-64 supports 48-bits for one virtual address, with even 1GB pages, so, in theory, some Terabytes of memory **per process** are possible. Andrew Tanenbaum has written some great books about OSes. If you're interested, read them! – cadaniluk Oct 20 '15 at 10:47
  • @Magisch I don't mind modifying anything, you can change my code to make it easier to read, of course. Maybe just keep the defunct version so that it's clear to people what the problem in my question was. – Petr Oct 20 '15 at 10:48
  • @Petr Alright, I removed the edit from your question, edited it into this answer, and made this answer into a community wiki post. – Magisch Oct 20 '15 at 10:52
  • I wouldn't use the wording "obvious memory leak"; I don't believe that overcommit or this copy-on-write technique was invented to deal with memory leaks at all. – Petr Oct 20 '15 at 10:58
  • @Petr feel free to edit the Answer to your liking, as it is now a community wiki post and open to content contribs from all users. – Magisch Oct 20 '15 at 10:59
  • @Magisch, I'm not sure what you mean by "keep tabs on". The OS's _page tables_ describe the status of every page in a program's virtual address space regardless of whether the page is valid or invalid, demand-zero filled, resident in physical memory, mapped to a file, out in the swap partition, copy-on-write, mapped to I/O devices, ... – Solomon Slow Oct 20 '15 at 13:39
  • @jameslarge What I mean by that is that the kernel does not actually map virtual memory (e.g. memory allocated via malloc or similar) to physical memory until you write to it. – Magisch Oct 20 '15 at 13:40
  • @cad: not all modern operating systems work that way. In particular, Windows does not; malloc() will fail if Windows can't commit the memory for it. (It doesn't have to set aside physical RAM straight away, of course, it might just allocate space in the pagefile. But it never overcommits itself.) – Harry Johnston Oct 21 '15 at 00:36
  • @HarryJohnston Oh, OK, I only know about Linux and thought that applies to Windows as well. – cadaniluk Oct 21 '15 at 07:05
13

A sensible optimisation is being made here. The runtime does not actually acquire the memory until you use it.

A simple memcpy will be sufficient to circumvent this optimisation. (You might find that calloc still optimises out the memory allocation until the point of use.)
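For instance, a minimal sketch of that fix (added for illustration, not from the original answer; it uses memset instead of memcpy, which dirties the freshly allocated pages just as well, and the eat_kilobyte_for_real name is made up):

#include <stdlib.h>
#include <string.h>

/* Allocate one kilobyte and immediately write it, so the pages backing
   it are actually committed rather than merely reserved. */
static void *eat_kilobyte_for_real(void)
{
    void *p = malloc(1024);

    if (p != NULL)
        memset(p, 0xAA, 1024);
    return p;
}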

Bathsheba
  • Are you sure? I think if his allocation amount reaches the max of *virtual* memory available the malloc would fail, no matter what. How would malloc() know that nobody is going to use the memory?? It can't, so it must call sbrk() or whatever the equivalent in his OS is. – Peter - Reinstate Monica Oct 20 '15 at 10:33
  • I'm *pretty* sure. (malloc doesn't know but the runtime certainly would). It's trivial to test (although not easy for me right now: I'm on a train). – Bathsheba Oct 20 '15 at 10:36
  • @Bathsheba Would writing one byte to each page also suffice? Assuming `malloc` allocates on page boundaries what seems pretty likely to me. – cadaniluk Oct 20 '15 at 10:38
  • Ah, interesting. "runtime" being the Linux kernel, apparently "real" memory is only allocated upon accessing it. The sbrk() call or whatever is made alright, but to no effect except to add a little bookkeeping entry. Nice. – Peter - Reinstate Monica Oct 20 '15 at 10:39
  • @PeterSchneider what's the maximum virtual memory limit on a modern 64-bit system? That's a whole lot of bytes, certainly more than 50 GB. – el.pescado - нет войне Oct 20 '15 at 10:43
  • @PeterSchneider, BTW. `sbrk` is not the only way to allocate memory. For large allocations, `mmap` may as well be used. – el.pescado - нет войне Oct 20 '15 at 10:44
  • malloc is a function that may have side effects and there is no way the compiler can know what those side effects are. There is no way the compiler can safely optimize out the calls to malloc. – doron Oct 20 '15 at 10:45
  • @doron there's no compiler involved here. It's Linux kernel behavior. – el.pescado - нет войне Oct 20 '15 at 10:53
  • And there is a pretty good write-up here: http://www.linux-mag.com/id/827/ (but from 2001) and here: http://blog.zhangxianwei.com/?p=692. "The OS, by increasing the map size, only provides a mapping between the new addresses in the process space and the memory that those pages will occupy *when they are used.*" @doron That was my first thought and objection as well (not the compiler but the standard lib); but the "optimization" if you want to call it that is, as el.pescado says, opaquely (transparently? whatever -- invisibly) happening in the kernel. – Peter - Reinstate Monica Oct 20 '15 at 10:54
  • I think glibc `calloc` takes advantage of mmap(MAP_ANONYMOUS) giving zeroed pages, so it doesn't duplicate the kernel's page-zeroing work. – Peter Cordes Oct 21 '15 at 04:03
6

I'm not sure about this one, but the only explanation that I can think of is that Linux is a copy-on-write operating system. When one calls fork, both processes point to the same physical memory. The memory is only copied once one process actually WRITES to the memory.

I think that here, the actual physical memory is only allocated when one tries to write something to it. Calling sbrk or mmap may well only update the kernel's memory bookkeeping. The actual RAM may only be allocated when we actually try to access the memory.
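This is easy to observe from user space. The sketch below (added for illustration, not part of the original answer) reads the Linux-specific VmRSS field from /proc/self/status before and after writing to an allocated block; the vm_rss_kb helper name is made up:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the current resident set size in kB, as reported by the
   Linux-specific /proc/self/status file, or -1 on error. */
static long vm_rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long rss = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "VmRSS: %ld", &rss) == 1)
            break;
    fclose(f);
    return rss;
}

int main(void)
{
    size_t size = 256 * 1024 * 1024;    /* 256 MiB */
    char *p = malloc(size);

    if (p == NULL)
        return 1;
    printf("RSS after malloc: %ld kB\n", vm_rss_kb());
    memset(p, 1, size);                 /* dirty every page */
    printf("RSS after memset: %ld kB\n", vm_rss_kb());
    free(p);
    return 0;
}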

doron
  • `fork` has nothing to do with this. You'd see the same behaviour if you booted Linux with this program as `/sbin/init`. (i.e. PID 1, the first user-mode process). You had the right general idea with copy-on-write, though: Until you dirty them, newly-allocated pages are all copy-on-write mapped to the same zeroed page. – Peter Cordes Oct 21 '15 at 04:05
  • knowing about fork allowed me to make the guess. – doron Oct 21 '15 at 17:57
0

Basic Answer

As mentioned by others, allocating memory does not necessarily commit the corresponding RAM until the memory is actually used. This happens whenever you allocate a buffer larger than one page (usually 4 KiB on Linux).

One simple answer would be for your "eat memory" function to always allocate 1 KiB blocks instead of growing one increasingly larger block. This works because each allocated block starts with a header (at least the size of the allocated block), which the allocator has to write; with blocks of one page or less, that write lands in every page, so all of those pages get committed.
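A quick way to check that claim (a sketch added for illustration, not part of the original answer; it assumes the default glibc allocator and uses getrusage(), whose ru_maxrss field is reported in kilobytes on Linux):

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Peak resident set size in kB, as reported by the kernel (Linux). */
static long peak_rss_kb(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_maxrss;
}

int main(void)
{
    printf("peak RSS before: %ld kB\n", peak_rss_kb());

    /* Allocate 100 MiB as separate 1 KiB blocks without ever touching
       the payload. The allocator still writes a small header into each
       block, which dirties (and therefore commits) every page. */
    for (int i = 0; i < 100 * 1024; i++)
        if (malloc(1024) == NULL)
            return 1;

    printf("peak RSS after:  %ld kB\n", peak_rss_kb());
    return 0;
}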

Following Your Idea

To optimize your code as much as possible, you want to allocate blocks of memory aligned to the page size.

From what I can see in your code, you use 1024. I would suggest that you use:

#include <unistd.h>    // for getpagesize()

int size = getpagesize();

size_t block_size = size - sizeof(void *) * 2;

What voodoo magic is this sizeof(void *) * 2?! When using the default memory allocation library (i.e. not sanitizers, Electric Fence, valgrind, ...), there is a small header just before the pointer returned by malloc(), which includes a pointer to the next block and a size.

struct mem_header { void * next_block; intptr_t size; };

Now, using block_size, all your malloc() allocations should remain aligned to the page size we found earlier.

If you want to properly align everything, the first allocation needs to use an aligned allocation:

char *p = NULL;
posix_memalign((void **) &p, size, block_size);   // 'size' (the page size) is the alignment

Further allocations (assuming your tool only does that) can use malloc(). They will be aligned.

p = malloc(block_size);

Note: please verify that it is indeed aligned on your system... it works on mine.

As a result you can simplify your loop with:

for (;;)
{
    p = malloc(block_size);
    if (p == NULL)
        break;      // out of memory (or the commit limit was hit)
    *p = 1;         // dirty the block so its page gets committed
}

Until you create a thread, malloc() does not use mutexes. It still has to look for a free memory block, but in your case the blocks are allocated one after another with no holes in between, so it will be pretty fast.

Can it be faster?

Further note about how memory is generally allocated in a Unix system:

  • the malloc() function and related functions allocate blocks in your heap, which at the start is pretty small (maybe 2 MiB)

  • when the existing heap is full, it gets grown using the sbrk() function; as far as your process is concerned, the memory address always increases; that's what sbrk() does (contrary to MS-Windows, which allocates blocks all over the place)

  • using sbrk() once and then hitting the memory every "page size" bytes would be faster than using malloc()

    // needs <stdint.h> for intptr_t and <unistd.h> for sbrk();
    // 'size' is the page size obtained from getpagesize() above
    char *p = malloc(size);                        // get current "highest address"
    
    p += size;
    p = (char *)((intptr_t)p & -(intptr_t)size);   // align up to the next page boundary
    
    size_t total_mem = 50ULL * 1024 * 1024 * 1024; // 50 GiB
    void *start = sbrk(total_mem);                 // grow the heap in one call
    
    char *end = (char *)start + total_mem;
    for (; p < end; p += size)
    {
        *p = 1;    // one write per page is enough to commit it
    }
    

    note that the malloc() above may give you the "wrong" start address. But your process really doesn't do much, so I think you'll always be safe. That for() loop, however, is going to be as fast as possible. As mentioned by others, you'll get total_mem of virtual memory allocated "instantly", and then the RSS grows each time you write to *p.

WARNING: Code not tested, use at your own risk.

Alexis Wilke