
So I've got an interesting OS based problem for you. I've spent the last few hours conversing with anyone I know who's experienced with C programming, and nobody seems to be able to come up with a definitive answer as to why this behaviour is occurring.

I have a program that is intentionally designed to cause an extreme memory leak (as an example of what happens when you don't free memory after allocating it). On 64-bit operating systems (Windows, Linux, etc.), it does what it should: it fills physical RAM, then fills the swap space of the OS. On Linux, the OS then terminates the process. On Windows it does not; the program keeps running, and the eventual result is a system crash.

Here's the code:

#include <stdlib.h>
#include <stdio.h>

void main()
{
    while(1)
    {
        int *a;
        a = (int*)calloc(65536, 4);
    }
}

However, if you compile and run this code on a 32-bit Linux distribution, it has no effect on physical memory usage at all. It uses approximately 1% of my 4 GB of allocated RAM, and usage never rises after that. I don't have a legitimate copy of 32-bit Windows to test on, so I can't be certain whether this occurs on 32-bit Windows as well.

Can somebody please explain why the use of calloc will fill the physical ram of a 64 bit Linux OS, but not a 32 bit Linux OS?

root
  • It's a "feature" of Linux. You can allocate as much memory as you want, *as long as you don't actually **use** it*. Try writing to a random byte in the memory you allocate, and you will see your system start terminating random processes soon enough. – Some programmer dude May 02 '18 at 16:31
  • You never use the memory. This can be caught at multiple levels. The compiler could already eliminate the code, since it has no observable behavior. The OS could delay page allocations until there's a page fault, etc. To see something, fill the memory with some data other than `0`, which could be a special case. –  May 02 '18 at 16:33
  • Ok, I can modify this to use malloc and a for loop iterating over an array assigning values, which somebody else did, and it produces the same result. Secondly, that doesn't explain why this code *will* fill a 64 bit Linux OS' main memory. – root May 02 '18 at 16:35
  • The name is **overcommitting**. – Eugene Sh. May 02 '18 at 16:38
  • I don't think overcommitting is what I'm running into here either. My physical desktop PC has 16 GB of RAM. This virtual machine is using 4 GB. https://en.wikipedia.org/wiki/Memory_overcommitment – root May 02 '18 at 16:41
  • Did you even check the return value isn't `NULL`? – Tony Tannous May 02 '18 at 16:43
  • Yes, I did. This program on a 32 bit OS can run for 12,084 iterations before the pointer starts becoming null. (At least on my machine, your specific number *may* vary). – root May 02 '18 at 16:46

1 Answer


The malloc and calloc functions do not technically allocate memory, despite their names. They actually allocate portions of your program's address space, marked with OS-level read/write permissions. This is a subtle difference and is not relevant most of the time.

This program, as written, only consumes address space. Eventually, calloc will start returning NULL but the program will continue running.

#include <stdlib.h>
// Note main should be int.
int main() {
    while (1) {
        // Note calloc should not be cast.
        int *a = calloc(65536, sizeof(int));
    }
}
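
To see where calloc starts returning NULL, the loop can be bounded and the successful calls counted. This is only a sketch: the names `count_successful_allocations` and `sink` and the `max_calls` cap are made up for illustration, and the blocks are leaked on purpose, just like in the program above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Storing each pointer in a volatile variable makes it escape, so the
 * compiler cannot remove the otherwise unused calloc calls. */
static void *volatile sink;

/* Hypothetical helper: call calloc up to max_calls times without ever
 * freeing, and return how many calls succeeded before the first NULL. */
size_t count_successful_allocations(size_t block_size, size_t max_calls) {
    size_t succeeded = 0;
    while (succeeded < max_calls) {
        void *p = calloc(block_size, 1);
        if (p == NULL) {
            break; /* no more address space (or an allocator limit was hit) */
        }
        sink = p; /* deliberately leaked */
        succeeded++;
    }
    return succeeded;
}
```

On a 32-bit system, calling it with a block size of `65536 * 4` and a very large cap should stop after a few thousand calls; the asker reported 12,084 iterations on their machine, and the number will vary.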

If you write to the addresses returned from calloc, it will force the kernel to allocate memory to back those addresses.

#include <stdlib.h>
#include <string.h>
int main() {
    size_t size = 65536 * 4;
    while (1) {
        // Allocates address space.
        void *p = calloc(size, 1);
        // Stop once calloc fails; passing NULL to memset is undefined behavior.
        if (p == NULL) {
            return 1;
        }
        // Forces the address space to have allocated memory behind it.
        memset(p, 0, size);
    }
}

Writing to a single location in the block returned from calloc is not enough, because actual memory is allocated with the granularity of a page (4 KiB is the most common page size), so one write only forces one page to be backed. You don't have to write every byte, though: you can get by with just writing once to each page.
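
Writing once per page could look something like the sketch below. The helper names `touch_pages` and `system_page_size` are made up for illustration, and `sysconf(_SC_PAGESIZE)` assumes a POSIX system (it is not available on Windows). The extra write to the last byte is there because the block is not guaranteed to start on a page boundary.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write one byte at every page_size step of the block
 * [p, p + size), plus the final byte, and return the number of writes. */
size_t touch_pages(void *p, size_t size, size_t page_size) {
    if (p == NULL || size == 0 || page_size == 0) {
        return 0;
    }
    /* volatile keeps the compiler from discarding the stores */
    volatile unsigned char *bytes = p;
    size_t writes = 0;
    for (size_t off = 0; off < size; off += page_size) {
        bytes[off] = 1;
        writes++;
    }
    /* The block may straddle one more page than size / page_size. */
    bytes[size - 1] = 1;
    writes++;
    return writes;
}

/* Page size reported by the OS; falls back to 4096 if the query fails. */
size_t system_page_size(void) {
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : 4096;
}
```

In the loop above, `touch_pages(p, size, system_page_size())` could stand in for the memset call; it does far fewer writes but should still force every page of the block to be backed.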

What about the 64-bit case?

There is some bookkeeping overhead for allocating address space. On a 64-bit system, you get something like 40 or 48 bits of address space, of which about half can be allocated to the program, which comes to at least 8 TiB. On a 32-bit system this comes to 2 GiB or so (depending on kernel configuration).

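So on a 64-bit system you can allocate ~8 TiB of address space, and on a 32-bit system ~2 GiB. There is typically a small amount of overhead for each call to malloc or calloc, and that overhead is what causes the problems: the 32-bit program runs out of address space before the accumulated overhead becomes noticeable, while the 64-bit program can keep allocating long enough for the overhead alone to fill RAM and swap.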
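
The arithmetic can be sketched roughly as follows. The function names are made up for illustration, and the per-call overhead is an input you have to assume; the 4 KiB used in the note below is a guess (one page per call), not a measured value.

```c
#include <stdint.h>

/* Hypothetical helper: how many blocks of block_size bytes fit into
 * address_space bytes, i.e. how many calls succeed before NULL. */
uint64_t calls_to_fill(uint64_t address_space, uint64_t block_size) {
    return block_size ? address_space / block_size : 0;
}

/* Hypothetical helper: total bookkeeping overhead after that many calls,
 * given an ASSUMED overhead per call in bytes. */
uint64_t total_overhead(uint64_t calls, uint64_t overhead_per_call) {
    return calls * overhead_per_call;
}
```

With 256 KiB blocks (65536 * 4 bytes) and an assumed 4 KiB of overhead per call, ~8 TiB of address space allows 33,554,432 calls and 128 GiB of overhead, while ~2 GiB allows only 8,192 calls and 32 MiB of overhead. The first is enough to fill RAM and swap on a typical machine; the second is barely noticeable.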

See also Why malloc+memset is slower than calloc?

Dietrich Epp
  • If this is true, then why will it flood the physical ram and eventually the swap space of a 64 bit OS? – root May 02 '18 at 16:45
  • @LC14199 because that's how modern memory management is done. Your allocated memory is virtual until you actually write to it, then a physical page (actual chunk of memory) is needed and the RAM usage will increase. Take an OS book and read the memory-management chapter. – Tony Tannous May 02 '18 at 16:47
  • @TonyTannous, calloc writes to the memory after it's allocated it. It initialises all memory with a value of 0. Which therefore makes the memory no longer virtual. Or at least, that's how I currently understand it anyway. – root May 02 '18 at 16:50
  • @LC14199: This is incorrect. `calloc` does not always write to the memory after allocating it. – Dietrich Epp May 02 '18 at 16:51
  • https://www.tutorialspoint.com/c_standard_library/c_function_calloc.htm By very definition on this page, "The difference in malloc and calloc is that malloc does not set the memory to zero where as calloc sets allocated memory to zero." So now I'm rather curious as to know exactly how this works. I'm testing your code right now. Then of course there's this: https://stackoverflow.com/questions/1538420/difference-between-malloc-and-calloc – root May 02 '18 at 16:54
  • @LC14199: That wording is imprecise. See https://stackoverflow.com/questions/2688466/why-mallocmemset-is-slower-than-calloc/2688522#2688522 – Dietrich Epp May 02 '18 at 16:55
  • Your code works as intended, so thank you for that. I'm reading the other article right now. – root May 02 '18 at 16:58
  • So @DietrichEpp just to clarify, this has absolutely nothing to do with pages being swapped in or out of memory due to them not being used, it's related to the fact that calloc (on its own) isn't even writing to them in the first place, hence it not consuming the physical ram. Is this understanding correct? – root May 02 '18 at 17:16
  • @LC14199: Yes. The pages are not being used, RAM is not being used, swap is not being used, except for a little bit of overhead. – Dietrich Epp May 02 '18 at 17:18