18

I read that on Unix systems, malloc can return a non-NULL pointer even if the memory is not actually available, and trying to use the memory later on will trigger an error. Since I cannot catch such an error by checking for NULL, I wonder how useful it is to check for NULL at all?

On a related note, Herb Sutter says that handling C++ memory errors is futile, because the system will go into spasms of paging long before an exception will actually occur. Does this apply to malloc as well?

Puppy
  • 144,682
  • 38
  • 256
  • 465
fredoverflow
  • 256,549
  • 94
  • 388
  • 662
  • 3
    I think that you should not use malloc in C++ : http://stackoverflow.com/questions/184537/in-what-cases-do-i-use-malloc-vs-new – lc2817 Oct 30 '11 at 21:15
  • 1
    @lc2817 you should only use malloc if you are writing code with a C interface (i.e. functions that are to be used from C but written in C++) **and** the C code is responsible for freeing that memory. –  Oct 30 '11 at 21:17
  • @WTP thanks for the clarification. Though I don't know if that is the case here. – lc2817 Oct 30 '11 at 21:21
  • @Dror K., I do not understand the purpose of the bounty and a quick googling didn't help. The question already has an answer, does it mean you are looking for another, improved answer? – gsamaras Jan 30 '16 at 21:21
  • 2
    @gsamaras Hello, I've selected the option that indicates that an existing answer is worthy of a bounty. So the answer to your question is that I would like to reward an existing answer, and I'm not looking for a new one. – Dror K. Jan 30 '16 at 21:24

4 Answers

35

Quoting the Linux malloc(3) man page:

By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:

# echo 2 > /proc/sys/vm/overcommit_memory

You ought to check for a NULL return, especially on 32-bit systems, where the process address space can be exhausted long before the RAM is: on 32-bit Linux, for example, user processes typically get only 2-3 GiB of usable address space even on a machine with more than 4 GiB of total RAM. On 64-bit systems it might be useless to check the malloc return code, but it can still be considered good practice, and it does make your program more portable. And remember: dereferencing a null pointer is certain to kill your process; some swapping, by comparison, might not hurt much.
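For illustration, a minimal sketch of the basic check (the error message and the choice to exit are just one possible policy):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t n = 1000;
    int *p = malloc(n * sizeof *p);
    if (p == NULL) {                        /* allocation failed */
        fprintf(stderr, "out of memory allocating %zu ints\n", n);
        return EXIT_FAILURE;                /* or recover some other way */
    }
    /* ... use p ... */
    free(p);
    return EXIT_SUCCESS;
}
```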

If malloc happens to return NULL when one tries to allocate only a small amount of memory, then one must be cautious when trying to recover from the error condition as any subsequent malloc can fail too, until enough memory is available.
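One common approach to that recovery problem is to hold a small emergency reserve and release it on failure, so the error-handling path itself has memory to work with. A rough sketch; the reserve size and the single-retry policy are arbitrary choices for illustration:

```c
#include <stdlib.h>

static void *reserve;                 /* emergency reserve; 64 KiB is arbitrary */

void init_reserve(void) { reserve = malloc(64 * 1024); }

void *careful_malloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL && reserve != NULL) {
        free(reserve);                /* give the allocator some room back */
        reserve = NULL;
        p = malloc(n);                /* retry once; may still fail */
    }
    return p;
}
```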

The default C++ operator new is often a wrapper over the same allocation mechanisms employed by malloc().

  • 8
    +1 for quoting a good rant about how the Linux default is **broken**. A good program should always check the return value of `malloc`. If the user has misconfigured their system (or left it in a broken default configuration), then of course this may not help, but there's nothing you can do and the crash is outside your responsibility. But if you fail to check the return value of `malloc`, your program will break even when running on systems where the user/admin **actually cares about correctness** and has disabled overcommit. The user will then probably consider your program crap. :-) – R.. GitHub STOP HELPING ICE Oct 30 '11 at 21:54
  • 2
    Well, the truth is a bit more complicated than that. There are holes in process address space; for example, the program might not ever touch all pages in BSS, or change a page that is mapped in the Data Segment. An undercommit is usually a bigger problem on a desktop / server system than overcommit. And the swap partition, if enabled, provides some cushion too before things get really bad. – Antti Haapala -- Слава Україні Oct 30 '11 at 22:02
  • I disagree. Undercommit is not a problem because you can always just throw more swap at it. In any case if you have untouched bss/data pages, that means you have global variables (not just GOT/PLT there) which is a bigger problem. :-) Perhaps a few are necessary, but more than a page or two worth is almost surely indicative of design problems... – R.. GitHub STOP HELPING ICE Oct 31 '11 at 01:39
  • 1
Newbie-friendly system ;) The only time I have had to deal with the OOM killer was with a runaway process that had anyway brought the system to a grinding halt by swapping. – Antti Haapala -- Слава Україні Oct 31 '11 at 10:18
6

On Linux, you indeed cannot rely on malloc returning NULL when sufficient memory is not available, due to the kernel's overcommit strategy, but you should still check for it, because in some circumstances malloc will return NULL, e.g. when you ask for more memory than the machine has in total. The Linux malloc(3) manpage calls the overcommitting "a really bad bug" and contains advice on how to turn it off.
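For instance, a request larger than the address space itself fails immediately, overcommit or not. A small demonstration (the exact threshold at which malloc gives up depends on the system):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* More than the address space can hold: malloc returns NULL here
       even with overcommit enabled. */
    void *p = malloc(SIZE_MAX);
    printf("malloc(SIZE_MAX) returned %p\n", p);
    free(p);   /* free(NULL) is a no-op, so this is safe either way */
    return 0;
}
```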

I've never heard of this behavior occurring in other Unix variants.

As for the "spasms of paging", that depends on the machine setup. E.g., I tend not to set up a swap partition on laptop Linux installations, since the exact behavior you fear might kill the hard disk. I would still like the C/C++ programs that I run to check malloc return values, give appropriate error messages, and clean up after themselves when possible.
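To illustrate that last point, a sketch of the "report and clean up" style using the common goto-cleanup idiom (the function and its workload are made up for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

int process(size_t n)
{
    int ret = -1;
    double *a = malloc(n * sizeof *a);
    double *b = NULL;

    if (a == NULL) {
        fprintf(stderr, "process: out of memory for a\n");
        goto out;
    }
    b = malloc(n * sizeof *b);
    if (b == NULL) {
        fprintf(stderr, "process: out of memory for b\n");
        goto out;
    }
    /* ... do the actual work with a and b ... */
    ret = 0;
out:
    free(b);   /* free(NULL) is harmless, so unconditional frees are fine */
    free(a);
    return ret;
}
```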

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • 1
    Overcommit is neither a feature nor a bug, strictly speaking. It was just historical laziness: overcommit is a lot *easier* to implement than accounting for commit charge. Presumably some people got used to it and liked it (for whatever perverse reasons) and some even started writing programs that `malloc` 1gb as a sparse array and even more perverse things, so now we're stuck with it being on-by-default... – R.. GitHub STOP HELPING ICE Oct 31 '11 at 01:41
4

Checking the return of malloc doesn't, on its own, do much to make your allocations safer or less error-prone. It can even be a trap if it is the only test that you implement.

When called with an argument of 0, the standard allows malloc to return a sort of unique address, which is not a null pointer but which you nevertheless have no right to access. So if you only test whether the return is 0, but don't test the arguments to malloc, calloc or realloc, you might encounter a segfault much later.
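A small demonstration of the trap (whether malloc(0) returns NULL or a unique pointer is implementation-defined, so this may print either result):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *p = malloc(0);
    if (p != NULL) {
        /* The non-NULL test passed, but p points to zero usable bytes:
           reading or writing through it is undefined behavior. */
        printf("malloc(0) returned non-NULL: %p\n", p);
    } else {
        printf("malloc(0) returned NULL\n");
    }
    free(p);   /* valid in both cases */
    return 0;
}
```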

This error condition (memory exhausted) is quite rare in "hosted" environments. Usually you are in trouble long before you run into this kind of error. (But if you are writing runtime libraries, are a kernel hacker, or build rockets, things are different, and there the test makes perfect sense.)

People then tend to decorate their code with complicated, multi-line handling of that error condition, calling perror and the like, which can hurt the readability of the code.

I think this "check the return of malloc" rule is much overrated, sometimes even defended quite dogmatically. Other things are much more important:

  • Always initialize variables, always. For pointer variables this is crucial: let the program crash nicely before things get too bad. Uninitialized pointer members in structs are an important cause of hard-to-find errors.
  • Always check the arguments to malloc and Co. If the size is a compile-time constant like sizeof toto there can't be a problem, but always ensure that your vector allocations handle the zero case properly (see the sketch after this list).
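A small sketch of both points; the names are made up, and the zero-case policy (returning NULL without calling malloc) is just one possible convention:

```c
#include <stdlib.h>

struct node {
    struct node *next;   /* pointer members: initialize them, always */
    double *data;
};

/* Zero-case-aware vector allocation: decide explicitly what n == 0 means. */
double *alloc_vector(size_t n)
{
    if (n == 0)
        return NULL;              /* our convention: an empty vector is NULL */
    return malloc(n * sizeof(double));
}

struct node make_node(void)
{
    struct node nd = { 0 };       /* all pointer members start out NULL */
    return nd;
}
```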

An easy way to check the return of malloc is to wrap it in something like memset(malloc(n), 0, 1). This just writes a 0 into the first byte and crashes nicely if malloc returned NULL or n was 0 to start with.
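Wrapped up as a tiny helper, the trick might look like this (a sketch; as a comment below notes, a plain NULL check is simpler and safer):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the trick described above: touch the first byte right away,
   so a NULL return from malloc faults here instead of much later.
   (If malloc(0) returns a non-null pointer, the one-byte write is
   still undefined behavior, so guard the argument as advised above.) */
void *touch_malloc(size_t n)
{
    return memset(malloc(n), 0, 1);
}
```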

Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
  • 1
Let's just say it is much nicer to tell the user "Out of heap at line foo" than just "null pointer exception at bar"; for that, a simple (macro?) wrapper around malloc would suffice. This in case one does use ridiculous amounts of memory and can expect to use more than 2G on 32-bit systems. – Antti Haapala -- Слава Україні Oct 31 '11 at 08:24
  • Hi Jens, with today's compilers and static analyzers, I disagree with "always initialize variables" (I don't know how bad it was in 2011). It is trivial for them to detect an uninitialized use of a variable; but it is really hard to track a bug when you use a variable that has been initialized to a meaningless default value. – alx - recommends codidact Aug 10 '22 at 20:13
  • The compiler is allowed to reorder malloc() to after some other code, and that might cause unlimited damage before the actual crash. It's likely to help, but a NULL check is simpler _and_ safer. – alx - recommends codidact Aug 10 '22 at 20:16
1

To look at this from another angle:

"malloc can return a non-NULL pointer even if the memory is not actually available" does not mean that it always returns non-NULL. There might (and will) be cases where NULL is returned (as others already said), so this check is necessary nevertheless.

glglgl
  • 89,107
  • 13
  • 149
  • 217