
Until today I believed that calling free() on a block of memory simply releases it for further allocation without modifying it in any other way, especially considering this SO question, which clearly states that free() DOESN'T zero out memory.

Yet, let's consider this piece of code (test.c):

#include<stdlib.h>
#include<stdio.h>

int main()
{
    int* pointer;

    if (NULL == (pointer = malloc(sizeof(*pointer))))
        return EXIT_FAILURE;

    *pointer = 1337;

    printf("Before free(): %p, %d\n", pointer, *pointer);

    free(pointer);

    printf("After free(): %p, %d\n", pointer, *pointer);

    return EXIT_SUCCESS;
}

Compiling (both GCC and Clang):

gcc test.c -o test_gcc
clang test.c -o test_clang

Result:

$ ./test_gcc 
Before free(): 0x719010, 1337
After free(): 0x719010, 0
$ ./test_clang
Before free(): 0x19d2010, 1337
After free(): 0x19d2010, 0

Why is it so? Was I living in a lie all this time or did I misunderstand some basic concepts? Or is there a better explanation?

Some technical info:

Linux 4.0.1-1-ARCH x86_64
gcc version 4.9.2 20150304 (prerelease) (GCC)
clang version 3.6.0 (tags/RELEASE_360/final)
browning0
  • 10
    When the memory is returned to the allocation system, it may be used for any purpose the system likes. It might store control information in the memory space, modifying what was returned. There are no constraints on the allocators; they are neither required to modify nor required to leave unchanged the memory that was returned to them. Any access to freed memory is invalid. – Jonathan Leffler Jun 06 '15 at 13:50
  • 8
    For what it is worth, you are actually testing the same thing because `free` is part of the C library and both `gcc` and `clang` use the `glibc` on your system. Try allocating a huge chunk of memory instead of 8 bytes, say 16 MB and see if dereferencing the freed memory crashes. – chqrlie Jun 06 '15 at 14:03
  • 4
    It is entirely possible that the reason that you are seeing this specific behavior has to do with the metadata management of the dynamic memory library. Many use the first few bytes of unallocated chunks to track size, in use and pointers fore and aft. It is possible that in the process of releasing it has modified the data in such a way to create this behavior as a side effect since you have no business dereferencing the memory after you free it. :) – David Hoelzer Jun 06 '15 at 14:32
  • But how can this possibly matter??? Memory freed by `free` is generally not accessible. In any case, even if it is physically accessible, it is not *meaningfully* accessible. Why would anyone care? Moreover, a portion of the space in a free block is typically used for internal purposes of dynamic memory management, which means that it will not be totally zeroed out. – AnT stands with Russia Jun 06 '15 at 15:09
  • In your specific example you are most likely hitting exactly the area "borrowed" by the implementation for internal heap management purposes, which is why the value you observe changes. – AnT stands with Russia Jun 06 '15 at 15:18
  • @AnT I believe it could be useful in the debugging process. Consider this scenario: memory is filled with garbage before it is freed. Then, if someone tries to read from freed memory (assuming it does not result in a segfault), they will receive meaningless data, indicating a bug. – browning0 Jun 06 '15 at 15:19
  • 3
    @browning0: Well, as I stated in my answer, yes, this is what debug implementations typically do. But that only applies to debug implementations. And the *beginning* of a freed block is typically used for completely different housekeeping purposes. BTW, in your example, you are inspecting the beginning of the block specifically, which is not a good indication of what happens to the rest of the block. – AnT stands with Russia Jun 06 '15 at 15:54
  • 1
    @browning0 The environment variable [MALLOC_PERTURB_](http://udrepper.livejournal.com/11429.html) used by glibc can be used to fill the freed memory region with a pattern. – 4566976 Jun 06 '15 at 15:55
  • 2
    Also note that if, after calling free, your allocator decides to drop virtual pages, then when it maps them back again at a later time the kernel (in modern systems) will have wiped them clean upon faulting (either zeroed out or randomized), because it is a security failure to read another process's discarded memory pages. So there's really a lot going on; for all intents and purposes the contents of a memory buffer become indeterminate after freeing it. – Thomas Jun 07 '15 at 02:40

7 Answers

27

There's no single definitive answer to your question.

  • Firstly, the external behavior of a freed block will depend on whether it was released to the system or stored as a free block in the internal memory pool of the process or C runtime library. In modern OSes the memory "returned to the system" will become inaccessible to your program, which means that the question of whether it was zeroed-out or not is moot.

(The rest applies to the blocks retained in the internal memory pool.)

  • Secondly, there's little sense in filling freed memory with any specific value (since you are not supposed to access it), while the performance cost of such an operation might be considerable. That is why most implementations don't do anything to freed memory.

  • Thirdly, at the debugging stage, filling freed memory with some pre-determined garbage value can be useful in catching errors (like access to already freed memory), which is why many debug implementations of the standard library will fill freed memory with some pre-determined value or pattern. (Zero, BTW, is not the best choice for such a value. Something like a 0xDEADBABE pattern makes a lot more sense.) But again, this is only done in debug versions of the library, where performance impact is not an issue.

  • Fourthly, many (most) popular implementations of heap memory management will use a portion of the freed block for their own internal purposes, i.e. store some meaningful values there. This means that that area of the block is modified by free. But generally it is not "zeroed".

And all this is, of course, heavily implementation-dependent.

In general, your original belief is perfectly correct: in the release version of the code a freed memory block is not subjected to any block-wide modifications.
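The debug-fill idea from the third point can be sketched in plain C. This is only a minimal sketch, not how any particular runtime does it: `poison_block`, `debug_free`, and the fill byte `0xDD` are names and values made up for illustration, and the caller has to pass the block size because standard C has no portable way to query it. The fill goes through a `volatile` pointer because an optimizing compiler may otherwise drop stores to memory that is about to be freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fill byte; real debug runtimes choose their own patterns. */
#define FREED_FILL_BYTE 0xDD

/* Overwrite the first n bytes at p with the fill byte.
   Must be called while the block is still allocated. */
void poison_block(void *p, size_t n)
{
    assert(p != NULL || n == 0);

    /* volatile keeps the compiler from discarding these stores */
    volatile unsigned char *bytes = p;
    for (size_t i = 0; i < n; i++)
        bytes[i] = FREED_FILL_BYTE;
}

/* Poison the block, then release it. n is the size that was
   originally requested from malloc. */
void debug_free(void *p, size_t n)
{
    if (p == NULL)
        return;             /* mirror free(NULL): do nothing */
    poison_block(p, n);
    free(p);
}
```

Even with such a wrapper, the allocator may still overwrite part of the pattern with its own bookkeeping once free() runs, so the pattern is a debugging hint rather than a guarantee.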

Toby Speight
AnT stands with Russia
  • This answer is correct, and if 'free' will zero or not is undefined. For example, on windows 'malloc' and 'free' in many compiler runtimes have been, and sometimes still are, implemented directly with Windows API Heap functions which will zero returned user memory as part of OS policy, and many debug runtimes will clear, fill, or otherwise mark freed memory for testing for bad code. – Beeeaaar Jun 07 '15 at 00:19
  • 10
    `0xDEADBEEF` is more common. `0xDEADBABE` is not welcoming to females of the industry. – Lightness Races in Orbit Jun 07 '15 at 02:10
  • @browning0: While I agree that those memory-filling debug techniques do not always help, I have actually found them more likely to be helpful than you have. Not only are they reasonably obvious when you see them in the debugger window, but because they touch every word of memory in the block, they also make it trivial to set a watchpoint that stops exactly when that block of memory gets freed. I have found the ability to do this remarkably effective at pinpointing the error, rather than only getting to see the first time my code clobbers the data with an invalid pointer. – Cort Ammon Jun 07 '15 at 17:10
  • @Celess: When we look at this from the C standard library end, heap memory management is usually quite multi-layered. Which is why it is not possible to describe the behavior in terms of a single API, like an OS policy. – AnT stands with Russia Jun 07 '15 at 17:19
  • @AnT I'm not sure I was trying to describe zero behavior as a whole in the terms of any single environment, if that is what you meant. I was instead pointing out that as a practical matter the behavior was undefined and in fact could easily end up being zeroed out, and was saying this in support of the above answer. I'm pretty sure the result is undefined in any of the specs as well. Also, not all runtime implementations are very "multilayered" and many are just a thin shim for the underlying OS API which I was using as an example. – Beeeaaar Jun 08 '15 at 05:24
18

free() does not zero memory as a general rule. It simply releases it for reuse by a future call to malloc(). Certain implementations may fill the memory with known values, but that is purely an implementation detail of the library.

Microsoft's runtime makes good use of marking freed and allocated memory with useful values (see "In Visual Studio C++, what are the memory allocation representations?" for more information). I have also seen it filled with values that, when executed, would cause a well-defined trap.

D.Shawley
16

is there a better explanation?

There is. Dereferencing a pointer after it has been free()d results in undefined behavior, so the implementation has the permission to do anything it pleases, including the act of tricking you into believing that the memory region has been filled with zeroes.

  • Not stricto sensu only the compiler, but the implementation (including the compiler and the standard C library) – Basile Starynkevitch Jun 06 '15 at 13:56
  • Is it a documented behaviour of GCC or Clang? Can't find anything in documentations (don't even know what should I search for). What if, for debugging purpose, I want to fill memory with garbage before freeing it? References to free()d memory, filled with garbage, could be tracked more easily that way. – browning0 Jun 06 '15 at 13:58
  • @BasileStarynkevitch you are right, changed words to reflect this fact. – The Paramagnetic Croissant Jun 06 '15 at 14:00
  • 1
    @browning0: you mustn't reference free(d) memory. Undefined behaviour means the program can stop with a seg fault. – chqrlie Jun 06 '15 at 14:00
  • 2
    @browning0 You may want to use tools such as [Valgrind](http://valgrind.org/) or [AddressSanitizer](https://en.wikipedia.org/wiki/AddressSanitizer). However, those are not capable of detecting all possible corruptions. If you are lucky your program will segfault; if you are not, it will turn into hard-to-track [Heisenbugs](https://en.wikipedia.org/wiki/Heisenbug). – Maciej Piechotka Jun 06 '15 at 20:37
  • @Michael Kjörling: I agree anything is possible. But statistically speaking, odds are high nothing will happen, there is a good chance for a segmentation fault, especially if the freed block was large, there is a very small chance it would invoke a tax audit and a vanishingly small chance one could avoid the latter via time travel. – chqrlie Jun 07 '15 at 13:49
  • @chqrlie you would be surprised how some popular aggressively-optimizing compilers do really unintuitive things with your UB-containing code… – The Paramagnetic Croissant Jun 07 '15 at 18:02
9

There is another pitfall here that you might not have known about:

free(pointer);

printf("After free(): %p \n", pointer); 

Even just reading the value of pointer after you free it is undefined behaviour, because the pointer becomes indeterminate.

Of course, dereferencing the freed pointer, as in the example below, is also not allowed:

free(pointer);

printf("After free(): %p, %d\n", pointer, *pointer);

P.S. In general, when printing an address with %p (as in printf), cast it to (void *), e.g. (void *)pointer; otherwise you also get undefined behaviour.
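For comparison, a version of the question's experiment that stays within defined behaviour might look like the sketch below. The function names `describe_int` and `report_then_free` are made up for illustration. The pointer is cast to `void *` for `%p`, everything is formatted while the block is still alive, and nothing reads the pointer or the block after free().

```c
#include <stdio.h>
#include <stdlib.h>

/* Format the address and value of a live int into out.
   Returns whatever snprintf returns. */
int describe_int(char *out, size_t out_size, int *p)
{
    /* %p expects a void *, so the cast is explicit */
    return snprintf(out, out_size, "%p, %d", (void *)p, *p);
}

/* Allocate an int, describe it, then free it.
   Returns the formatted length, or -1 if malloc fails. */
int report_then_free(char *out, size_t out_size, int value)
{
    int *pointer = malloc(sizeof *pointer);
    if (pointer == NULL)
        return -1;

    *pointer = value;
    int written = describe_int(out, out_size, pointer);

    free(pointer);
    pointer = NULL;     /* the old value is indeterminate now; replace it */

    return written;
}
```

A caller could pass a `char buf[64]` and print the result with `puts(buf)`; the address part of the text is implementation-defined.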

Giorgi Moniava
  • 2
    Reading a pointer value is NOT undefined behavior, only dereferencing is. `free` is just a function, the C language does not provide the capability to change parameters passed by value. – harper Jun 06 '15 at 17:00
  • @harper: It is undefined please see the answer in the question I linked. – Giorgi Moniava Jun 06 '15 at 21:02
  • 3
    @harper: Even though the value of the pointer doesn't change, `free()` changes that value from being valid to being indeterminate. – Keith Thompson Jun 06 '15 at 21:31
  • 1
    It's worth mentioning that `printf("%p\n", pointer);` has undefined behavior, since `%p` requires an argument of type `void*`, and `pointer` is of type `int*`. That's why the cast to `void*` is recommended. (In most implementations, `void*` and `int*` have the same representation and are passed as arguments in the same way, but that's not guaranteed.) – Keith Thompson Jun 06 '15 at 21:33
  • @KeithThompson: Indeed I have mentioned that in the end of answer, too – Giorgi Moniava Jun 06 '15 at 21:45
  • 1
    You suggested the cast. You didn't mention that omitting it causes undefined behavior. – Keith Thompson Jun 06 '15 at 21:53
  • @KeithThompson: didn't want to overwhelm the post, I didn't mention that also with dereferencing. Will put small update now – Giorgi Moniava Jun 06 '15 at 21:58
  • @KeithThompson If free() is changing the value, are all bits of the value affected? Probably you want to give the same value a different meaning, but you can still compare it with NULL or any other pointer value to make any decision. Whether that is useful depends on the logic in the function that calls free(). You could even wrap free() with a function foo(); does calling foo() change the value parameter? – harper Jun 09 '15 at 07:25
  • @harper: I didn't say `free()` changes the value. What I said was that the same value that's valid before the `free()` becomes indeterminate after the `free()`. This: `free(ptr); if (ptr == NULL) ...` has undefined behavior. – Keith Thompson Jun 09 '15 at 07:35
  • @KeithThomsom changes that value from – harper Jun 09 '15 at 07:37
  • @KeithThompson: Given `union { void *p; char b[sizeof (void*)]; } u;`, can a call to `free(u.p);` affect the contents of `u.b[]`? The effect of using `u.p` as an rvalue may be arbitrarily changed by `free(u.p)`, but I don't think the bitwise representation can be. Is my understanding correct? – supercat Jun 24 '15 at 20:56
  • @supercat: Your understanding matches mine. I'm not sure that the C committee agrees with us. There was a controversial DR (Defect Report) that said something about allowing bits to change behind the scenes in some cases; the counterargument is that the elements of `b` are objects that retain their last-stored value (N1570 6.2.4p2). I don't have the details of the DR handy. – Keith Thompson Jun 24 '15 at 23:11
  • @supercat: Ah, here it is: [DR #260](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm), submitted by Clive D.W. Feather in 2001. – Keith Thompson Jun 24 '15 at 23:17
  • @KeithThompson: The issues in that defect report go even deeper than I'd realized, though the printf/scanf program is like one I've posted elsewhere (wonder if that's coincidence, or my posting helped spread the idea)? If one were to extend the "C virtual machine" to say that each storage location of type `unsigned char` may *either* hold a number 0-UCHAR_MAX, or a Magical Mystical Entity which is implicitly but non-necessarily-reversibly convertible to a value from 0..UCHAR_MAX, and specify that assembling a pointer from `unsigned char` values will only work if... – supercat Jun 26 '15 at 17:07
  • ...the values used are MMEs obtained from another pointer, then it would be possible (if not practical) to design a 100% standards-conforming garbage collector by e.g. making the backing storage for each `char` be large enough to hold either an address or a value in the range 0..UCHAR_MAX *and distinguish which it holds*. Even better, though, might be to recognize that the kinds of embedded and systems programming *for which C was designed in the first place* have different semantic requirements than e.g. large-scale numerical applications programming, and to... – supercat Jun 26 '15 at 17:15
  • ...recognize that no single language will be able to meet the conflicting requirements optimally *unless it provides a means of specifying what level of semantic guarantees a given program or portion thereof will need*. – supercat Jun 26 '15 at 17:27
  • Actually the `free(x)` setting `x` to indeterminate value is beneficial - it allows one to use transparent garbage collection - let `free(x)` just set `x` to null... – Antti Haapala -- Слава Україні Aug 15 '17 at 17:22
9

Is free() zeroing out memory?

No. The glibc malloc implementation may, however, overwrite the beginning of the former user data (up to four times the size of a pointer) with internal housekeeping data.

The details:

The following is the malloc_chunk structure of glibc (see here):

struct malloc_chunk {

  INTERNAL_SIZE_T      prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      size;       /* Size in bytes, including overhead. */

  struct malloc_chunk* fd;         /* double links -- used only if free. */
  struct malloc_chunk* bk;

  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

The memory region for user data in an allocated chunk begins right after the size entry. After free is called, the space that held the user data may be used to link the chunk into lists of free chunks, so up to the first 4 * sizeof(struct malloc_chunk *) bytes of the former user data are probably overwritten. That is why a value other than the original user data is printed. Reading it is still undefined behaviour. If the allocated block is larger, there could be a segmentation fault instead.
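To make the overlap concrete, the offsets can be computed from a replica of the struct quoted above. This is only an illustrative sketch: `chunk_sketch` and the two helper functions are invented names, it is not glibc's real header, and `INTERNAL_SIZE_T` is assumed to be `size_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative replica of the quoted struct, not glibc's own definition.
   INTERNAL_SIZE_T is assumed to be size_t here. */
struct chunk_sketch {
    size_t prev_size;
    size_t size;
    struct chunk_sketch *fd;           /* user data starts here while allocated */
    struct chunk_sketch *bk;
    struct chunk_sketch *fd_nextsize;
    struct chunk_sketch *bk_nextsize;
};

/* The sketch assumes no padding between the two size fields and fd. */
static_assert(offsetof(struct chunk_sketch, fd) == 2 * sizeof(size_t),
              "unexpected padding before fd");

/* Offset inside the chunk at which the user data begins. */
size_t user_data_offset(void)
{
    return offsetof(struct chunk_sketch, fd);
}

/* Upper bound on how many bytes of former user data the link fields cover. */
size_t link_area_bytes(void)
{
    return sizeof(struct chunk_sketch) - offsetof(struct chunk_sketch, fd);
}
```

On a typical 64-bit Linux system with 8-byte `size_t` and pointers, this gives an offset of 16 and a link area of 32 bytes. The `int` from the question sits at the very start of that area, on top of `fd`, so if the allocator happens to store a null link there, reading it back as an `int` would plausibly show 0.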

4566976
  • I don't see how this answers the question. – edmz Jun 06 '15 at 14:16
  • @black, `fd`, `bk`, `fd_nextsize`, and `bk_nextsize` are used for maintaining the free list, and are only needed if a memory block hasn't been assigned to the program for use. To increase memory efficiency, when you call `malloc()`, the pointer returned is the address of `fd` -- the user-usable part of the memory block overlaps with the memory control structure. When you call `free()`, the first few bytes of the memory block get overwritten with bookkeeping data, and in the specific case of the question, this gives the appearance of zeroing the memory. – Mark Jun 06 '15 at 20:01
5

As others pointed out, you are not allowed to do anything with a freed pointer (otherwise you get the dreaded undefined behavior, which you should always avoid; see this).

In practice, I recommend never coding simply

free(ptr);

but always coding

free(ptr), ptr=NULL;

(since practically speaking this helps a lot to catch some bugs, except double frees)

If ptr is not used after that, the compiler can optimize the code by skipping the assignment of NULL.
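If typing both statements every time feels error-prone, one common way to package them is a small macro. The sketch below uses a made-up name, `FREE_AND_NULL`, and a made-up demo function; the macro evaluates its argument twice, so the argument should be a plain pointer variable without side effects.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block and null the pointer variable.
   ptr must be a modifiable lvalue; it is evaluated twice. */
#define FREE_AND_NULL(ptr) \
    do {                   \
        free(ptr);         \
        (ptr) = NULL;      \
    } while (0)

/* Returns 1 if the pointer is null after release, -1 if malloc fails. */
int release_demo(void)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    *p = 42;
    FREE_AND_NULL(p);
    FREE_AND_NULL(p);   /* now just free(NULL), which does nothing */

    return p == NULL;
}
```

The second call in the demo shows the trade-off: an accidental double release through the same variable becomes harmless, which also means it goes unnoticed. Other copies of the pointer are not affected and still dangle.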

In practice, the compiler knows about free and malloc (the standard C library headers probably declare these functions with appropriate function attributes, which both GCC and Clang/LLVM understand), so it might be able to optimize the code according to the standard specification of malloc and free. The implementation of malloc and free, however, is usually provided by your C standard library (very often GNU glibc or musl-libc on Linux), so the actual behavior comes from your libc, not from the compiler itself. Read the appropriate documentation, notably the free(3) man page.

BTW, on Linux, both glibc and musl-libc are free software, so you can study their source code to understand their behavior. They sometimes obtain virtual memory from the kernel using a system call like mmap(2) (and later release the memory back to the kernel using munmap(2)), but they generally try to reuse previously freed memory for future mallocs.

In practice, free could munmap your memory (notably for big malloc-ed zones), and then you'll get a SIGSEGV if you later dare to dereference that freed pointer. Often, though (notably for small memory zones), it simply arranges for that zone to be reused later. The exact behavior is implementation specific. Usually free does not clear or write the just-freed zone.

You are even allowed to redefine (i.e. re-implement) your own malloc and free, perhaps by linking a special library such as libtcmalloc, provided your implementation has a behavior compatible with what the C99 or C11 standard says.

On Linux, disable memory overcommit and use valgrind. Compile with gcc -Wall -Wextra (and probably -g when debugging). You might also consider passing -fsanitize=address to a recent gcc or clang, at least to hunt down some naughty bugs.

BTW, sometimes Boehm's conservative garbage collector might be useful; you'll use (in your whole program) GC_MALLOC instead of malloc and you won't care about free-ing memory.

Basile Starynkevitch
  • " the actual behavior is provided by your libc" not entirely true because, when `printf`ing after `free`, the compiler is allowed to treat this UB as it will. – edmz Jun 06 '15 at 14:18
  • This is what I told when saying that the compiler might be able to optimize the code – Basile Starynkevitch Jun 06 '15 at 14:22
  • I wasn't referring to optimizations; rather, it could -although weird- assign `0` to that location. – edmz Jun 06 '15 at 14:35
  • @BasileStarynkevitch Wouldn't setting the pointer to NULL after every free() call hide double-free errors, which could reveal some logic inconsistency in the application? So one can both gain and lose by setting the pointer to NULL, am I right? – browning0 Jun 06 '15 at 15:12
  • 1
    @browning0: if you free pointer which has been assigned NULL, this is defined behaviour. – Giorgi Moniava Jun 06 '15 at 15:17
  • @giorgi: Yes, that's right, but it's not an answer for my questions. Double freeing memory error may expose some bigger flaw in program. Setting pointer to NULL after each free() prevents double free errors, therefore hiding some useful debugging info, isn't that right? – browning0 Jun 06 '15 at 15:32
  • @browning0: I believe here http://stackoverflow.com/questions/1025589/setting-variable-to-null-after-free, you may find something relevant – Giorgi Moniava Jun 06 '15 at 18:34
2

free() can sometimes return memory to the operating system and make the process smaller. Usually, though, all it does is allow a later call to malloc to reuse the space. In the meantime, the space remains in your program as part of a free list used internally by malloc.
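A toy fixed-size pool shows what "part of a free list" can mean in practice. This is a deliberately simplified sketch and not how glibc or any real malloc works; `pool_init`, `pool_alloc`, `pool_free`, and `union slot` are names invented for the example. While a slot is free, its first bytes hold the link to the next free slot, so releasing a slot overwrites the start of whatever the user stored there.

```c
#include <stddef.h>

#define POOL_SLOTS 4

/* A slot is either user data or, while free, a link in the free list.
   Both views share the same storage. */
union slot {
    union slot *next_free;              /* meaningful only while free */
    unsigned char user[sizeof(void *)]; /* meaningful only while allocated */
};

static union slot pool[POOL_SLOTS];
static union slot *free_head;

/* Put every slot on the free list. */
void pool_init(void)
{
    free_head = NULL;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        pool[i].next_free = free_head;
        free_head = &pool[i];
    }
}

/* Pop a slot off the free list, or return NULL if none is left. */
void *pool_alloc(void)
{
    union slot *s = free_head;
    if (s == NULL)
        return NULL;
    free_head = s->next_free;
    return s->user;
}

/* Push the slot back. The link is written over the former user data. */
void pool_free(void *p)
{
    if (p == NULL)
        return;
    union slot *s = p;          /* user sits at offset 0 of the slot */
    s->next_free = free_head;
    free_head = s;
}
```

If all four slots are handed out and one is then released, the free list was empty at that moment, so the link written into the slot is a null pointer. On platforms where a null pointer is all zero bits, the first bytes of the released slot then read as zero, which resembles what the question observed. Nothing here leaves the process; the slot just waits for the next `pool_alloc`.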

Amol Saindane