72

I'm trying to figure out what would happen if I tried to free a pointer "from the middle" of a block. For example, look at the following code:

char *ptr = (char*)malloc(10*sizeof(char));

for (char i=0 ; i<10 ; ++i)
{
    ptr[i] = i+10;
}
++ptr;
++ptr;
++ptr;
++ptr;
free(ptr);

I get a crash with an "Unhandled exception" error message. I want to understand why, and how free works, so that I not only know how to use it but can also understand weird errors and exceptions and better debug my code.

Thanks a lot

Vijay Mathew
user238082
  • 1
    There is no singular "How does it work" because it's implementation specific. – GManNickG Dec 24 '09 at 07:00
  • 9
    Careful, @GMan, there's a real difference between implementation-defined (meaning the implementation must document it and act in accordance with that) and undefined (which means anything can happen, up to and including monkeys flying out of your butt). :-) – paxdiablo Dec 24 '09 at 07:05
  • 1
    I meant "How does free() work", not "What does my code do?" I was answering the title question. – GManNickG Dec 24 '09 at 07:16
  • Apologies, misunderstood the response. – paxdiablo Dec 24 '09 at 07:28
  • 1
    Perhaps you'd get the people with the incessant UB questions to listen better if you mentioned that the monkeys could **fly in** instead of just flying out.. ;-) – R.. GitHub STOP HELPING ICE Feb 25 '11 at 18:59
  • 1
    possible duplicate of [How do malloc() and free() work?](http://stackoverflow.com/questions/1119134/how-do-malloc-and-free-work) – fredoverflow Sep 29 '12 at 08:25
  • Possible duplicate of [How do malloc() and free() work?](https://stackoverflow.com/questions/1119134/how-do-malloc-and-free-work) – S.S. Anne Oct 20 '19 at 22:04

8 Answers

127

When you malloc a block, the allocator actually reserves a bit more memory than you asked for. This extra memory stores information such as the size of the allocated block, a link to the next free/used block in a chain of blocks, and sometimes some "guard data" that helps the system detect writes past the end of your allocated block. Also, most allocators round the total size and/or the start of your part of the memory up to a multiple of some number of bytes (e.g. on a 64-bit system it may align the data to a multiple of 64 bits (8 bytes), as accessing data from non-aligned addresses can be more difficult and inefficient for the processor/bus), so you may also end up with some "padding" (unused bytes).
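To make the rounding concrete, here is a minimal sketch of how an allocator might compute the real size of a block. The header size (one size_t), the 16-byte alignment, and the name padded_block_size are all assumptions chosen for illustration; real allocators pick their own values.

```c
#include <stddef.h>

/* Hypothetical values: a real allocator chooses its own header layout and alignment. */
#define HEADER_SIZE sizeof(size_t)
#define ALIGNMENT   ((size_t)16)

/* Total bytes reserved for a request: header plus payload, rounded up to ALIGNMENT. */
size_t padded_block_size(size_t requested)
{
    size_t raw = HEADER_SIZE + requested;
    return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}
```

With an 8-byte size_t, padded_block_size(10) is 32: 8 bytes of header, the 10 bytes you asked for, and 14 bytes of padding.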

When you free your pointer, free() uses that address to find the special information the allocator added at the beginning (usually) of your allocated block. If you pass in a different address, it will read memory that contains garbage, so its behaviour is undefined (most often the result is a crash).
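The idea can be sketched as a thin wrapper around malloc and free that keeps its own header in front of the bytes it hands out. The names toy_header, toy_malloc, toy_block_size and toy_free are hypothetical, and the layout is an assumption for illustration only (it needs C11 for max_align_t); it is not how any particular library does it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header; real allocators store more and lay it out differently. */
typedef union {
    size_t size;          /* bytes the caller asked for */
    max_align_t align;    /* keeps the payload suitably aligned */
} toy_header;

void *toy_malloc(size_t size)
{
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;         /* the caller sees the bytes just after the header */
}

/* Steps back from the user pointer to read the size recorded by toy_malloc. */
size_t toy_block_size(const void *p)
{
    return ((const toy_header *)p - 1)->size;
}

void toy_free(void *p)
{
    if (p != NULL)
        free((toy_header *)p - 1);  /* only correct for the exact pointer toy_malloc returned */
}
```

If you passed p + 4 to toy_free, the step back would land in the middle of your own data instead of on the header, which is exactly the situation in the question.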

Later, if you free() the block but don't "forget" your pointer, you may accidentally try to access data through that pointer in the future, and the behaviour is undefined. Any of the following situations might occur:

  • the memory might be put in a list of free blocks, so when you access it, it still happens to contain the data you left there, and your code runs normally.
  • the memory allocator may have given (part of) the memory to another part of your program, and that will presumably have then overwritten (some of) your old data, so when you read it, you'll get garbage which might cause unexpected behaviour or crashes from your code. Or you will write over the other data, causing the other part of your program to behave strangely at some point in the future.
  • the memory could have been returned to the operating system (a "page" of memory that you're no longer using can be removed from your address space, so there is no longer any memory available at that address - essentially an unused "hole" in your application's memory). When your application tries to access the data a hard memory fault will occur and kill your process.

This is why it is important to make sure you don't use a pointer after freeing the memory it points at. The best practice is to set the pointer to NULL after freeing the memory, because you can easily test for NULL, and attempting to access memory via a NULL pointer fails in a bad but consistent way on most platforms (typically an immediate crash), which is much easier to debug.
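One way to make that habit hard to forget is a small macro; FREE_AND_CLEAR is a hypothetical name, not a standard facility. Note that it evaluates its argument twice, so it should only be given a plain pointer variable.

```c
#include <stdlib.h>

/* Hypothetical helper: frees p, then sets the caller's variable to NULL.
   free(NULL) is defined to do nothing, so using it twice is harmless. */
#define FREE_AND_CLEAR(p) \
    do {                  \
        free(p);          \
        (p) = NULL;       \
    } while (0)
```

This only clears the one variable you name. Any other copies of the pointer still refer to freed memory and must not be used.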

Jason Williams
  • Nice explanation, however it still doesn't explain how free() actually works. You are essentially saying nothing more than "The C library function void free(void *ptr) deallocates the memory previously allocated by a call to calloc, malloc, or realloc." – Tom Charles Zhang Jan 27 '23 at 03:57
  • For instance, I think it's worth mentioning inside operating system there is some kind of "memory management unit" that will keep track of allocated and free memory blocks. When certain blocks are deallocated by the host program, the operating system is free to allocate such memory blocks to other programs. – Tom Charles Zhang Jan 27 '23 at 04:04
  • The part with "essentially an unused "hole" in your application's memory" is also a bit misleading/ambiguous in meaning without fully explaining what is memory space (or what it looks like) for a particular process. – Tom Charles Zhang Jan 27 '23 at 04:07
  • 1
    @TomCharlesZhang: My answer is generalised, because the way that alloc/free interact with the OS is an implementation detail that is specific to the combination of C++ variant, host OS, and hardware (CPU/memory) architecture that the code is targeting. – Jason Williams Jan 29 '23 at 23:12
29

You probably know that you are supposed to pass back exactly the pointer you received.

Because free() does not at first know how big your block is, it needs auxiliary information in order to identify the original block from its address and then return it to a free list. It will also try to merge small freed blocks with neighbors in order to produce a more valuable large free block.

Ultimately, the allocator must have metadata about your block; at a minimum, it needs to have stored the length somewhere.

I will describe three ways to do this.

  • One obvious place would be to store it just before the returned pointer. It could allocate a block that is a few bytes larger than requested, store the size in the first word, then return to you a pointer to the second word.

  • Another way would be to keep a separate map describing at least the length of allocated blocks, using the address as a key.

  • An implementation could derive some information from the address and some from a map. The 4.3BSD kernel allocator (called, I think, the "McKusick-Karel allocator") makes power-of-two allocations for objects of less than page size and keeps only a per-page size, making all allocations from a given page of a single size.
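The arithmetic behind the third approach can be sketched as follows. This is not the BSD code: the 4096-byte page, the 16-byte minimum class, and the names size_class and slot_start are assumptions made up for illustration.

```c
#include <stddef.h>

/* Hypothetical parameters, not taken from any real allocator. */
#define TOY_PAGE_SIZE ((size_t)4096)
#define TOY_MIN_CLASS ((size_t)16)

/* Power-of-two size class for a request, or 0 if it is too big for this scheme. */
size_t size_class(size_t n)
{
    if (n > TOY_PAGE_SIZE)
        return 0;
    size_t c = TOY_MIN_CLASS;
    while (c < n)
        c <<= 1;
    return c;
}

/* In a page whose slots are all cls bytes, round an offset down to its slot start. */
size_t slot_start(size_t offset_in_page, size_t cls)
{
    return offset_in_page / cls * cls;
}
```

Because every slot in a page has the same size, the allocator only needs one recorded size per page, and any offset inside the page can be mapped back to the slot that contains it.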

It would be possible with some variants of the second type of allocator, and probably with any variant of the third, to actually detect that you have advanced the pointer and do the right thing (DTRT), although I doubt any implementation would burn the runtime to do so.
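As a rough sketch of the second approach, the wrapper below keeps a separate table of live blocks and uses it to find the block containing an arbitrary address. The names table_malloc, find_block and table_free and the fixed 64-entry table are hypothetical. It also assumes that comparing addresses as uintptr_t values is meaningful, which holds on common flat-memory platforms but is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLOCKS 64   /* hypothetical fixed capacity, to keep the sketch short */

static struct { char *start; size_t size; } blocks[MAX_BLOCKS];

void *table_malloc(size_t size)
{
    for (int i = 0; i < MAX_BLOCKS; ++i) {
        if (blocks[i].start == NULL) {
            char *p = malloc(size);
            if (p != NULL) {
                blocks[i].start = p;
                blocks[i].size = size;
            }
            return p;
        }
    }
    return NULL;        /* table full */
}

/* Index of the live block that contains p, or -1 if there is none. */
static int find_block(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (int i = 0; i < MAX_BLOCKS; ++i) {
        uintptr_t s = (uintptr_t)blocks[i].start;
        if (blocks[i].start != NULL && a >= s && a - s < blocks[i].size)
            return i;
    }
    return -1;
}

/* Frees the block containing p, even if p points into its middle.
   Returns 0 on success, -1 if p is not inside any tracked block. */
int table_free(void *p)
{
    int i = find_block(p);
    if (i < 0)
        return -1;
    free(blocks[i].start);
    blocks[i].start = NULL;
    blocks[i].size = 0;
    return 0;
}
```

The linear scan on every call is the kind of runtime cost mentioned above, which is why a production free() is unlikely to work this way.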

DigitalRoss
15

Most (if not all) implementations will look up the amount of data to free a few bytes before the actual pointer you are manipulating. Doing a wild free will lead to memory map corruption.

In your example, when you allocate 10 bytes of memory, the system actually reserves, let's say, 14. The first 4 contain the amount of data you requested (10), and the return value of malloc is a pointer to the first byte of unused data in the 14 allocated.

When you call free on this pointer, the system will look 4 bytes backwards to learn that it originally allocated 14 bytes, so that it knows how much to free. This scheme spares you from having to pass the amount of data to free as an extra parameter to free itself.

Of course, other implementations of malloc/free can choose other ways to achieve this. But they generally don't support calling free on a pointer other than the one returned by malloc or an equivalent function.

Zeograd
  • Supposed I have char s[3] = {a,b,c}. Why s == 'a' ?? – onmyway133 Feb 20 '13 at 11:47
  • 1
    in this particular case, there isn't any dynamic allocation involved. The compiler is allocating the 3 needed bytes on the stack and not on the heap. You don't have to (and shouldn't !) call free(s) – Zeograd Mar 14 '13 at 21:21
  • you say "the return value of the malloc is a pointer to the first byte of unused data in the 14 allocated", but then you say "lookup 4 bytes backward" !!?? And, is it documented somewhere ? – onmyway133 Mar 15 '13 at 02:59
  • 1
    This information depends on the malloc implementation you use and the documentation is generally only found as comment in the source code. For instance, in the GNU libc implementation, you can find this comment : Minimum overhead per allocated chunk: 4 or 8 bytes Each malloced chunk has a hidden word of overhead holding size and status information. – Zeograd Mar 20 '13 at 11:05
  • @onmyway133, also, s is a pointer to the first array element, it can be equal to 'a' character only by accident. – Eugene Shatsky Mar 21 '19 at 14:39
8

From http://opengroup.org/onlinepubs/007908775/xsh/free.html

The free() function causes the space pointed to by ptr to be deallocated; that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by the calloc(), malloc(), realloc() or valloc() function, or if the space is deallocated by a call to free() or realloc(), the behaviour is undefined. Any use of a pointer that refers to freed space causes undefined behaviour.

PetrosB
  • A link with no explanation isn't really an answer. – GManNickG Dec 24 '09 at 07:00
  • 1
    Why!? I've seen many times just a link being the accepted answer! – PetrosB Dec 24 '09 at 07:20
  • 8
    The problems with links, @Petros, and others may disagree with me (good chance seeing that there's 120,000-odd of us), is that they may disappear (yes, even things like Wikipedia). I don't mind links themselves but there should be enough meat in the answer so that, even if the rest of the internet was destroyed, SO could still be useful. What I tend to do is explain enough to answer the question then put in any links for those that want to go further. – paxdiablo Dec 24 '09 at 07:32
  • Realistically speaking, I don't think that Open Group's site will go anywhere. Also, the answer was edited and a self-explanatory quoted text which could be the answer to the OP's question was added. – PetrosB Dec 24 '09 at 08:40
7

That's undefined behaviour - don't do it. Only free() pointers obtained from malloc() (or calloc()/realloc()), and never adjust them before doing so.

The problem is free() must be very fast, so it doesn't try to find the allocation your adjusted address belongs to, but instead tries to return the block at exactly the adjusted address to the heap. That leads to undefined behaviour - usually heap corruption or crashing the program.

sharptooth
  • 1
    I would not classify this as just an issue of being fast. Without extensive bookkeeping information that could also cost a lot in terms of memory or impose a particular[ly bad] design, finding the start of an allocated block given a random pointer inside it is simply not possible. – R.. GitHub STOP HELPING ICE Feb 25 '11 at 19:04
  • @R.. 'inding the start of an allocated block given a random pointer inside it is simply not possible.' I do not think so.. – Koray Tugay Jun 04 '15 at 11:31
6

You're freeing the wrong address. By changing the value of ptr, you change the address. free has no way of knowing that it should try to free a block starting 4 bytes back. Keep the original pointer intact and free that instead of the manipulated one. As others pointed out, the results of doing what you're doing are "undefined"... hence the unhandled exception.
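A minimal sketch of that fix, applied to the code from the question: advance a second pointer and leave the one malloc returned alone. The function name fill_and_free and the variable cursor are made up for this example.

```c
#include <stdlib.h>

/* Fills a 10-byte buffer as in the question, walks a copy of the pointer
   4 bytes in, and frees through the original pointer.
   Returns the byte found at that position, or -1 if allocation failed. */
int fill_and_free(void)
{
    char *ptr = malloc(10 * sizeof *ptr);
    if (ptr == NULL)
        return -1;

    for (int i = 0; i < 10; ++i)
        ptr[i] = (char)(i + 10);

    char *cursor = ptr;   /* advance a copy, not the pointer malloc returned */
    cursor += 4;
    int value = *cursor;

    free(ptr);            /* the same address malloc returned */
    return value;
}
```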

Jason D
3

Never do this.

You're freeing the wrong address. By changing the value of ptr, you change the address. free has no way of knowing that it should try to free a block starting 4 bytes back. Keep the original pointer intact and free that instead of the manipulated one. As others pointed out, the results of doing what you're doing are "undefined"... hence the unhandled exception.

Jeet
2

Taken from the book: Understanding and Using C Pointers

When memory is allocated, additional information is stored as part of a data structure maintained by the heap manager. This information includes, among other things, the block’s size, and is typically placed immediately adjacent to the allocated block.

Koray Tugay