How do userspace programs pass memory back to the kernel after free()?

Question

I've been reading a lot about memory allocation on the heap and how certain heap management allocators do it.

Say I have the following program:

#include<stdlib.h>
#include<stdio.h>
#include<unistd.h>

int main(int argc, char *argv[]) {
    // allocate 4 gigabytes of RAM
    void *much_mems = malloc(4294967296);
    // sleep for 10 minutes
    sleep(600);
    // free teh ram
    free(*much_mems);
    // sleep some moar
    sleep(600);

    return 0;
}

Let's say for sake of argument that my compiler doesn't optimize out anything above, that I can actually allocate 4GiB of RAM, that the malloc() call returns an actual pointer and not NULL, that size_t can hold an integer as big as 4294967296 on my given platform, that the allocater implemented by the malloc call actually does allocate that amount of RAM in the heap. Pretend that the above code does exactly what it looks like it will do.

After the call to free executes, how does the kernel know that those 4 GiB of RAM are now eligible for use for other processes and for the kernel itself? I'm not assuming the kernel is Linux, but that would be a good example. Before the call to free, this process has a heap size of at least 4GiB, and afterward, does it still have that heap size?

How do modern operating systems allow userspace programs to return memory back to kernel space? Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available? And is it possible that my 4 GiB allocation will be non-contiguous?

Kaz · Answer 1 · 2016-03-31T03:17:18.557

1

On GNU/Linux with Glibc, large memory allocations, of more than a few hundred kilobytes, are handled by calling mmap. When the free function is invoked on this, the library knows that the memory was allocated this way (thanks to meta-data stored in a header). It simply calls unmap on it to release it. That's how the kernel knows; its mmap and unmap API is being used.

You can see these calls if you run strace on the program.

The kernel keeps track of all mmap-ed regions using a red-black tree. Given an arbitrary virtual address, it can quickly determine whether it lands in the mmap area, and which mapping, by performing a tree walk.

edited Mar 31 '16 at 03:17

answered Mar 31 '16 at 03:14

Kaz

55,781
9
100
149

FWIW the Windows heap manager does the equivalent with allocations over 512KB. – Robin Caron Mar 31 '16 at 03:15
@RobinCaron You mean the HeapCreate, HeapAlloc stuff? That's based on top of VirtualAlloc, I think. – Kaz Mar 31 '16 at 04:36
Yes on Windows the heap manager is implemented using VirtualAlloc/VirtualFree. The kernel mode implementations of those functions uses (or used to use) splay trees to keep track of the mapping similar to what is being done in Linux with mmap. – Robin Caron Mar 31 '16 at 04:42

score 1 · Answer 2 · answered Mar 31 '16 at 03:14

1

Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available?

Yes.

A modern implementation of malloc on Linux will call mmap to allocate a large amount of memory. The kernel will find an unused virtual address, mark it as allocated, and return it. (The kernel may also return an error if there isn't enough free memory)

free would then call munmap to deallocate the memory, passing the address and size of the allocation.

On Windows, malloc will call VirtualAlloc and free will call VirtualFree.

answered Mar 31 '16 at 03:14

user253751

57,427
7
48
90

Is it possible that my allocation of 4GiB will be non-contiguous? – Naftuli Kay Mar 31 '16 at 03:16
@NaftuliTzviKay No. Unless you're talking about physical address space for some reason. But think about it - you can access the 3000000001st byte with `*(myAllocation + 3000000000)` - how would that work if it had a big gap in the middle? – user253751 Mar 31 '16 at 03:19
I might mean physical. I'm not certain how paging is implemented. Does the kernel pass me back a bunch of pages in random areas in memory which are just interpreted to be contiguous by the userspace process? – Naftuli Kay Mar 31 '16 at 03:21
@NaftuliTzviKay It would be contiguous. – Harry Mar 31 '16 at 03:28
@NaftuliTzviKay That would be an entirely separate question... the kernel does not "pass you pages", it passes you a memory address. It makes no promises about what happens when you access that memory address, except that you can store data there. Maybe it's not even in physical RAM - maybe the kernel is going to shuffle your data around between RAM and disk so that it's always in RAM when you access it! – user253751 Mar 31 '16 at 03:38
If you are asking what the kernel does then @immibis is correct, that's a separate question. Virtual memory allows you to use more memory than is physically present on the machine as if it was contiguous. See this... https://en.wikipedia.org/wiki/Virtual_memory This implies that behind the scene some magic is happening. Linux also uses lazy allocation etc. – Harry Mar 31 '16 at 03:42
There's also "compressed swap" - maybe your data *is* in RAM, but it's compressed, and the kernel will uncompress it when you try to access it. Or heck, it could be stored on another computer over a fast network. Or it might not be stored at all until you access it for the first time. Or all of the above. – user253751 Mar 31 '16 at 03:43

score 0 · Answer 3 · edited May 23 '17 at 12:19

If we're using Linux as an example it uses mmap to allocate large chunks of memory. This means when you free it it gets umapped ie the kernel gets told that it can now unmap this memory. Read up on the brk and sbrk system calls. A good place to start would be here...

What does brk( ) system call do?

and here. The following post discusses how malloc is implemented which will give you a good idea what's happening under the covers...

How is malloc() implemented internally?

Doug Lea's malloc can be found here. It's well commented and public domain...

ftp://g.oswego.edu/pub/misc/malloc.c

DevSolar · Accepted Answer · 2016-03-31T12:20:05.340

Before the call to free, this process has a heap size of at least 4GiB...

The C language does not define either "heap" or "stack". Before the call to free, this process has a chunk of 4 GB dynamically allocated memory...

and afterward, does it still have that heap size?

...and after the free(), access to that memory would be undefined behaviour, so for practical purposes, that dynamically allocated memory is no longer "there".

What the library does "under the hood" (e.g. caching, see below) is up to the library, and is subject to change without further notice. This could change with the amount of available physical memory, system load, runtime parameters, ...

How do modern operating systems allow userspace programs to return memory back to kernel space?

It's up to the standard library's implementation to decide (which, of course, has to talk to the operating system to actually, physically allocate / free memory).

Others have pointed out how certain, existing implementations do it. Other libraries, operating systems, and environments exist.

Do free implementations execute a syscall to the kernel (or many syscalls) to tell it which areas of memory are now available?

Possibly. A common optimization done by library implementations is to "cache" free()d memory, so subsequent malloc() calls can be served without talking to the kernel (which is a costly operation). When, how much, and how long memory is cached this way is, you guessed it, implementation-defined.

And is it possible that my 4 GiB allocation will be non-contiguous?

The process will always "see" contiguous memory. In a system supporting virtual memory (i.e. "modern" desktop OS's like Linux or Windows), the physical memory might be non-contiguous, but the virtual addresses your process gets to see will be contiguous (or the malloc() would have failed if this requirement could not be serviced).

Again, other systems exist. You might be looking at a system that doesn't virtualize addresses (i.e. gives physical addresses to the process). You might be looking at a system that assigns a given amount of memory to a process on startup, serves any malloc() requests from that, and doesn't support the allocation of additional memory. And so on.

score -1 · Answer 5 · answered Mar 31 '16 at 11:52

malloc() and free() are kernel functions (system calls) . it is being called by the application to allocate and free memory on the heap. application itself is not allocating/freeing memory .

the whole mechanism is executed at kernel level .

see the below heap implementation code

void *heap_alloc(uint32_t nbytes) {

    heap_header *p, *prev_p;   // used to keep track of the current unit
    unsigned int nunits;      // this is the number of "allocation units" needed by nbytes of memory

    nunits = (nbytes + sizeof(heap_header) - 1) / sizeof(heap_header) + 1;      // see how much we will need to allocate for this call

    // check to see if the list has been created yet; start it if not
    if ((prev_p = _heap_free) == NULL) {

        _heap_base.s.next = _heap_free = prev_p = &_heap_base;      // point at the base of the memory
        _heap_base.s.alloc_sz = 0;                              // and set it's allocation size to zero
    }

    // now enter a for loop to find a block fo memory
    for (p = prev_p->s.next;; prev_p = p, p = p->s.next) {

        // did we find a big enough block?
        if (p->s.alloc_sz >= nunits) {

            // the block is exact length
            if (p->s.alloc_sz == nunits)
                prev_p->s.next = p->s.next;

            // the block needs to be cut
            else {

                p->s.alloc_sz -= nunits;
                p += p->s.alloc_sz;
                p->s.alloc_sz = nunits;
            }

            _heap_free = prev_p;
            return (void *)(p + 1);
        }

        // not enough space!! Try to get more from the kernel
        if (p == _heap_free) {

            // if the kernel has no more memory, return error!
            if ((p = morecore()) == NULL)
                return NULL;
        }
    }
}

this heap_alloc function uses morecore function which is implemented as below :

heap_header *morecore() {

    char *cp;
    heap_header *up;

    cp = (char *)pmmngr_alloc_block();      // allocate more memory for the heap

    // if cp is null we have no memory left
    if (cp == NULL)
        return NULL;

    //vmmngr_mapPhysicalAddress(cp, (void *)_virt_addr);      // and map it's virtual address to it's physical address
    vmmngr_mapPhysicalAddress(vmmngr_get_directory(), _virt_addr, (uint32_t)cp, I86_PTE_PRESENT | I86_PTE_WRITABLE);
    _virt_addr += BLOCK_SIZE;      // tack on nu bytes to the virtual address; this will be our next allocation address

    up = (heap_header *)cp;
    up->s.alloc_sz = BLOCK_SIZE;
    heap_free((void *)(up + 1));

    return _heap_free;
}

as you can see this function is asking the physical memory manager to allocate a block :

cp = (char *)pmmngr_alloc_block();

and then map the allocated block into virtual memory :

vmmngr_mapPhysicalAddress(vmmngr_get_directory(), _virt_addr, (uint32_t)cp, I86_PTE_PRESENT | I86_PTE_WRITABLE);

as you can see , the whole story is being controlled by the heap manager in kernel level.

Sorry but the first line is just wrong. They may be *implemented* by *using* system calls, but they are not such in and of themselves. — DevSolar, Mar 31 '16 at 12:04

How do userspace programs pass memory back to the kernel after free()?

5 Answers5