free() returns memory to the OS

Question

My test code shows that heap memory is returned to the OS after free() and before the program exits. I use htop (top shows the same) to observe this behaviour. My glibc version, as reported by ldd, is (Ubuntu GLIBC 2.31-0ubuntu9.9) 2.31.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define BUFSIZE 10737418240 

int main(){
    printf("start\n");
    u_int32_t* p = (u_int32_t*)malloc(BUFSIZE);
    if (p == NULL){
        printf("alloc 10GB failed\n");
        exit(1);
    }
    memset(p, 0, BUFSIZE);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        p[i] = 10;
    }
    printf("before free\n");
    free(p);
    sleep(1000);
    printf("exit\n");
}

Why does the question "Why does the free() function not return memory to the operating system?" observe the opposite behaviour from mine? Its OP also uses Linux, and the question was asked in 2018. Am I missing something?

Whatever you may or may not have observed, you haven't posted the evidence here. — user207421, Aug 28 '22 at 11:56
@user207421 run `htop` on a terminal and run the program on another, then one could see the memory usage increase and then decrease. Stackoverflow don't like screenshots. — Rick, Aug 28 '22 at 11:58
So you say, but again where is the evidence? What memory usage did you observe before, during, and after? And how can you be sure you observed anything between `free()` and `exit()` when there is only 1 second between them? — user207421, Aug 28 '22 at 12:01
@user207421 My mem usage is `2gb` and after `malloc` it increases to ~`13gb` and decrease to ~`2gb` after `free()`. My mem usage doesn't matter. I've written down steps on how to reproduce it. People could adjust the `BUFSIZE` to reproduce it. Why my numbers even matters? — Rick, Aug 28 '22 at 12:06
Try adding a small 2nd allocation after the big one (which you don't free), and see if anything changes. Also use for example `strace` to observe what calls to the operating system your program actually does. — hyde, Aug 28 '22 at 12:09
I suggest it decreases to 2GB after `exit()`. You have produced no evidence to the contrary. You could try a much longer sleep, or a read from `stdin`, to be *sure* that you get a measurement between `free()` and `exit()`. You should not expect people to execute arbitrary code without a better reason than you've provided so far. — user207421, Aug 28 '22 at 12:11
@user207421 I think `sleep(1000)` means sleep for `1000` seconds? — Rick, Aug 28 '22 at 12:13
I expect that the difference is in the size of the allocation. If you allocate a megabyte of memory instead of 10 gigabytes, you may not see the usage drop. — Jonathan Leffler, Aug 28 '22 at 12:16
Note well that it is entirely an implementation detail of your C implementation whether and under what conditions `free`d memory is released back to the OS before the program exits. Generally speaking, you cannot rely on it being released, and you need to be aware of that. But isn't it nice that under some circumstances it actually is released? — John Bollinger, Aug 28 '22 at 13:53
@JonathanLeffler Yes you right. But why nobody gave some code sample and at last I did it myself . :P — Rick, Sep 14 '22 at 15:37
@JohnBollinger But it would be interesting to know how linux acts, which is my question. — Rick, Sep 14 '22 at 15:38

John Zwinck · Answer 1 · 2022-08-28T13:43:16.650

On Linux, glibc's malloc() treats allocations larger than MMAP_THRESHOLD differently. See "Why does malloc rely on mmap starting from a certain threshold?"

The question you linked, where allocations may not appear to be fully reclaimed immediately, uses small allocations which are sort of pooled together by malloc() and not instantly returned to the OS on each small deallocation (that would be slow). Your single huge allocation definitely goes via the mmap() path, and so is a totally independent allocation which will be fully and immediately reclaimed.

Think of it this way: if you ask someone to buy you eggs and milk, they will likely make a single trip and return with what you requested. But if you ask for eggs and a diamond ring, they will treat those as two totally separate requests, fulfilled using very different strategies. If you then say you no longer need the eggs and the ring, they may keep the eggs for when they get hungry, but they'll probably try to get their money back for the ring right away.

score 0 · Accepted Answer · answered Sep 14 '22 at 15:34

I did some experiments, read a chapter of The Linux Programming Interface, and got an answer that satisfies me.

First, my conclusions:

The library call malloc uses the system calls brk and mmap under the hood when allocating memory.
As @John Zwinck describes, malloc on Linux chooses between brk and mmap depending on how much memory is requested.
If memory is allocated via brk, the process usually does not return it to the OS before it terminates (though sometimes it does). If it is allocated via mmap, then in my simple test the process returns it to the OS before it terminates.

Experiment code (examine memory stats in htop at the same time):

code sample 1

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>

#define BUFSIZE 1073741824 //1GiB

// run `ulimit -s unlimited` first

int main(){
    printf("start\n");
    printf("%zu \n", sizeof(uint32_t));
    uint32_t* p_arr[BUFSIZE / 4]; 
    sleep(10); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n");
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("alloc done\n"); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    
    printf("free done\n");
    sleep(20);
    printf("exit\n");
}

When the program prints "free done" and goes into sleep(), you can see that it still holds the memory and has not returned it to the OS. Running strace ./a.out shows brk being called many times.

Note:

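I call malloc in a loop to allocate the memory. I expected it to take up only 1GiB of RAM, but in fact it takes up 8GiB in total. malloc adds some extra bytes for bookkeeping and rounds each request up to a minimum chunk size (32 bytes on 64-bit glibc, so the 2^28 four-byte allocations alone account for 8GiB). One should never allocate 1GiB this way, in a loop like this.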

code sample 2:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>

#define BUFSIZE 1073741824 //1GiB

int main(){
    printf("start\n");
    printf("%zu \n", sizeof(uint32_t));
    uint32_t* p_arr[BUFSIZE / 4]; 
    sleep(3); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n");
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("%p\n", p_arr[0]);
    printf("alloc done\n"); 
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    printf("free done\n");
    printf("allocate again\n");
    sleep(10);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        uint32_t* p = malloc(sizeof(uint32_t));
        if (p == NULL){
            printf("alloc failed\n");
            exit(1);
        }
        p_arr[i] = p;
    } 
    printf("allocate again done\n");
    sleep(10);
    for(size_t i = 0; i < (BUFSIZE / 4); i++){
        free(p_arr[i]);
    }
    printf("%p\n", p_arr[0]);
    sleep(3);
    printf("exit\n");
}

This one is similar to sample 1, but it allocates again after freeing. The second round of allocations doesn't increase memory usage; it reuses the memory that was freed but not returned to the OS.

code sample 3:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

#define MAX_ALLOCS 1000000

int main(int argc, char* argv[]){
    int freeStep, freeMin, freeMax, blockSize, numAllocs, j;
    char* ptr[MAX_ALLOCS];
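    if(argc < 3){
        fprintf(stderr, "Usage: %s num-allocs block-size [step [min [max]]]\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    printf("\n");
    numAllocs = atoi(argv[1]);
    assert(numAllocs > 0 && numAllocs <= MAX_ALLOCS);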
    blockSize = atoi(argv[2]);
    freeStep = (argc > 3) ? atoi(argv[3]) : 1;
    freeMin = (argc > 4) ? atoi(argv[4]) : 1;
    freeMax = (argc > 5) ? atoi(argv[5]) : numAllocs;
    assert(freeMax <= numAllocs);

    printf("Initial program break:   %10p\n", sbrk(0));
    printf("Allocating %d*%d bytes\n", numAllocs, blockSize);
    for(j = 0; j < numAllocs; j++){
        ptr[j] = malloc(blockSize);
        if(ptr[j] == NULL){
            perror("malloc return NULL");
            exit(EXIT_FAILURE);
        }
    }

    printf("Program break is now:    %10p\n", sbrk(0));
    printf("Freeing blocks from %d to %d in steps of %d\n", freeMin, freeMax, freeStep);
    for(j = freeMin - 1; j < freeMax; j += freeStep){
        free(ptr[j]);
    }
    printf("After free(), program break is : %10p\n", sbrk(0));
    printf("\n");
    exit(EXIT_SUCCESS);
}

This one is taken from The Linux Programming Interface, and I simplified it a bit.

Chapter 7:

The first two command-line arguments specify the number and size of blocks to allocate. The third command-line argument specifies the loop step unit to be used when freeing memory blocks. If we specify 1 here (which is also the default if this argument is omitted), then the program frees every memory block; if 2, then every second allocated block; and so on. The fourth and fifth command-line arguments specify the range of blocks that we wish to free. If these arguments are omitted, then all allocated blocks (in steps given by the third command-line argument) are freed.

Try running it with:

./free_and_sbrk 1000 10240 2
./free_and_sbrk 1000 10240 1 1 999
./free_and_sbrk 1000 10240 1 500 1000

You will see that the program break decreases only in the last example, i.e. the process returns some blocks of memory to the OS (if I understand correctly).

This sample code is evidence for the claim above: memory allocated via brk is usually, but not always, kept by the process until it terminates.

Finally, here are some useful paragraphs quoted from the book. I suggest reading Chapter 7 (section 7.1) of TLPI; it is very helpful.

In general, free() doesn’t lower the program break, but instead adds the block of memory to a list of free blocks that are recycled by future calls to malloc(). This is done for several reasons:

The block of memory being freed is typically somewhere in the middle of the heap, rather than at the end, so that lowering the program break is not possible.

It minimizes the number of sbrk() calls that the program must perform. (As noted in Section 3.1, system calls have a small but significant overhead.)

In many cases, lowering the break would not help programs that allocate large amounts of memory, since they typically tend to hold on to allocated memory or repeatedly release and reallocate memory, rather than release it all and then continue to run for an extended period of time.

What is program break (also from the book):

Also: https://www.wikiwand.com/en/Data_segment
