0

The setup

Let's say I have a struct father whose members include an int and another struct (so father is a nested struct). Here is some example code:

struct mystruct {
    int n;
};

struct father {
    int test;
    struct mystruct M;
    struct mystruct N;
};

In the main function, we allocate memory with malloc() to create a new struct of type struct father, then we fill its member variables and those of its children:

    struct father* F = (struct father*) malloc(sizeof(struct father));
    F->test = 42;
    F->M.n = 23;
    F->N.n = 11;

We then get pointers to those member variables from outside the structs:

    int* p = &F->M.n;
    int* q = &F->N.n;

After that, we print the values before and after the execution of free(F), then exit:

    printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
    free(F);
    printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
    return 0;

This is a sample output (*):

test: 42, M.n: 23, N.n: 11
test: 0, M.n: 0, N.n: 1025191952

*: Using gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

Full code on pastebin: https://pastebin.com/khzyNPY1

The question

That was the test program I used to see how memory is deallocated by free(). My understanding (from reading K&R, "8.7 Example - A Storage Allocator", where a version of free() is implemented and explained) is that when you free() the struct, you are essentially just telling the operating system or the rest of the program that you will no longer use that particular region of memory previously allocated with malloc(). So, after freeing those memory blocks, there should be garbage values in the member variables, right? I can see that happening with N.n in the test program but, as I ran more and more samples, it became clear that in the overwhelming majority of cases these member variables are "reset" to 0 rather than to any other "random" value. My question is: why is that? Is it because the stack/heap is filled with zeroes more frequently than with any other value?


As a last note, here are a few links to related questions but which do not answer my particular question:

Jason Aller
  • _So, after freeing those memory blocks, there should be garbage values in the member variables, right?_: no, absolutely not. There may be garbage or not; it's indeterminate. – Jabberwocky Oct 27 '20 at 17:23
  • That space may have been allocated to some other task which has initialized the memory to 0. – stark Oct 27 '20 at 17:24
  • Oh and just the title: _why are integer member variables of a struct frequently reset to 0 when it is deallocated with free()?_: they are not, it would be inefficient. – Jabberwocky Oct 27 '20 at 17:28

5 Answers

3

After calling free, the pointers F, p and q no longer point to valid memory. Attempting to dereference those pointers invokes undefined behavior. In fact, the values of those pointers become indeterminate after the call to free, so you may also invoke UB just by reading those pointer values.

Because dereferencing those pointers is undefined behavior, the compiler can assume it will never happen and make optimizations based on that assumption.

That being said, there's nothing that states that the malloc/free implementation has to leave values that were stored in freed memory unchanged or set them to specific values. It might write part of its internal bookkeeping state to the memory you just freed, or it might not. You'd have to look at the source for glibc to see exactly what it's doing.
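A minimal sketch of the well-defined alternative, reusing the struct definitions from the question: read whatever you still need while the object is alive, then free it. The helper name sum_saved_before_free is hypothetical and exists only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

struct mystruct {
    int n;
};

struct father {
    int test;
    struct mystruct M;
    struct mystruct N;
};

/* Hypothetical helper: copies the three values out while the object is
 * still alive, frees it, and returns the sum of the saved copies.
 * Returns -1 if the allocation fails. */
int sum_saved_before_free(void)
{
    struct father *F = malloc(sizeof *F);
    if (F == NULL)
        return -1;

    F->test = 42;
    F->M.n = 23;
    F->N.n = 11;

    /* Read everything we still need before the object's lifetime ends. */
    int test = F->test;
    int m = F->M.n;
    int n = F->N.n;

    free(F);
    F = NULL; /* a stale pointer value is no longer kept around */

    /* Only the saved copies are used from here on. */
    printf("test: %d, M.n: %d, N.n: %d\n", test, m, n);
    return test + m + n;
}
```

Calling sum_saved_before_free() from main prints the same three values as the first printf in the question, and nothing is read through F, p or q after the call to free.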

dbush
  • Thank you for your answer. But still, it didn't answer the question of why is the 0 value more frequent than any other value. – Vinicius Almeida Oct 28 '20 at 01:39
  • @ViniciusAlmeida It's a detail of the compiler implementation and the system library implementation. If you really want to know you need to dig deep into those. It's not something most C developers need to know. – dbush Oct 28 '20 at 01:46
2

Setting aside undefined behavior and whatever else the standard dictates: the dynamic allocator is itself a program, so for a fixed implementation, and assuming it makes no decisions based on external factors (which it does not), its behavior is completely deterministic.

Real answer: what you are seeing here is the effect of the internal workings of glibc's allocator (glibc is the default C library on Ubuntu).

The internal structure of an allocated chunk is the following (source):

struct malloc_chunk {
    INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
    INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */
    struct malloc_chunk* fd;                /* double links -- used only if free. */
    struct malloc_chunk* bk;        
    /* Only used for large blocks: pointer to next larger size.  */
    struct malloc_chunk* fd_nextsize;       /* double links -- used only if free. */
    struct malloc_chunk* bk_nextsize;
};

In memory, when the chunk is in use (not free), it looks like this:

chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of previous chunk, if unallocated (P clear)  |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of chunk, in bytes                     |A|M|P| flags
  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             User data starts here...                          |

Every field except mchunk_prev_size and mchunk_size is only populated if the chunk is free. Those two fields are right before the user usable buffer. User data begins right after mchunk_size (i.e. at the offset of fd), and can be arbitrarily large. The mchunk_prev_size field holds the size of the previous chunk if it's free, while the mchunk_size field holds the real size of the chunk (which is at least 16 bytes more than the requested size).
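One way to observe that the real chunk is larger than the request, without touching the allocator's private fields, is glibc's malloc_usable_size(). The sketch below assumes glibc (the function is declared in <malloc.h> and is not part of standard C); the helper name usable_size_for is hypothetical.

```c
#include <malloc.h>  /* glibc-specific: declares malloc_usable_size() */
#include <stdlib.h>

/* Hypothetical helper: allocates `requested` bytes, asks glibc how many
 * bytes of that block are actually usable, then frees the block.
 * Returns 0 if the allocation fails. */
size_t usable_size_for(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;

    /* The query happens while the block is still allocated. */
    size_t usable = malloc_usable_size(p);

    free(p);
    return usable;
}
```

For example, usable_size_for(sizeof(struct father)) reports how much room the allocator really set aside for the struct from the question. The exact number depends on the glibc version and platform, so none is quoted here.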

A more thorough explanation is provided as comments in the library itself here (highly suggested read if you want to know more).

When you free() a chunk, there are a lot of decisions to be made as to where to "store" that chunk for bookkeeping purposes. In general, freed chunks are sorted into doubly linked lists based on their size, in order to optimize subsequent allocations (which can take already available chunks of the right size from these lists). You can see this as a sort of caching mechanism.

Now, depending on your glibc version, they could be handled slightly differently, and the internal implementation is quite complex, but what is happening in your case is something like this:

struct malloc_chunk *victim = addr; // address passed to free()

// Add chunk at the head of the free list
victim->fd = NULL;
victim->bk = head;
head->fd = victim;

Since your structure is basically equivalent to:

struct x {
    int a;
    int b;
    int c;
};

And since on your machine sizeof(struct malloc_chunk *) == 2 * sizeof(int), the first operation (victim->fd = NULL) is effectively wiping out the contents of the first two fields of your structure (remember, user data begins exactly at fd), while the second one (victim->bk = head) is altering the third value.
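The overlap can be reproduced without any use-after-free by viewing the same bytes through a union. This is only a sketch of the layout argument, not of glibc's code: the union overlay, the helper simulate_free_list_write and the dummy_head object are hypothetical stand-ins, and the result assumes 8-byte pointers, 4-byte ints and an all-zero-bits null pointer, as on x86-64 Linux.

```c
#include <stddef.h>
#include <stdio.h>

struct mystruct { int n; };

struct father {
    int test;
    struct mystruct M;
    struct mystruct N;
};

/* Stand-in for the start of a chunk's user area: the same bytes seen
 * either as the user's struct or as two list pointers. */
union overlay {
    struct father user;
    void *links[2]; /* play the roles of fd and bk */
};

/* Hypothetical helper: fills the struct, overwrites the same bytes with a
 * null "fd" and a non-null "bk", then reports what the struct members
 * hold afterwards. */
void simulate_free_list_write(int *test, int *m, int *n)
{
    static int dummy_head; /* any valid object; stands in for the list head */
    union overlay o;

    o.user.test = 42;
    o.user.M.n = 23;
    o.user.N.n = 11;

    o.links[0] = NULL;        /* like victim->fd = NULL */
    o.links[1] = &dummy_head; /* like victim->bk = head */

    /* Reading the other union member reinterprets the stored bytes. */
    *test = o.user.test;
    *m = o.user.M.n;
    *n = o.user.N.n;
    printf("test: %d, M.n: %d, N.n: %d\n", *test, *m, *n);
}
```

Under those assumptions the null pointer covers test and M.n, so both read back as 0, while N.n holds part of an address and therefore looks random from run to run.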

Marco Bonelli
  • Programs “fixed [in] a specific implementation” are not always deterministic, due to [address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) and other factors. – Eric Postpischil Oct 27 '20 at 18:08
  • Well of course... the behavior is still 100% deterministic though, granted that the program does not make assumptions based on its position in memory, which AFAIK ptmalloc doesn't. – Marco Bonelli Oct 27 '20 at 18:19
  • Oh I think I understand now! So when the data is wiped (`victim->fd = NULL`), the bits are set to 0 in the blocks that were previously malloc'd? – Vinicius Almeida Oct 28 '20 at 01:45
  • @ViniciusAlmeida exactly. – Marco Bonelli Oct 28 '20 at 01:46
1

The Standard specifies nothing about the behavior of a program that uses a pointer to allocated storage after it has been freed. Implementations are free to extend the language by specifying the behavior of more programs than required by the Standard, and the authors of the Standard intended to encourage variety among implementations which would support popular extensions on a quality-of-implementation basis directed by the marketplace. Some operations with pointers to dead objects are widely supported (e.g. given char *x,*y; the Standard would allow conforming implementations to behave in arbitrary fashion if a program executes free(x); y=x; in cases where x had been non-null, without regard for whether anything ever does anything with y after its initialization, but most implementations would extend the language to guarantee that such code would have no effect if y is never used) but dereferencing of such pointers generally isn't.

Note that if one were to pass two copies of the same pointer to a freed object to:

int test(char *p1, char *p2)
{
  char *q;
  if (*p1)
  {
    q = malloc(0);
    free(q);
    return *p1+*p2;
  }
  else
    return 0;
}

it is entirely possible that the act of allocating and freeing q would disturb the bit patterns in the storage that had been allocated to *p1 (and also *p2), but a compiler would not be required to allow for that possibility. A compiler might plausibly return the sum of the value that was read from *p1 before the malloc/free and a value that was read from *p2 after it; this sum could be odd, even though *p1+*p2 should always be even when p1 and p2 are equal.

supercat
0

When a dynamically allocated object is freed, it no longer exists. Any subsequent attempt to access it has undefined behavior. The question is therefore nonsense: the members of an allocated struct cease to exist at the end of the host struct's lifetime, so they cannot be set or reset to anything at that point. There is no valid way to attempt to determine any values for such no-longer-existing objects.

John Bollinger
  • Problem is: what does it **mean** for something to "cease to exist"? Is the data wiped? How is it implemented? Semantically, I understand that I can't access that space in memory after freeing, but my question is **what happens afterwards**? From the other answers, I now understand that this question is implementation- and OS-specific, but to me that wasn't clear from the beginning. Still, thank you for your answer. – Vinicius Almeida Oct 28 '20 at 01:48
0

Two things happen when you call free:

  • In the C model of computing, any pointer values that point to the freed memory (either its beginning, such as your F, or things within it, such as your p and q) are no longer valid. The C standard does not define what happens when you attempt to use these pointer values, and optimization by the compiler may have unexpected effects on how your program behaves if you attempt to use them.
  • The freed memory is released for other purposes. One of the most common other purposes for which it is used is tracking memory that is available for allocation. In other words, the software that implements malloc and free needs data structures to record which blocks of memory have been freed and other information. When you free memory, that software often uses some of the memory for this purpose. That can result in the changes you saw.

The freed memory may also be used by other things in your program. In a single-threaded program without signal handlers or similar things, generally no software would run between the free and the preparation of the arguments to the printf you show, so nothing else would reuse the memory so quickly; reuse by the malloc software is the most likely explanation for what you observed. However, in a multithreaded program, the memory might be reused immediately by another thread. (In practice, this may be somewhat unlikely, as the malloc software may prefer to keep separate pools of memory for separate threads, to reduce the amount of inter-thread synchronization that is necessary.)

Eric Postpischil