2

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately? What happens if they later become valid again?

Certainly, the usual case of a pointer going invalid then becoming "valid" again would be some other object getting allocated into what happens to be the memory that was used before, and if you use the pointer to access memory, that's obviously undefined behavior. Dangling pointer memory overwrite lesson 1, pretty much.

But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc(). If you have a pointer to somewhere within a malloc()'d memory block at offset > 1, then use realloc() to shrink the block to less than your offset, your pointer becomes invalid, obviously. If you then use realloc() again to grow the block back so that it at least covers the object pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?

This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out. The program below demonstrates it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    static const char s_message[] = "hello there";
    static const char s_kitty[] = "kitty";

    char *string = malloc(sizeof(s_message));
    if (!string)
    {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }

    memcpy(string, s_message, sizeof(s_message));
    printf("%p %s\n", string, string);

    char *overwrite = string + 6;
    *overwrite = '\0';
    printf("%p %s\n", string, string);

    string[4] = '\0';
    char *new_string = realloc(string, 5);
    if (new_string != string)
    {
        fprintf(stderr, "realloc #1 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }
    string = new_string;
    printf("%p %s\n", string, string);

    new_string = realloc(string, 6 + sizeof(s_kitty));
    if (new_string != string)
    {
        fprintf(stderr, "realloc #2 failed or moved the string\n");
        free(new_string ? new_string : string);
        return 1;
    }

    // Is this defined behavior, even though at one point,
    // "overwrite" was a dangling pointer?
    memcpy(overwrite, s_kitty, sizeof(s_kitty));
    string[4] = s_message[4];
    printf("%p %s\n", string, string);
    free(string);
    return 0;
}
Myria
  • Well, you may as well have asked 'what happens if I write bugs in my program'. The pointers are invalid, and dereferencing them is UB, even if the same memory block happens to become allocated again after another malloc. – Martin James Sep 27 '14 at 08:25
  • A pointer to freed memory is invalid, but it may still appear to work, depending on whether the memory has changed. If the memory was freed but still contains the same values (usually the case), the code will work until that memory changes, at which point your program will probably crash... leading to hard-to-track bugs because the failure is not deterministic. Run the program and it crashes while doing X; run it again and it never crashes... all because your pointers weren't updated. – AbstractDissonance Jul 14 '16 at 23:11

3 Answers

8

When you free memory, what happens to pointers that point into that memory? Do they become invalid immediately?

Yes, definitely. From section 6.2.4 of the C standard:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

And from section 7.22.3.5:

The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.

Note the reference to old object and new object ... by the standard, what you get back from realloc is a different object than what you had before; it's no different from doing a free and then a malloc, and there is no guarantee that the two objects have the same address, even if the new size is <= the old size ... and in real implementations they often won't because objects of different sizes are drawn from different free lists.
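
A minimal sketch of how code can stay within defined behavior here, assuming the caller tracks positions inside the block as offsets rather than as pointers: the offset survives the call, and a fresh pointer is derived from whatever realloc returns. The helper name `resize_and_rebase` is hypothetical and does not appear in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question): resize *block to new_size
 * and return a pointer to the byte at `offset` inside the resized object,
 * or NULL on failure. On failure *block is untouched and still owned by
 * the caller. */
static char *resize_and_rebase(char **block, size_t new_size, size_t offset)
{
    assert(block != NULL && *block != NULL);

    if (new_size == 0 || offset >= new_size)
        return NULL;              /* offset would fall outside the new object */

    char *resized = realloc(*block, new_size);
    if (resized == NULL)
        return NULL;              /* the old object is still alive */

    *block = resized;             /* every older pointer is now indeterminate */
    return resized + offset;      /* derived from the new object only */
}
```

Called as `char *p = resize_and_rebase(&string, 32, 6);`, this never reads a pointer that was computed before the realloc, so it does not matter whether the block moved.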

What happens if they later become valid again?

There's no such animal. Validity isn't some event that takes place, it's an abstract condition placed by the C standard. Your pointers might happen to work in some implementation, but all bets are off once you free the memory they point into.

But what if the memory becomes valid again for the same allocation? There's only one Standard way for that to happen: realloc()

Sorry, no, the C Standard does not contain any language to that effect.

If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block

You can't know whether it will ... the standard does not guarantee any such thing. And notably, when you realloc to a smaller size, most implementations modify the memory immediately following the shortened block; reallocing back to the original size will have some garbage in the added part, it won't be what it was before it was shrunk. In some implementations, some block sizes are kept on lists for that block size; reallocating to a different size will give you totally different memory. And in a program with multiple threads, any freed memory can be allocated in a different thread between the two reallocs, in which case the realloc for a larger size will be forced to move the object to a different location.
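
Since the bytes added by a growing realloc have indeterminate values, a program that shrinks and then grows a block has to store the tail again before reading it. The following is only a sketch of that idea; `grow_zeroed` is a hypothetical helper, and zero-filling is an assumed policy chosen for illustration (restoring saved data would work the same way).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question): grow *block from old_size
 * to new_size bytes and zero-fill the added bytes, because realloc leaves
 * them indeterminate. Returns 0 on success, -1 on failure; on failure
 * *block is unchanged. */
static int grow_zeroed(char **block, size_t old_size, size_t new_size)
{
    assert(block != NULL);

    if (new_size == 0 || new_size < old_size)
        return -1;                /* this sketch only handles growth */

    char *grown = realloc(*block, new_size);
    if (grown == NULL)
        return -1;                /* the old object is still alive */

    /* Only the first old_size bytes are guaranteed to be preserved. */
    memset(grown + old_size, 0, new_size - old_size);
    *block = grown;
    return 0;
}
```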

is the dangling pointer valid again?

See above; invalid is invalid; there's no going back.

This is such a corner case that I don't really know how to interpret the C or C++ standards to figure it out.

It's not any sort of corner case and I don't know what you're seeing in the standard, which is quite clear that freed memory has indeterminate content and that the values of any pointers to or into it are also indeterminate, and makes no claim that they are magically restored by a later realloc.

Note that modern optimizing compilers are written to know about undefined behavior and take advantage of it. As soon as you realloc string, overwrite is invalid, and the compiler is free to trash it ... e.g., it might be in a register that the compiler reallocates for temporaries or parameter passing. Whether or not any compiler actually does this, it can, precisely because the standard is quite clear about pointers into objects becoming invalid when the object's lifetime ends.

Jim Balter
  • I know about all the cases in which the memory could move to a different location - that's why my code checks for the pointer changing values, and aborts if this happens. It also fills the trashed bytes back in. It was only your last paragraph that I was interested in. I suppose, however, that even the checks for whether the allocation moved are themselves invalid, because comparisons of pointers to different memory blocks return unspecified (but not undefined) results, meaning those if statements could have no meaning on theoretical implementations. – Myria Sep 29 '14 at 09:05
0

If you then use realloc() again grow the block back to at least cover the object type pointed to by the dangling pointer, and in neither case did realloc() move the memory block, is the dangling pointer valid again?

No. Unless realloc() returns a null pointer, the call terminates the lifetime of the allocated object, implying that all pointers pointing into it become invalid. If realloc() succeeds, it returns the address of a new object.

Of course, it just might happen that it's the same address as the old one. In that case, using an invalid pointer to the old object to access the new one will generally work in non-optimizing implementations of the C language.

It would still be undefined behaviour, though, and might actually fail with aggressively optimizing compilers.

The C language is unsound, and it's generally up to the programmer to uphold its invariants. Failing to do so will break the implicit contract with the compiler and may result in incorrect code being generated.
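
If a program merely wants to know whether the address happened to stay the same, one possible approach, sketched here under the assumption that the implementation provides the optional `uintptr_t` type, is to capture the address as an integer while the pointer is still valid and compare integers afterwards. The helper name `resize_report_move` is hypothetical. The standard only guarantees that a valid pointer round-trips through `uintptr_t`, so the comparison reflects address equality on common flat-address implementations rather than by guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the question): resize *block and report
 * through *moved whether the address changed. Returns 0 on success and
 * -1 on failure; on failure *block and *moved are unchanged. */
static int resize_report_move(char **block, size_t new_size, int *moved)
{
    assert(block != NULL && moved != NULL);

    if (new_size == 0)
        return -1;

    /* Captured before realloc, while the pointer value is still valid. */
    uintptr_t before = (uintptr_t)*block;

    char *resized = realloc(*block, new_size);
    if (resized == NULL)
        return -1;                /* the old object is still alive */

    *moved = ((uintptr_t)resized != before);
    *block = resized;             /* always continue with the new pointer */
    return 0;
}
```

Even when `*moved` comes back as 0, the sketch continues with the pointer returned by realloc and never reads the stale one.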

Christoph
  • "the new one will work in most implementations of the C language" -- actually not, as most implementations will modify the freed section. "The C language is unsound" -- Indeed! – Jim Balter Sep 27 '14 at 09:11
  • @JimBalter: regardless of the fact of whether or not the *value* of the object will have been modified (and I don't see any reason for `realloc()` - not `free()` - to do so in this specific example, and we're talking about trailing bytes, whereas meta-data is generally kept up front), the *pointer* will point to the correct place within the object – Christoph Sep 27 '14 at 09:17
  • Ahem. When you realloc a block to a smaller size (which frees the tail of the block), there are now two blocks, and the front of the second one is in the middle of the original larger block ... reallocating back will get you that metadata. " the pointer will point to the correct place within the object" -- it might or might not; there is no guarantee that the pair of reallocs get you the same memory back. Even reallocing to a smaller size can return a totally different piece of memory. – Jim Balter Sep 27 '14 at 09:20
  • @JimBalter: ah, I didn't think that through; but note that the second block might be too small to be put back into a free list; anyway, my main point was about the address calculation, not the value – Christoph Sep 27 '14 at 09:22
  • If the second block is too small to be put back on the free list then the realloc is a no-op ... otherwise malloc would leak memory. And *validity* is entirely about content. And note my edit to my comment above: you have no idea where the data is located after a realloc. FWIW, I used to write C library code, including malloc/realloc/free, for a living. – Jim Balter Sep 27 '14 at 09:25
  • Your comment is utter nonsense, a bizarre strawman. Of course *if* the old and new address are equal then `an_element`, which hasn't changed, still points into the block, but so what? There's no guarantee that they *will* be equal, and if you *read* from the pointer instead of *writing* to it, there's no way to know what you'll get. There's no need to "try to convey" such things to me and I'm not interested in such ridiculous discussions. – Jim Balter Sep 27 '14 at 10:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/62034/discussion-between-christoph-and-jim-balter). – Christoph Sep 27 '14 at 10:01
  • No, let's not. Goodbye. (And you should remove your comment with my name from your answer ... that's quite inappropriate at SO.) – Jim Balter Sep 27 '14 at 10:03
  • @JimBalter: I'm not quite sure how I managed to piss you off, but have a nice weekend anyway – Christoph Sep 27 '14 at 10:05
  • 1
    Rather than get into that, I will make the same point I made in response to cmaster's answer: The fact is that a compiler is free to reuse the pointer variable for some other purpose after the realloc, so you have no guarantee that it has the same value as it did before. The behavior is *undefined*, period. Talking about what "probably" will be the case is pointless, especially when modern compilers are making more and more aggressive use of UB for optimization. – Jim Balter Sep 27 '14 at 10:14
  • @JimBalter: interesting; see http://stackoverflow.com/questions/26073842/is-the-compiler-allowed-to-recycle-freed-pointer-variables – Christoph Sep 27 '14 at 10:48
  • @JimBalter: Given `int *p = malloc(4); intptr_t q=0,r=0; memcpy(&q, &p, sizeof p); free(p); memcpy(&r, &p, sizeof p);` is there anything in the Standard that would allow `q == r` to be false if `sizeof intptr_t` and `sizeof p` are equal? – supercat Apr 28 '15 at 22:54
-1

It depends on your definition of "valid". You've perfectly described the situation. If you want to consider that "valid", then it's valid. If you don't want to consider that "valid", then it's invalid.

David Schwartz
  • 1
    the C standard considers it invalid – Christoph Sep 27 '14 at 08:50
  • @Christoph Nonsense. Consider: `if (realloc(foo, 32) == foo) { /* HERE */ }`. Surely, `foo` is perfectly valid there, even though it was passed to realloc. Remember, we are given that a subsequent call to realloc returned the same pointer before we accessed it. – David Schwartz Sep 27 '14 at 09:33
  • 2
    again, not according to the C standard: unless `realloc()` fails, the object gets de-allocated, ending its lifetime and making all pointers invalid as far as abstract language semantics are concerned; sure, it will work in practice, but that doesn't make it valid as far as the standard goes... – Christoph Sep 27 '14 at 09:50
  • In the OP's code, `overwrite` is set to string + 6. After the realloc of string to 5 bytes, an optimizing compiler is free to trash the value of `overwrite` because it is no longer valid. Call it nonsense all you want, but the behavior is invalid by the standard and can fail in practice. – Jim Balter Sep 27 '14 at 10:18