
It has been claimed that

a compiler is free to reuse the pointer variable for some other purpose after the realloc being freed, so you have no guarantee that it has the same value as it did before

i.e.

void *p = malloc(42);
uintptr_t address = (uintptr_t)p;
free(p);

// [...] stuff unrelated to p or address

assert((uintptr_t)p == address);

might fail.

C11 annex J.2 reads

The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3) [is undefined]

but the annex is of course not normative.

Annex L.3 (which is normative, but optional) tells us that if

The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3).

the result is permitted to be critical undefined behaviour.

This confirms the claim, but I'd like to see an appropriate quote from the standard proper instead of the annex.

Christoph
  • Related: http://stackoverflow.com/questions/17024866/when-is-it-valid-to-access-a-pointer-to-a-dead-object – Oliver Charlesworth Sep 27 '14 at 10:53
  • That said, it doesn't logically follow that the compiler might "reuse" it. Of course, that is one possible outcome of undefined behaviour. – Oliver Charlesworth Sep 27 '14 at 10:54
  • `(uintptr_t)p` causes undefined behaviour - you're not allowed to use `p`'s value after freeing it. (It has the same status as an uninitialized variable) – M.M Sep 27 '14 at 11:08
  • @georgem: yes, this apparently is UB; so pointers aren't just integers with sugar - they can be invalidated if *passed by value* to 'magic' functions; personally, I think this violates POLA, but it is what it is... – Christoph Sep 27 '14 at 12:35
  • @georgem: well, there's some sense in not being able to read uninitialized variables - eg they could be initialized with magic values (trap representations, signalling NaNs); I'd be fine with that if you could still get at the value through a `char*`; however, in case of freed pointers, we already know that the variable did not hold a trap representation, so like you, I was expecting that as long as no indirection happens, everything would be fine... – Christoph Sep 27 '14 at 12:50
  • "it doesn't logically follow that the compiler might "reuse" it." -- It does if you understand what the word "might" means, and basic modal logic. – Jim Balter Sep 27 '14 at 19:19
  • @Giorgi: IMHO, C would be a better language if there were a few specific cases where dead pointers would yield deterministic behavior: compare pointers to things with overlapping lifetimes (including the result of realloc and its argument), subtract one formerly-live pointer from another *to the same object*, etc., and also if some nuisance forms of UB were replaced by partially-constrained behaviors. – supercat Apr 28 '15 at 22:49

1 Answer


Upon an object reaching the end of its lifetime, all pointers to it become indeterminate. This applies to block-scope variables and to malloced memory just the same. The applicable clause is, in C11, 6.2.4:2.

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

Using indeterminate memory for anything, including apparently harmless comparison or arithmetic, is undefined behavior (in C90; later standards complicate the matter terribly but compilers continue to treat usage of indeterminate memory as undefined behavior).
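
To connect this to the question's snippet, here is a minimal sketch of one way to restructure it so that nothing reads the pointer after `free`: the address is converted to an integer while the block is still live, and the pointer is then overwritten with `NULL`. The helper name `release_and_clear` is hypothetical, not a standard function, and the sketch assumes the implementation provides the optional type `uintptr_t`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any standard library).
 * Converts *pp to an integer while the block is still live, frees the
 * block, then overwrites the now-indeterminate pointer with NULL.
 * Returns the integer captured before the call to free. */
static uintptr_t release_and_clear(char **pp)
{
    uintptr_t address = (uintptr_t)*pp; /* defined: block is still live */
    free(*pp);                          /* *pp is indeterminate from here on */
    *pp = NULL;                         /* storing a new value is fine */
    return address;                     /* an integer, unaffected by free */
}
```

After `char *p = malloc(42); uintptr_t a = release_and_clear(&p);`, `p` compares equal to `NULL` and `a` is an ordinary integer that can be printed or compared. What the standard does not promise is that `a` still corresponds to any valid pointer.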

As an example, how about the following program printing that p and q are both different and the same? The results of execution with various compilers are shown here.

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char *argv[]) {
  char *p, *q;
  uintptr_t pv, qv;
  {
    char a = 3;
    p = &a;
    pv = (uintptr_t)p;
  }
  {
    char b = 4;
    q = &b;
    qv = (uintptr_t)q;
  }
  printf("Roses are red,\nViolets are blue,\n");
  if (p == q)
    printf ("This poem is lame,\nIt doesn't even rhyme.\n");
  else {
    printf("%p is different from %p\n", (void*)p, (void*)q);
    printf("%"PRIxPTR" is not the same as %"PRIxPTR"\n", pv, qv);
  }
}
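
For contrast, a sketch of the part of this program that stays well defined: the integers `pv` and `qv` are produced while `a` and `b` are still alive, so comparing them later never touches an indeterminate pointer. The function name `slots_coincide` is made up for illustration. Whether it returns 1 or 0 depends on whether the compiler happens to place `a` and `b` at the same address, so neither result should be relied on.

```c
#include <stdint.h>

/* Hypothetical illustration: compare only integers captured during the
 * objects' lifetimes, never the dead pointers themselves. */
static int slots_coincide(void)
{
    uintptr_t pv, qv;
    {
        char a = 3;
        pv = (uintptr_t)&a;   /* converted while a is alive */
    }
    {
        char b = 4;
        qv = (uintptr_t)&b;   /* converted while b is alive */
    }
    return pv == qv;          /* plain integer comparison */
}
```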
Pascal Cuoq
  • if I could, I'd give you another upvote for http://blog.regehr.org/archives/1180 - though I don't agree with all proposals, it's food for thought... – Christoph Sep 27 '14 at 12:38
  • @Christoph It would be completely against the spirit of SO :) – Pascal Cuoq Sep 27 '14 at 13:04
  • @PascalCuoq: What do you think of http://en.cppreference.com/w/c/language/analyzability ? If trapping were optional, and it were changed to allow a few presently-illegal pointer operations, I think it should be pretty easy for compilers to implement [if they mostly ignore the trapping part]. – supercat Apr 28 '15 at 21:54
  • @supercat It's interesting but the ball is in the court of the compiler implementers. – Pascal Cuoq Apr 28 '15 at 22:15
  • @PascalCuoq: I would think if anyone got on the ball and implemented it, even without trapping, it would pretty quickly become popular; it could also in many cases yield *better* code than would be possible if presently-UB actions are forbidden. Given `int x`, allowing a compiler to assume that `x+1 > x` will always either be true or unspecified enables useful optimizations which won't be possible if programmers have to rewrite `x++` as `x=x+1u;` I still want to see evidence of any major savings which would exceed those which could be achieved with simpler optimization-enabling features. – supercat Apr 28 '15 at 22:42
  • @PascalCuoq: What would a compiler have to do to legitimately set `__STDC_ANALYZABLE__` other than (1) preceding division instructions with conditional-skip for zero divisor, (2) providing stubs for the trap-setting functions (*if nothing traps, they need not do anything except set/return a function pointer*), and (3) check that certain optimizations are turned off? I would think those things should be an afternoon's work. – supercat Apr 29 '15 at 16:12
  • This: http://blog.regehr.org/archives/1180, is a very sensible idea, I hope this idea gets implemented at some point – Giorgi Moniava Feb 21 '16 at 20:29
  • Is there any information on whether blog.regehr.org/archives/1180 is planned to be implemented? It would be especially nice to see a friendly version for all 200 occurrences of UB. It would make C a much friendlier language. – Giorgi Moniava Feb 21 '16 at 21:21
  • @Giorgi Have you seen http://blog.regehr.org/archives/1287 ? The problem is that not everyone wants the same trade-offs even when everyone is disappointed by the choices made by current compilers. Some UB are very expensive to detect (google "memory safe C compiler" for some articles and implementations, with a slowdown between x5 and x50 depending on other factors). This, soon to be released, is intended to find all UB that matter in sequential code, and it comes close: http://trust-in-soft.com/tis-interpreter/ – Pascal Cuoq Feb 21 '16 at 21:56
  • @PascalCuoq I will get familiar with the links you provided. IMO there is no excuse for the existence of 200+ UB. Speed? OK. But reliability and productivity are no less important, and having so many places where one can shoot oneself in the foot is a bad language decision. I hope friendly C gets implemented at some point. Or we get better tools like static analyzers, which detect these. I tried a static analyzer once, but they have problems when they see functions from external SDKs... I guess in that case you have to manually show them function signatures... I don't remember how I overcame that – Giorgi Moniava Feb 21 '16 at 22:02
  • I've come here twice after having exhausted all my votes... I hope the third time I can upvote this :D – Antti Haapala -- Слава Україні Oct 16 '17 at 19:18