
N1570 (the closest draft to the C11 standard) states that:

J.2 Undefined behavior

1 The behavior is undefined in the following circumstances:

[snip]

  • The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3).

Note that this says "used", not "dereferenced", which has some serious implications. According to my reading, this is UB:

void* foo = malloc(4);
free(foo);
printf("%p\n", foo); // UB!

In fact, whatever the value of foo was, it appears that it's now permanently tainted.
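If the goal is only to log where the allocation lived, one possible workaround under this reading is to convert the pointer to an integer while it is still valid and use only that integer after the free. The sketch below assumes the implementation provides the optional `uintptr_t` type; the function names are made up for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `size` bytes, records the address as an
   integer, then frees the block. Returns 0 if the allocation failed. */
uintptr_t alloc_and_release(size_t size)
{
    void *foo = malloc(size);
    if (foo == NULL)
        return 0;

    uintptr_t addr = (uintptr_t)foo; /* converted while foo is still valid */
    free(foo);

    /* From here on only the integer is read; foo is never used again. */
    return addr;
}

/* Prints the recorded address without reading the freed pointer. */
void print_released_address(void)
{
    uintptr_t addr = alloc_and_release(4);
    printf("0x%" PRIxPTR "\n", addr);
}
```

The integer is an ordinary value, so reading it after the free does not fall under the J.2 item quoted above.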

Of course, compilers aren't required to do any kind of enforcement here (since "It Just Works" is valid undefined behavior), and in my opinion any enforcement would only trip up honest developers (unless you're on a platform where merely loading an invalid address causes issues?).

However, it seems to me that if I wanted to write a pathological C compiler, I could make it track calls to realloc and free, check every pointer ever used against the list of pointers that have been freed, and make demons come out of the user's nose if I find one that matches.

The funny part is that every modern malloc implementation out there will happily reuse allocations that have been previously freed:

void* foo = malloc(4);
free(foo);
foo = malloc(4); // potentially the same address as the previous allocation

Would a pathological C compiler, written by authors with the most hostile reading of the standard, be allowed to check that malloc returned the same pointer the second time and make the program do strange things just because?

zneak
  • UB is not introduced by compiler/implementation.. It is just something the standard is not defining. – Eugene Sh. Oct 17 '16 at 21:06
  • @EugeneSh., so how do you call the behavior that your program exhibits when you hit "something that the standard is not defining", if it shouldn't be called "undefined behavior"? – zneak Oct 17 '16 at 21:08
  • Some "modern implementation" may gracefully handle it, while some "pathological" one may not. And both are OK with the standard. *This* is the meaning of undefined behavior. – Eugene Sh. Oct 17 '16 at 21:09
  • So, basically you are questioning the wording in terms of a difference between using the pointer value and the pointed-to-value? I think the -1 voters didn't read carefully enough – grek40 Oct 17 '16 at 21:09
  • You acknowledge that "*"It Just Works" is valid undefined behavior*", but your question indicates a misunderstanding of what "undefined behavior" means. It is simply behavior for which the standard imposes no requirements. Note also that Annex J is not normative, and that particular wording is IMHO a bit sloppy. – Keith Thompson Oct 17 '16 at 21:10
  • @KeithThompson, doesn't it sound somewhat problematic that a compiler can behave any way it likes when a malloc implementation returns the same pointer twice? – zneak Oct 17 '16 at 21:11
  • @zneak: It's the program's behavior, not the compiler's behavior. Suppose two calls to `malloc()` happen to return the "same" value. How can a program determine that they're the same? Any attempt to refer to the original pointer value after it's been passed to `free()` has undefined behavior. – Keith Thompson Oct 17 '16 at 21:13
  • Could it happen you confuse what the compiler does and what is a matter of the execution environment (here: the standard library), i.e. compile-time vs run-time? – too honest for this site Oct 17 '16 at 21:14
  • After like 5th reading, I begin to understand... you basically say that in the case of `x = malloc(1); free(x); x = malloc(1);` if the second malloc returns the same address, the standard says that access would be UB even when accessing the malloc-ed value? – grek40 Oct 17 '16 at 21:14
  • I think that would qualify as a "hostile reading" of the standard. – Kerrek SB Oct 17 '16 at 21:15
  • @grek40: Given `x = malloc(1); free(x); y = malloc(1);`, the comparison `x == y` has undefined behavior, because the value of `x` is indeterminate (even if it happens to hold the same bit pattern as `y`). – Keith Thompson Oct 17 '16 at 21:15
  • If that's the case, it would be a *terrible* nitpicking.. – Eugene Sh. Oct 17 '16 at 21:15
  • The intent is clearly that the pointer refers to space that's been deallocated *and not subsequently allocated*. I believe that can be inferred from the *normative* wording of the standard. – Keith Thompson Oct 17 '16 at 21:16
  • @KerrekSB Attributing "non-hostile writing" to the creators of language standards qualifies as 'naive' in some cases ;) – grek40 Oct 17 '16 at 21:19
  • I dunno why it is downvoted so badly.. – Eugene Sh. Oct 17 '16 at 21:22
  • Assuming you mean what @grek40 wrote: Strictly speaking, the newly allocated value is not the same value as the formerly freed one, even if the bit patterns match. So, the cited part of the appendix does not apply. – too honest for this site Oct 17 '16 at 21:27
  • Re the recent question edit "make the program do strange things just because", I would not use any compiler that enforces bad behaviour to teach a lesson to the naive user. Undefined behaviour is not a direction to the compiler. It is lack of direction. – Weather Vane Oct 17 '16 at 21:36
  • ... taking your preposterism ad absurdum you would have a compiler that instead of warning about array bounds, would inject malicious code to deliberately set fire to your cat, as they say. – Weather Vane Oct 17 '16 at 21:48
  • @Eugene Sh. tough crowd tonight. – chux - Reinstate Monica Oct 17 '16 at 21:57
  • @WeatherVane, of course this example is absurd because there's not much to gain from checking that it's the first time that malloc returned a pointer. However, I'm sure that you know that C compilers are notorious for [aggressively pretending that UB can't happen to optimize programs](https://lwn.net/Articles/342330/) (which, of course, is legal, because the program behaves strangely only if the prior assumption has been violated). – zneak Oct 17 '16 at 22:22

2 Answers


Does that mean that they all introduce UB into C programs?

Reuse of a previously freed memory block by malloc does not introduce UB by itself.

Usage of a pointer that refers to space deallocated

Using the old pointer (and thus assuming the old data or structure is still behind it) is UB.

example:

char * p1 = malloc(...);
free(p1);
double * p2 = malloc(...);
// let's assume p2 == p1 (i.e. the malloc implementation reused the address)
// here you can use p2 without any problems or UB
// BUT using p1 here is UB
free(p2);
// using either p1 or p2 here is UB
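
The "let's assume p2 == p1" step can be checked without ever reading p1 after the free, by saving the address as an integer beforehand. This is only a sketch: the function name and the `sizeof(double)` allocation size are assumptions made for illustration, `uintptr_t` is an optional type, and treating equal integers as "same address" is an assumption that holds on mainstream flat-address implementations rather than something the standard promises.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the second allocation landed at the
   same address as the first, 0 if it did not, -1 if malloc failed. */
int second_allocation_reuses_address(void)
{
    char *p1 = malloc(sizeof(double));
    if (p1 == NULL)
        return -1;

    uintptr_t a1 = (uintptr_t)(void *)p1; /* taken while p1 is valid */
    free(p1);
    p1 = NULL; /* storing a new value is fine; reading the old one is not */

    double *p2 = malloc(sizeof(double));
    if (p2 == NULL)
        return -1;

    *p2 = 1.0; /* p2 is a fresh, valid pointer: no UB here */

    /* Compare integers, never the stale pointer value itself. */
    int reused = ((uintptr_t)(void *)p2 == a1);

    free(p2);
    return reused;
}
```

Whether the result is 0 or 1 depends entirely on the allocator, and the program is well defined either way.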
Iłya Bursov

The funny part is that every modern malloc implementation out there will happily reuse allocations that have been previously freed. Does that mean that they all introduce UB into C programs?

No.

The issue isn't whether malloc returns the same value in different invocations; the issue is whether the pointer value is valid when it is used in an expression.

You can assign a new value (valid or otherwise) to an existing pointer object without invoking UB. What invokes UB is attempting to use an invalid pointer value in an expression, whether it's to dereference the pointer, or to do pointer arithmetic, etc.

T *ptr = malloc( sizeof *ptr * N );
...
free( ptr );

At this point, the value contained in ptr is invalid (that pointer value is no longer associated with an object in your program). Using the value of ptr at this point (either by dereferencing it, or by doing pointer arithmetic with it, etc.) invokes UB. However, if you do

ptr = malloc( sizeof *ptr * N );

again, and get the same pointer value the second time, then the value is valid again - that pointer value has been re-associated with an object in your program.

If the value returned by malloc/calloc/realloc isn't NULL, then it's a valid pointer value and using it will not invoke UB, regardless of the bookkeeping behind it.
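
A minimal sketch of that sequence, assuming `T` is `int` and using a made-up function name: the same pointer object is freed, overwritten, and then used again once it holds a fresh result from malloc.

```c
#include <stdlib.h>

/* Hypothetical example: returns 0 + 1 + ... + (n - 1), computed in a buffer
   obtained after the same pointer object was freed and reassigned.
   Returns -1 if an allocation fails. Intended for n > 0. */
long sum_after_reallocation(size_t n)
{
    int *ptr = malloc(sizeof *ptr * n);
    if (ptr == NULL)
        return -1;

    free(ptr);
    ptr = NULL; /* optional: replaces the invalid value with a harmless one */

    ptr = malloc(sizeof *ptr * n); /* new valid value, whatever its address */
    if (ptr == NULL)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        ptr[i] = (int)i; /* using the new value is fine */
        sum += ptr[i];
    }

    free(ptr);
    return sum;
}
```

Calling `sum_after_reallocation(5)` yields 10 whether or not the allocator hands back the earlier address.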

John Bode
  • The value isn't "valid again". There's a new, valid value that happens to have the same representation as the old value. It would still be UB to use the value of a variable that contained the old value, even though the representations are identical. – M.M Sep 06 '18 at 03:30