
I was having a discussion about whether using variables with indeterminate values leads to unspecified behavior rather than undefined behavior, as discussed here. This assumes that a variable with automatic storage duration has its address taken and that trap representations do not apply.

The specific case discussed was what happens to ptr after free(ptr), where C17 6.2.4 applies:

The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

I made this example:

#include <stdlib.h>
#include <stdio.h>

int main (void)
{
  int* ptr = malloc(sizeof *ptr);
  int* garbage;
  int*volatile* dummy = &garbage; // take the address
  free(ptr);

  puts("This should always print"); 
  fflush(stdout);

  if(ptr == garbage)  
  {
    puts("Didn't see that one coming.");
  }
  else
  {
    puts("I expect this to happen");
  }

  puts("This should always print");
}

The argument I was making was that, in theory, we can't know whether ptr == garbage is true or false, since both are indeterminate at that point. The compiler therefore need not even read those memory locations: since it can deduce that both pointers hold indeterminate values, it is free to evaluate the expression to either true or false as it pleases during optimization. (In practice most compilers probably don't do that.)

I tried the code on the x86_64 compilers gcc, icx and clang 14 with -std=c17 -pedantic-errors -Wall -Wextra -O3. In all cases I got the output:
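
For contrast, here is a minimal sketch of a variant in which neither pointer is indeterminate at the point of comparison, so the result is fully determined. The helper names `free_and_clear` and `pointers_equal_after_free` are hypothetical and not part of the original example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the allocation and overwrites the caller's
   pointer, so the caller never holds an indeterminate pointer value. */
void free_and_clear(int **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is a no-op, so a failed malloc is fine */
    *pp = NULL;     /* replace the now-indeterminate value */
}

/* Returns 1 if the two pointers compare equal, 0 otherwise. */
int pointers_equal_after_free(void)
{
    int *ptr = malloc(sizeof *ptr);
    int *other = NULL;          /* initialized, so never indeterminate */

    free_and_clear(&ptr);
    return ptr == other;        /* compares two null pointers */
}
```

Here the comparison only ever sees two null pointers, so the compiler has no indeterminate value to reason about.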

This should always print
I expect this to happen
This should always print

However, in clang 15 specifically, I get:

This should always print
This should always print

This is followed by exit code 139 (segmentation fault).

https://godbolt.org/z/E6xTzc156

If I comment out the "This should always print"/fflush lines, clang 15 produces a dummy executable whose disassembly consists only of a label:

main:       # @main

This happens even though main() contains several side effects.


Question:

Why does clang 15 behave differently from older versions and other compilers? Does it implement trap representations for pointers on the x86_64 target I was using, or something similar?

Assuming there are no trap representations, none of this code should contain undefined behavior.


EDIT
Regarding how indeterminate values that are not trap representations should be expected to (not) behave, this has been discussed at length in DR 260 and DR 451. The Committee wouldn't be having these long and detailed discussions if the whole thing was to be dismissed as "it is undefined behavior".

Lundin
  • I reduced this question to a smaller one that I believe is (one of) the key point to deciding if this has UB or not: , I believe. – alx - recommends codidact Apr 02 '23 at 15:31

1 Answer


Why is clang doing this? Clang treats the code as unreachable because it considers it undefined behavior. We can turn the unreachable point into an explicit trap with -mllvm -trap-unreachable, and if we try that with your example, clang indeed generates a ud2 for us.
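
To illustrate what "unreachable" means to the optimizer, here is a minimal sketch using `__builtin_unreachable()`, which is a GCC/Clang extension rather than standard C. The function name `classify` is hypothetical. In this sketch the marked point really can never be reached, so the program stays well-defined. It only shows that the compiler is allowed to emit no code at all for such a path, and that a trap option like the one above asks for a trap instruction there instead.

```c
#include <assert.h>

/* Hypothetical example function. Every int is either >= 0 or < 0,
   so control can never arrive at the builtin below. */
int classify(int x)
{
    if (x >= 0)
        return 1;
    if (x < 0)
        return -1;

    /* GCC/Clang extension: promises the optimizer this point is never
       reached. Actually reaching it would be undefined behavior. */
    __builtin_unreachable();
}
```

Calling `classify(5)` returns 1 and `classify(-3)` returns -1; no call can reach the builtin.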

This is part of a larger discussion within the clang community, part of which you can see in the discussion of Signed integer overflow causes program to skip the epilogue and fall into another function. That discussion covers the same issue for the signed overflow case, and at the bottom there is a linked discussion about infinite loops without forward progress.

I sympathize with your frustration. WG14 has discussed the issue of indeterminate values, and there have been discussions about softening the impact with ideas such as "wobbly values". The recent C++ proposal Zero-initialize objects of automatic storage duration has this to say:

The WG14 C Standards Committee has had extensive discussions about "wobbly values" and "wobbly bits", specifically around [DR451] and [N1793], summarized in [Seacord].

The C Standards Committee has not reached a conclusion for C23, and wobbly bits continue to wobble indeterminately.

So while this has been discussed many times, there is not yet consensus. The article referenced in that quote, Uninitialized Reads: Understanding the proposed revisions to the C language, says:

According to the current WG14 Convener, David Keaton, reading an indeterminate value of any storage duration is implicit undefined behavior in C, and the description in Annex J.2 (which is non-normative) is incomplete. This revised definition of the undefined behavior might be stated as "The value of an object is read while it is indeterminate."

Unfortunately, there is no consensus in the committee or broader community concerning uninitialized reads.

So while there are a variety of ideas in this area, none of them has reached a conclusion yet.

There are folks working to improve the situation but we are not there yet. There is also continued discussion within the compiler community about how aggressive we should be with various undefined behavior but again no conclusion there either.
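
Until that is settled, one way to stay clear of the debate for automatic objects is to initialize them explicitly, so that no read ever sees an indeterminate value. This is a minimal sketch of that habit, not something taken from the proposals quoted above; the type `struct sample` and the function `sample_is_cleared` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type holding an integer and a pointer. */
struct sample {
    int id;
    int *link;
};

/* Returns 1 if an explicitly initialized automatic object reads back
   as zero and a null pointer. */
int sample_is_cleared(void)
{
    /* {0} initializes id to 0; the remaining member is initialized
       as if it had static storage duration, so link is a null pointer. */
    struct sample s = {0};

    return s.id == 0 && s.link == NULL;
}
```

Because every member of `s` has a determinate value, the reads in the return statement are well-defined under any reading of the standard.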

Shafik Yaghmour
  • The discussion you linked is more ambivalent since you have volatile pointers to functions and then one may argue back and forth regarding if `p1()` is accessing a volatile object or not. Until C23 then pedantically only accessing volatile objects counts as a side effect, whereas an lvalue access through a volatile-qualified pointer might not always count as one. There's a DR for this which was fixed in C23. – Lundin Feb 28 '23 at 09:34
  • That whole thing doesn't have a lot to do with my example though, other than the `int*volatile* dummy` which didn't improve the situation, it has to be `int**volatile` as someone pointed out in comments, or clang won't treat the access as a side effect. – Lundin Feb 28 '23 at 09:34
  • Regarding the specific issue and "wobbly values", fact remains that C99 relaxed the definition of UB not to contain reads of indeterminate values unless they are trap representations. This is not ambiguous in the standard. Also, passing indeterminate values to library functions is explicitly UB as per some C99 TC. What's ambiguous is how such values should be treated by the compiler in terms of what it can expect. But "lay down to die" is not an acceptable result for something which is _unspecified_ behavior, not UB. – Lundin Feb 28 '23 at 09:53
  • And finally, since every other compiler in the market including clang <15 produces code as expected by the programmer, clang 15 can just ignore the whole standard debate and decide to be a quality implementation instead of Death Station 15.0. Because surely no programmer _expects_ the compiler to generate half an executable. How is that ever going to help a programmer in any situation imaginable? Is the compiler not for programmers? – Lundin Feb 28 '23 at 10:03