Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

Question

For example, can this

unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}

cause unpredictable results on a platform where,

unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.

I know about strict aliasing rules, but most examples showing problematic cases of violating strict aliasing are like the following.

static int g(int *i, float *f) {
    *i = 1;
    *f = 0;
    return *i;
}

int h() {
    int n;
    return g(&n, (float *)&n);
}

In my understanding, the compiler is free to assume that i and f are implicitly restrict. The return value of h could be 1 if the compiler thinks *f = 0; is redundant (because i and f can't alias), or it could be 0 if it takes into account that the values of i and f are the same. This is undefined behaviour, so technically, anything else can happen.

However, the first example is a bit different.

unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}

Sorry for my unclear wording, but everything is done "in-place". I can't think of any other way the compiler might interpret the line unsigned u = *(unsigned *)&x;, other than "copy the bits of x to u".

In practice, all compilers for various architectures I tested in https://godbolt.org/ with full optimization produce the same result for the first example, and varying results (either 0 or 1) for the second example.

I know it's technically possible that unsigned and float have different sizes and alignment requirements, or should be stored in different memory segments. In that case even the first code won't make sense. But on most modern platforms where the following holds, is the first example still undefined behaviour (can it produce unpredictable results)?

unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.

In real code, I do write

#include <string.h>

unsigned f(float x) {
    unsigned u;
    memcpy(&u, &x, sizeof(x));
    return u;
}

The compiled result is the same as using pointer casting, after optimization. This question is about interpretation of the standard about strict aliasing rules for code such as the first example.

You write to x (copy) using type float, you read x using type unsigned int => aliasing violation. Problems are less likely for this case, but it is still illegal. — Marc Glisse, Apr 03 '22 at 11:01
duplicates: [If this is undefined behavior then why is it given as a seemingly legitimate example?](https://stackoverflow.com/q/64562736/995714), [Why I get a "type-punned" warning even when using a `char *`?](https://stackoverflow.com/q/42084824/995714), [Why is type punning considered UB?](https://stackoverflow.com/q/63422076/995714), [Unions and type-punning](https://stackoverflow.com/q/25664848/995714). Different types may lie in different domains (float vs integer) or even different memory spaces so it's forbidden. The only valid way in C is to use a `union` — phuclv, Apr 03 '22 at 11:11
The standard doesn't specify different rules for different aliasing cases, so the above examples all have aliasing and have UB — phuclv, Apr 03 '22 at 11:15
Undefined behaviour is something specified by the Standard. The fact that it (always) produces the same/desired result in a specific case, on a specific platform, does not make it any less undefined (by the Standard). — Adrian Mole, Apr 03 '22 at 11:29
Undefined Behavior does not mean unpredictable result. It is just not defined by the standard. — the busybee, Apr 03 '22 at 11:39
@AdrianMole: The question is not predicated upon the behavior in a specific case. It asks whether there is any case in a practical compiler where the aliasing would not be performed as the nominal meaning of the code indicates, in spite of the standard’s aliasing rules. The title needs to be reworded, but the actual question is not about whether the behavior is defined by the C standard or not. The question is about whether rational compiler design would lead to a failure of the nominal aliasing. — Eric Postpischil, Apr 03 '22 at 11:55
` This question is about interpretation of the standard` What is unclear to you? — KamilCuk, Apr 03 '22 at 12:07
Your godbolt link forgot to link the actual code and compilers you tried. https://godbolt.org/z/rvj3d4e4o shows GCC4.1, from before GCC went out of its way to support "obvious" cases like this, to sometimes not break people's buggy code using this compact but UB idiom. It loads garbage from stack memory, unless you use `-fno-strict-aliasing` to make it `movd` to that location first. It also warns `warning: dereferencing type-punned pointer will break strict-aliasing rules`. — Peter Cordes, Apr 04 '22 at 03:31
@PeterCordes That's kind of surprising to me. I personally think the strict aliasing rules are the most counter-intuitive in a low-level language like C, especially on architectures like x86 where any data can be stored anywhere in (accessible) memory. Are there actual cases where strict aliasing helps better code generation, in which the same cannot be achieved with `restrict`? — xiver77, Apr 04 '22 at 03:58
@EricPostpischil: Is the question about a rational compiler design which makes a bona fide effort to avoid breaking useful programs, or about free compilers like gcc? — supercat, Apr 05 '22 at 15:05

KamilCuk · Accepted Answer · 2022-04-03T12:39:06.420

Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?

Yes.

The rule is https://port70.net/~nsz/c/c11/n1570.html#6.5p7 :

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

a type compatible with the effective type of the object,

a qualified version of a type compatible with the effective type of the object,

a type that is the signed or unsigned type corresponding to the effective type of the object,

a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

a character type.

The effective type of the object x is float - it is defined with that type.

unsigned is not compatible with float,
unsigned is not a qualified version of float,
unsigned is not a signed or unsigned type of float,
unsigned is not a signed or unsigned type corresponding to qualified version of float,
unsigned is not an aggregate or union type
and unsigned is not a character type.

The "shall" is violated, so the behavior is undefined (see https://port70.net/~nsz/c/c11/n1570.html#4p2 ). There is no other interpretation.

We also have https://port70.net/~nsz/c/c11/n1570.html#J.2 :

The behavior is undefined in the following circumstances:

An object has its stored value accessed other than by an lvalue of an allowable type (6.5).
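For contrast, here is a minimal sketch of two ways to get at the same bits that do not run into 6.5p7. The helper names `bits_via_memcpy` and `bits_via_union` are made up for illustration, and both assume that `unsigned` and `float` have the same size, as in the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helpers; both assume sizeof(unsigned) == sizeof(float). */

/* memcpy copies the object representation byte by byte, so no lvalue of
   type unsigned is ever used to access the float object. */
unsigned bits_via_memcpy(float x) {
    unsigned u;
    assert(sizeof u == sizeof x);
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Reading a union member other than the one last written reinterprets
   the stored bytes (C99 and later). */
unsigned bits_via_union(float x) {
    union { float f; unsigned u; } pun;
    pun.f = x;
    return pun.u;
}
```

Both functions should return the same value for any input on such a platform, and optimizing compilers typically reduce either one to a single register move.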

edited Apr 03 '22 at 12:39

answered Apr 03 '22 at 12:11


Thanks for the clarification. The standard is a logical and concise piece of document, but it's sometimes too concise that it is hard to find where the information is, or often the important supplementary sentences are scattered many pages away. – xiver77 Apr 03 '22 at 12:30
`logical and concise` Och, you are overrating it. – KamilCuk Apr 03 '22 at 12:39
@KamilCuk: What the Standard "isn't" is *complete*. Many constructs are left undefined because implementations were expected to process them consistently, with or without a mandate, in the absence of any compelling reason to do otherwise, but the authors of the Standard didn't want to try to guess what reasons implementations might have for deviating from the commonplace behavior. – supercat Apr 04 '22 at 18:23
Is there anything in the Standard that would require a compiler, given `struct foo {uint32_t x[10];} s;` to regard the lvalue expression `s.x[4]` as having any association with type `struct foo`, but would not require a compiler given `*(uint32_t*)floatPtr` to recognize it as being associated with type `uint32_t`? The former expression is *defined* as meaning `(*((s.x)+(4)))`, which applies the `*` operator to a pointer of type `uint32_t*`, yielding an lvalue of type `uint32_t` which is not among the types that may be used to access a `struct foo`. A compiler that isn't deliberately blind... – supercat Apr 04 '22 at 18:48
...would of course notice that the pointer was freshly formed in a manner that involved type `struct foo`, but a compiler that wasn't being deliberately blind should be equally capable of recognizing that `*(uint32_t*)floatPtr` dereferences a pointer that's freshly formed in a manner involving type `float`. – supercat Apr 04 '22 at 18:51

Peter Cordes · Answer 2 · 2022-04-04T04:14:09.087

As Kamil explains, it's UB. Even int and long (or long and long long) aren't alias-compatible even when they're the same size. (But interestingly, unsigned int is compatible with int)
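The signed/unsigned exception can be sketched as follows. The helper name `read_int_as_unsigned` is hypothetical; the point is only that this particular cast is one of the accesses 6.5p7 allows, whereas the same function written with `long` in place of `unsigned` would not be.

```c
/* Hypothetical helper: reads an int object through an unsigned lvalue.
   This is permitted because unsigned is the unsigned type corresponding
   to int, the effective type of the object. */
unsigned read_int_as_unsigned(const int *p) {
    return *(const unsigned *)p;
}
```

For non-negative values the result equals the original value; for negative values it depends on the representation of signed integers.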

It has nothing to do with being the same size, or with using the same register set as suggested in a comment; it's mainly a way to let compilers assume that different pointers don't point to overlapping memory when optimizing. They still have to support C99 union type-punning, not just memcpy. So for example a dst[i] = src[i] loop doesn't need to check for possible overlap when unrolling or vectorizing, if dst and src have different types.¹

If you're accessing the same integer data, the standard requires that you use the exact same type, modulo only things like signed vs. unsigned and const. Or that you use (unsigned) char*, which is like GNU C __attribute__((may_alias)).
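A rough sketch of both escape hatches mentioned here. The names `float_bytes`, `unsigned_alias` and `float_bits_gnu` are invented for this example; the second half relies on the GNU C `may_alias` type attribute, so it is guarded and only applies to GCC-compatible compilers.

```c
#include <stddef.h>

/* Hypothetical helper: copies the bytes of a float into out[] through
   unsigned char, which may be used to access an object of any type. */
void float_bytes(const float *x, unsigned char out[sizeof(float)]) {
    const unsigned char *p = (const unsigned char *)x;
    for (size_t i = 0; i < sizeof(float); i++)
        out[i] = p[i];
}

#if defined(__GNUC__)
/* GNU C extension, not ISO C: accesses through this typedef are treated
   like character-type accesses by type-based alias analysis. */
typedef unsigned __attribute__((__may_alias__)) unsigned_alias;

/* Assumes unsigned and float have the same size and alignment. */
unsigned float_bits_gnu(const float *x) {
    return *(const unsigned_alias *)x;
}
#endif
```

The first function is portable ISO C; the second is only meaningful where the compiler documents the attribute.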

The other part of your question seems to be why it appears to work in practice, despite the UB.
Your godbolt link forgot to link the actual compilers you tried.

https://godbolt.org/z/rvj3d4e4o shows GCC4.1, from before GCC went out of its way to support "obvious" local compile-time-visible cases like this, to sometimes not break people's buggy code using non-portable idioms like this. It loads garbage from stack memory, unless you use -fno-strict-aliasing to make it movd to that location first. (Store/reload instead of movd %xmm0, %eax is a missed-optimization bug that's been fixed in later GCC versions for most cases.)

f:     # GCC4.1 -O3
        movl    -4(%rsp), %eax
        ret

f:    # GCC4.1 -O3 -fno-strict-aliasing
        movss   %xmm0, -4(%rsp)
        movl    -4(%rsp), %eax
        ret

Even that old GCC version warns "warning: dereferencing type-punned pointer will break strict-aliasing rules", which should make it obvious that GCC notices this and does not consider it well-defined. Later GCC versions that do choose to support this code still warn.

It's debatable whether it's better to sometimes work in simple cases, but break other times, vs. always failing. But given that GCC -Wall does still warn about it, that's probably a good tradeoff, convenient for people dealing with legacy code or porting from MSVC. Another option would be to always break it unless people use -fno-strict-aliasing, which they should if dealing with codebases that depend on this behaviour.

Being UB doesn't mean required-to-fail

Just the opposite; it would take tons of extra work to actually trap on every signed overflow in the C abstract machine, for example, especially when optimizing stuff like 2 + c - 3 into c - 1. That's what gcc -fsanitize=undefined tries to do, adding x86 jo instructions after additions (except it still does constant-propagation so it's just adding -1, not detecting temporary overflow on INT_MAX. https://godbolt.org/z/WM9jGT3ac). And it seems strict-aliasing is not one of the kinds of UB it tries to detect at run time.
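Since nothing traps by default, code that has to stay out of signed-overflow UB needs to check before it adds. A minimal sketch in portable C, with the hypothetical helper name `checked_add`:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true, or returns
   false without performing the addition if the result would not fit. */
bool checked_add(int a, int b, int *sum) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* would overflow: never evaluate a + b */
    *sum = a + b;
    return true;
}
```

The comparisons are arranged so that neither `INT_MAX - b` nor `INT_MIN - b` can itself overflow.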

See also the clang blog article: What Every C Programmer Should Know About Undefined Behavior

An implementation is free to define behaviour the ISO C standard leaves undefined

For example, MSVC always defines this aliasing behaviour, like GCC/clang/ICC do with -fno-strict-aliasing. Of course, that doesn't change the fact that pure ISO C leaves it undefined.

It just means that on those specific C implementations, the code is guaranteed to work the way you want, rather than happening to do so by chance or by de-facto compiler behaviour if it's simple enough for modern GCC to recognize and do the more "friendly" thing.

Just like gcc -fwrapv for signed-integer overflows.

Footnote 1: example of strict-aliasing helping code-gen

#define QUALIFIER // restrict

void convert(float *QUALIFIER pf, const int *pi) {
    for(int i=0 ; i<10240 ; i++){
        pf[i] = pi[i];
    }
}

Godbolt shows that with the -O3 defaults for GCC11.2 for x86-64, we get just a SIMD loop with movdqu / cvtdq2ps / movups and loop overhead. With -O3 -fno-strict-aliasing, we get two versions of the loop, and an overlap check to see if we can run the scalar or the SIMD version.

Are there actual cases where strict aliasing helps better code generation, in which the same cannot be achieved with restrict?

You might well have a pointer that might point into either of two int arrays, but definitely not at any float variable, so you can't use restrict on it. Strict-aliasing will let the compiler still avoid spill/reload of float objects around stores through the pointer, even if the float objects are global vars or otherwise aren't provably local to the function. (Escape analysis.)

Or a struct node * that definitely isn't the same type as the payload in a tree.

Also, most code doesn't use restrict all over the place. It could get quite cumbersome. Not just in loops, but in every function that deals with pointers to structs. And if you get it wrong and promise something that's not true, your code's broken.
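To make the first case concrete, here is a small sketch with invented names (`gain`, `scale_into`). The two `int` pointers are allowed to overlap, so `restrict` would be a false promise, yet type-based alias analysis still tells the compiler that stores through `dst` cannot change the global `float`.

```c
/* Hypothetical global: visible outside this function, so the compiler
   cannot rule out pointers to it by escape analysis alone. */
float gain = 1.5f;

/* dst and src may point into the same int array (in-place use),
   so neither can honestly be declared restrict. */
void scale_into(int *dst, const int *src, int n) {
    for (int i = 0; i < n; i++) {
        /* Under strict aliasing, an int store cannot modify 'gain', so the
           compiler may load 'gain' once instead of on every iteration. */
        dst[i] = (int)(src[i] * gain);
    }
}
```

With -fno-strict-aliasing, a compiler would have to assume that `dst[i]` might overlap `gain` and reload it after each store.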

The other answer answered my original question, but thanks for explaining the reason behind the rules. I can see now that it can save some load/stores in complex code where there are a lot of different `structs` all over. — xiver77, Apr 04 '22 at 04:24
Not only are a 64-bit `long` and 64-bit `long long` not alias-compatible, but in the dialect processed by gcc, the existence of code which would *if executed* access a `long` using a `long long*` is sufficient to cause nonsensical behavior *even if the code in question never actually executes*. — supercat, Apr 05 '22 at 15:26
Most code could and should use `restrict` if the Standard were to recognize it as a transitive directed relation rather than a weird context-sensitive equivalence relation in which the value of `p` used in a statement like `if (flag) {*p = 3;}` may not be "based upon" the value of `p` in the enclosing block. — supercat, Apr 05 '22 at 16:02

supercat · Answer 3 · 2022-04-06T15:05:48.667

The Standard was never intended to fully, accurately, and unambiguously partition programs that have defined behavior and those that don't(*), but instead relies upon compiler writers to exercise a certain amount of common sense.

(*) If it was intended for that purpose, it fails miserably, as evidenced by the amount of confusion stemming from it.

Consider the following two code snippets:

#include <stdint.h>

/* Assume suitable declarations of u are available everywhere */
union test { uint32_t ww[4]; float ff[4]; } u;

/* Snippet #1 */
uint32_t proc1(int i, int j)
{
  u.ww[i] = 1;
  u.ff[j] = 2.0f;
  return u.ww[i];
}

/* Snippet #2, part 1, in one compilation unit */
uint32_t proc2a(uint32_t *p1, float *p2)
{
  *p1 = 1;
  *p2 = 2.0f;
  return *p1;
}

/* Snippet #2, part 2, in another compilation unit */
uint32_t proc2(int i, int j)
{
  return proc2a(u.ww+i, u.ff+j);
}

It is clear that the authors of the Standard intended that the first version of the code be processed meaningfully on platforms where that would make sense, but it's also clear that at least some of the authors of C99 and later versions did not intend to require that the second version be processed likewise (some of the authors of C89 may have intended that the "strict aliasing rule" only apply to situations where a directly named object would be accessed via pointer of another type, as shown in the example given in the published Rationale; nothing in the Rationale suggests a desire to apply it more broadly).

On the other hand, the Standard defines the [] operator in such a fashion that proc1 is semantically equivalent to:

uint32_t proc3(int i, int j)
{
  *(u.ww+i) = 1;
  *(u.ff+j) = 2.0f;
  return *(u.ww+i);
}

and there's nothing in the Standard that would imply that proc() shouldn't have the same semantics. What gcc and clang seem to do is special-case the [] operator as having a different meaning from pointer dereferencing, but nothing in the Standard makes such a distinction. The only way to consistently interpret the Standard is to recognize that the form with [] falls in the category of actions which the Standard doesn't require that implementations process meaningfully, but relies upon them to handle anyway.
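As a rough illustration of what "processed meaningfully" means for the `[]` form, the sketch below repeats the union and `proc1` so that it stands alone. It assumes `float` is a 32-bit IEEE 754 type, in which case `2.0f` has the bit pattern 0x40000000.

```c
#include <stdint.h>

/* Same union and function as Snippet #1, repeated to keep this
   sketch self-contained. */
union test { uint32_t ww[4]; float ff[4]; } u;

uint32_t proc1(int i, int j) {
    u.ww[i] = 1;
    u.ff[j] = 2.0f;   /* overwrites u.ww[i] when i == j */
    return u.ww[i];
}
```

When `i` and `j` differ, the call returns 1; when they are equal, a compiler that honours the union access returns the bits of `2.0f` instead of the stale 1.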

Constructs such as your example of using a directly-cast pointer to access storage associated with an object of the original pointer's type fall in a similar category of constructs which at least some authors of the Standard likely expected (and would have demanded, if they didn't expect) that compilers would handle reliably, with or without a mandate, since there was no imaginable reason why a quality compiler would do otherwise. Since then, however, clang and gcc have evolved to defy such expectations. Even if clang and gcc would normally generate useful machine code for a function, they seek to perform aggressive inter-procedural optimizations that make it impossible to predict what constructs will be 100% reliable. Unlike some compilers which refrain from applying potential optimizing transforms unless they can prove that they are sound, clang and gcc seek to perform transforms that can't be proven to affect program behavior.

Why are you assigning `2.0f` to part of `uint16_t ff[4]`? I was expecting that union member to be a float array. Also, wouldn't that stop `u.ff+j` from being a valid `float*` arg for `proc2a`? So I'm guessing you started with two sizes of int, then changed to float but forgot the union. — Peter Cordes, Apr 05 '22 at 22:47
@PeterCordes: I'd been dithering between using `uint16_t[4]` and `uint32_t[4]` (to allow manipulating upper or lower half of a word separately), or between using `uint32_t[4]` and `float[4]`, but I sorta must have left a mix-and-match mess. — supercat, Apr 06 '22 at 15:04
@PeterCordes: Though accessing `float` values as `uint16_t` could still be somewhat useful on a few platforms like the 68000, if one wants to e.g. extract just the exponent part of a `float`. More important than the particulars of the example, however, is the principle that if the "strict aliasing rules" were omitted from the Standard, the behavior would be defined on platforms that define the values of all possible bit patterns, and in cases where a behavior would generally be defined but for optimization purposes it might be useful to allow corner-case behaviors to differ... — supercat, Apr 06 '22 at 15:11
...the Standard is intended to treat most questions of what corner cases to support as a "quality of implementation" issue outside its jurisdiction, but the authors of C89 unfortunately went out of their way to avoid any suggestion that some implementations might be "better" than others. — supercat, Apr 06 '22 at 15:12
