3

I believe 6.5p7 in the C standard defines the so-called strict aliasing rule as follows.

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  1. a type compatible with the effective type of the object,
  2. a qualified version of a type compatible with the effective type of the object,
  3. a type that is the signed or unsigned type corresponding to the effective type of the object,
  4. a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  5. an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  6. a character type.

Here's a simple example that shows GCC's optimization based on its assumption to the rule.

int IF(int *i, float *f) {
    *i = -1;
    *f = 0;
    return *i;
}

IF:
        mov     DWORD PTR [rdi], -1
        mov     eax, -1
        mov     DWORD PTR [rsi], 0x00000000
        ret

The load for return *i is omitted assuming that int and float cannot alias.

Then let's consider case 6, where it says an object could be accessed by a character type lvalue expression (char *).

int IC(int *i, char *c) {
    *i = -1;
    *c = 0;
    return *i;
}

IC:
        mov     DWORD PTR [rdi], -1
        mov     BYTE PTR [rsi], 0
        mov     eax, DWORD PTR [rdi]
        ret

Now there is a load for return *i because i and c could overlap according to the rules, and *c = 0 could change what's in *i.

Then can we also modify a char through an int *? Should the compiler care that such thing might happen?

char CI(char *c, int *i) {
    *c = -1;
    *i = 0;
    return *c;
}

CI: #GCC
        mov     BYTE PTR [rdi], -1
        mov     DWORD PTR [rsi], 0
        movzx   eax, BYTE PTR [rdi]
        ret

CI: #Clang
        mov     byte ptr [rdi], -1
        mov     dword ptr [rsi], 0
        mov     al, byte ptr [rdi]
        ret

Looking at the assembly output, both GCC and Clang seem to think a char can be modified by access through int *.

Maybe it's obvious that A and B overlapping means A overlaps B and B overlaps A. However, I found this detailed answer which emphasizes in boldface that,

Note that may_alias, like the char* aliasing rule, only goes one way: it is not guaranteed to be safe to use int32_t* to read a __m256. It might not even be safe to use float* to read a __m256. Just like it's not safe to do char buf[1024]; int *p = (int*)buf;.

Now I got really confused. The answer is also about GCC vector types, which has an may_alias attribute so it can alias similarly as a char.

At least, in the following example, GCC seems to think overlapping access can happen in both ways.

int IV(int *i, __m128i *v) {
    *i = -1;
    *v = _mm_setzero_si128();
    return *i;
}

__m128i VI(int *i, __m128i *v) {
    *v = _mm_set1_epi32(-1);
    *i = 0;
    return *v;
}

IV:
        pxor    xmm0, xmm0
        mov     DWORD PTR [rdi], -1
        movaps  XMMWORD PTR [rsi], xmm0
        mov     eax, DWORD PTR [rdi]
        ret
VI:
        pcmpeqd xmm0, xmm0
        movaps  XMMWORD PTR [rsi], xmm0
        mov     DWORD PTR [rdi], 0
        movdqa  xmm0, XMMWORD PTR [rsi]
        ret

https://godbolt.org/z/ab5EMx3bb

But am I missing something? Is strict aliasing one-way?


Additionally, after reading the current answers and comments, I thought maybe this code is not allowed by the standard.

typedef struct {int i;} S;
S s;
int *p = (int *)&s;
*p = 1;

Note that (int *)&s is different from &s.i. My current interpretation is that an object of type S is being accessed by an lvalue expression of type int, and this case is not listed in 6.5p7.

xiver77
  • 2,162
  • 1
  • 2
  • 12
  • The rule is definitely one-way, there are real-life examples of compilers breaking code that points an `int *` into an actual `__m256i` *object*, like [GCC AVX \_m256i cast to int array leads to wrong values](https://stackoverflow.com/q/71364764). But you're using a `__m128i *` *pointer* to point at memory that's allowed to be a different underlying type. Note that what you quoted from my answer gave a `char buf[1024]` example, a char-array object, no `char*` involved. (Accessing it may involve `char*` due to how `buff[i]` works as `*(buff+i)`, so that may be safer, unlike __m128i) – Peter Cordes May 25 '22 at 17:39
  • I'll update my linked answers to include that real-world breakage example. – Peter Cordes May 25 '22 at 17:44
  • @PeterCordes Similarly to the example in your link that breaks, would `struct {int i;} s = {1}; *(int *)&s = 0;` also possibly break? I know `s` and `s.i` must be in the same memory location if it is in memory, but the rules say it's only possible to access an `int` through `struct {int i;} *`, and not the other way. – xiver77 May 25 '22 at 18:26
  • Real compilers definitely allow that, and I think even ISO C allows you to derive a pointer to a member from a pointer to the whole aggregate, as long as you get the math right (the correct offset). So `*(int *)&s = 0;` is definitely fine. The `int` member of a struct is an `int` object, so that's allowed by rule (1) in your quote. – Peter Cordes May 25 '22 at 18:33
  • The other way is more interesting, `int i = 1;` / `*(struct s*)&i = {0};` I think that's what (5) is allowing, so unless there's a separate problem in the pointer casting, that may work. It may also work to point a `struct{int i[2];};` at something declared as `int arr[2]`, but that feels even weirder. – Peter Cordes May 25 '22 at 18:36
  • @PeterCordes The `int` member is an `int` object, but that's different from a `struct {int i;}` object. By `*(int *)&s = 0;`, you are accessing a `struct {int i;}` through `int *`. Not sure if that's okay. – xiver77 May 25 '22 at 18:40
  • The `int i` struct member *is* an `int` object. That's what makes it safe to pass `&s.i` to things that want an `int*`, and why that doesn't need any casting. The struct object and int member fully overlap in this case. – Peter Cordes May 25 '22 at 18:43
  • @PeterCordes `*(int *)&s = 0;` is the same as `typedef struct {int i;} S; S s; int *p = (int *)&s; *p = 0`. See that `(int *)&s` is different from `&s.i`. It's kind of an artificial example, but I want a clear understanding of the rules. – xiver77 May 25 '22 at 18:47
  • @PeterCordes My current interpretation is that by `int *p = (int *)&s; *p = 0;`, an object of type `S` is being accessed by an lvalue expression of type `int`. – xiver77 May 25 '22 at 18:50
  • Like I said, if there's any problem in deriving an `int*` by casting a `struct*`, it's not strict-aliasing. C defines enough about how addresses and memory works that there definitely is an `int` object in there somewhere, at some address between `&s` and `((char*)&s) + sizeof(s) - sizeof(int)`. If your implementation doesn't put padding before the first `int` member, then it's correct. (I think padding might be allowed, but on implementations that choose not to do that, I'm pretty sure everything is well-defined behaviour even in pure ISO C.) – Peter Cordes May 25 '22 at 18:52
  • There are rules about deriving pointers to sub-objects (e.g. to struct or array member), as opposed to subtracting the address of two different objects and then adding that to one of them. Deriving a pointer to a sub-object is allowed, that's why [C `offsetof` is a thing](https://en.cppreference.com/w/c/types/offsetof) and is implementable and usable. – Peter Cordes May 25 '22 at 18:57
  • @PeterCordes The address being the same is not the only issue because if so, `__m256i v = ...; int *p = (int *)&v;` wouldn't be any problem because `&v` is clearly *assigned* to `p`. But as you know the compiler lets garbage to be loaded. – xiver77 May 25 '22 at 19:00
  • 1
    `__m256i v` doesn't have an `int` member sub-object at that address, so point (1) of the strict-aliasing rule doesn't apply. Of course you have to respect strict-aliasing as well as other rules for deriving pointers. – Peter Cordes May 25 '22 at 19:01
  • @PeterCordes I see. Although I still cannot make some clear logical sense by the sentences of the standard, I think I get what the standard committee's intention is, and how GCC has implemented. – xiver77 May 25 '22 at 19:15
  • @PeterCordes Just to clarify, does the `may_alias` attribute for vector types imply that a vector object can *exist* in any contiguous array of any type? In the way that there are 4 `char`'s existing in a 32-bit `int` but an `int` doesn't exist in an array of 4 `char`s? I think this is the point you were trying to explain? – xiver77 May 25 '22 at 19:25
  • I don't think it's useful to think of an `int arr[256]` as also being composed of `char` or `__m256i` objects. Just that you can use pointers of that type to *access* the bytes of other objects. (The "object-representation"). Including a `struct { char c; short s[7];}` including padding. The mental model you suggest could I guess work. It gets hairy when you consider a `__attribute__((aligned(1),may_alias))` type (like you might use as an alternative to `memcpy` to do an unaligned aliasing-safe load or store of `uint32_t` to any offset of a char array). So there are overlapping objects... – Peter Cordes May 25 '22 at 19:31
  • @PeterCordes In case of `__attribute__((aligned(1), may_alias)) uint32_t` (`u32`), there'd be a `u32` in `0:3`, `1:4`, `2:5`, and so on. Anyway in my personal opinion, strict aliasing doesn't help much unless someone decides to write Java in C (GTK?). But such codebases won't be performance critical, and performance critical cases like OS or SIMD optimized computation often half-ignore aliasing. – xiver77 May 25 '22 at 19:46
  • @PeterCordes Could you have a look at [these examples](https://godbolt.org/z/4TEbq6E7P)? I made some short examples where `__m128i v; (int *)&v` and `(long *)&v` breaks while `(long long *)&v` works. So, if you're trying to access a vector of `long long` objects by `int *`, that's clearly not allowed. However, it seems GCC treats `char` very specially that `char ca[16]; (int *)&ca;` doesn't break in a similar use case, even with unaligned access. See the last two functions. – xiver77 May 26 '22 at 05:04
  • Undefined behaviour doesn't mean "guaranteed to break". It can easily happen to work; that doesn't prove anything. (Although the `long long*` case is interesting, and maybe isn't a coincidence that a type matching the vector works as expected, including doing all loads first before either call. It might be interesting to test that with `typedef short v8si __attribute__((vector_size(16)));` and see if `short*` is the only type that works as expected with it. – Peter Cordes May 26 '22 at 05:37
  • 1
    @PeterCordes: Under the abstraction model used in Dennis Ritchie's language, every region of addressable storage simultaneously contains, throughout its lifetime, objects of every type that could fit therein, given their size and alignment constraints, but when N1570 6.5p6 and 6.5p7 use the term "object" they must mean something else, but it's not clear what. – supercat May 26 '22 at 22:24

3 Answers3

3

Yes it's only one way, but from the context of the function it can't tell from which side.

Given this:

char CI(char *c, int *i) {
    *c = -1;
    *i = 0;
    return *c;
}

It could have been called like this:

int a;
char *p = ((char *)&a) + 1;
char b = CI(p,&a);

Which is a valid use of aliasing. So from inside of the function, *i = 0 is correctly setting a in the calling function, and *c = -1 is correctly setting one byte inside of a.

dbush
  • 205,898
  • 23
  • 218
  • 273
  • If you don't mind, please have a look at the added part of my question. – xiver77 May 25 '22 at 18:56
  • The Standard was written at a time when it would have been essentially impossible for compilers not to offer such semantics, and thus there was no need to mandate them. Today, however, compilers go out of their way to use whole-program optimization to avoid performing any reloads not mandated by the Standard. – supercat Aug 06 '22 at 22:44
2

You can take a pointer to any object, cast it to a char* and use that to access the bit patterns underlying said object. You can also cast char* gotten this way back to it's original type.

So when the compiler sees int *i and char *p it can not exclude the possibility that p was created by casting from i. So they may point to the same raw memory. Changing one may change the other. There it goes both ways. But that is not what the text is about.

What this is about is casting from A* to char* and then to B*. The object pointed to doesn't magically become a B and accessing it through a B* is undefined behavior. Maybe one-way is the wrong word. I don't know what to name this better. But for every object there is a train with only 2 stops: A* and char* (unsigned char*, signed char*, const char*, ... and all it's variants). You can go back and forth as many times as you like but you can never change tracks and go to B*.

Does that help?

The may_alias attribute sets up another such rail system. Allowing the alias between int[4] and __m128i* because that is exactly the overlapping the compiler needs for the vectorization. But that's something you have to look up in the compilers specs.

Goswin von Brederlow
  • 11,875
  • 2
  • 24
  • 42
  • This question is also asking about `__m128i` which is defined as `typedef long long __m128i __attribute__((may_alias,vector_size(16)))`. So that's another type in addition to `char*` you can cast through. And IMO the key point it's missing (which I was making in the parts of my answers it quoted) as about having an actual declared `__m128i vec` *object*, and pointing other pointer types into it. Not just dereferencing pointers. If the object is anonymous and only existed via pointer derefs, then pointing an `int*` at it is safe if that's the only type other than `char*` and `__m128i*`. – Peter Cordes May 25 '22 at 18:04
  • You can point an `int*` at it but you can't do anything with it. There can't be any aliasing because you can't access what the `int*` points to. – Goswin von Brederlow May 25 '22 at 18:13
  • I meant create and dereference a pointer, in limited comment space. I should have been more specific since your answer is focusing on casting to one pointer type and then back. I was mostly looking at the part of the question that misinterpreted a quote from one of my answers. – Peter Cordes May 25 '22 at 18:19
  • And that casting back is the crucial point. An `int*` and a `__m128i*` can never alias since you can't legally get from one to the other. Both can go to `char*` but no further. Each can only go back to it's original type. So if you ever get `int*` and `__m128i*` to point at the same address that you already violated the standard and have UB and we don't care what happens then. – Goswin von Brederlow May 25 '22 at 18:24
  • Your answer definitely helped me understanding what's going on, but I'm still not clear about cases such as `struct {int i;} s = {1}; *(int *)&s = 0;`. I know `s` and `s.i` must be in the same memory location if it is in memory, but the rules say it's only possible to access an `int` through `struct {int i;} *`, and not the other way, so is the example possibly broken? – xiver77 May 25 '22 at 18:30
  • @GoswinvonBrederlow: As I said in my first comment, `__m128i` is a "may_alias" type; it's explicitly allowed to point a `__m128i*` at an `int arr[4]` (or a `short arr[8]`, or a struct or anything else). This required behaviour is part of Intel's intrinsics API; any compiler providing `__m128i*` must support that somehow. The `may_alias` attribute is how GNU C compatible compilers do it; MSVC allows all aliasing. See [Is \`reinterpret\_cast\`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior?](https://stackoverflow.com/q/52112605) – Peter Cordes May 25 '22 at 18:39
  • @xiver77 I would always write `&s.i` instead of `(int *)&s` to get the int because that makes it clear what I'm pointing at. It's easier to read and if I ever re-arange the members of the struct the `&s.i` will still work. But `(int *)&s` seems to be perfectly legal and the compiler knows that `&s` and `&s.i` are the same address and allows this. `(float*)&s` on the other hand says: *Warning: dereferencing type-punned pointer will break strict-aliasing rules*. But no idea where that is codyfied in the standard. – Goswin von Brederlow May 25 '22 at 18:39
  • @GoswinvonBrederlow: The same thing goes for all `__m...` vector types, including pointing float vectors at integer data. – Peter Cordes May 25 '22 at 18:42
  • @PeterCordes sorry, yes you are right. Bad choice of type on my part. I've extended my answer. – Goswin von Brederlow May 25 '22 at 18:50
  • @GoswinvonBrederlow "casting from `A*` to `char*`" --> to `unsigned char *` has the advantage that there is never any trap, nor padding and implementation defined behavior is out of range assignments. – chux - Reinstate Monica May 25 '22 at 20:32
  • @chux-ReinstateMonica All the char pointer variations work, `std::byte` too and is the recommended way I believe. – Goswin von Brederlow May 25 '22 at 21:03
  • @GoswinvonBrederlow Note post is tagged `[c]` so `std::byte` doe snot well apply here. All the char pointer variations do not work, less common ones have trouble, hence the suggestion. – chux - Reinstate Monica May 25 '22 at 22:09
  • @chux-ReinstateMonica: Have there ever been any non-two's-complement systems where loading a signed character would be faster than loading an unsigned one? If any such systems existed, people familiar with the architecture would be better qualified than the Committee to judge the pros and cons of having a `char` type with fewer distinct values than `unsigned char`, and if none ever existed any time spent by the Committee trying to decide how one should behave would be wasted. The question shouldn't matter for anyone whose code would never run on such a system, however. – supercat May 26 '22 at 17:57
  • @supercat The problem with `char` is that it can be signed or unsigned. Nothing to do with two's-complement. – Goswin von Brederlow May 26 '22 at 19:05
  • @GoswinvonBrederlow: The Standard would not forbid a ones'-complement implementation from having `char` default to signed, but unless any implementations actually did so, or there was some plausible reason to expect that a future implementation might do so, any effort spent dealing with the possibility of such implementations would be wasted. Note that on ones'-complement systems, conversion of a short signed type to unsigned and back would yield "surprising" behaviors no matter how an implementation chose to process it. – supercat May 26 '22 at 19:21
  • @supercat We already have system where `char` is signed and systems where is is unsigned. All with two's-complement. And any arithmetic with `char` will first get promoted to `int` and then you get values -128 to 127 or 0 to 255 depending on the architecture. That is the problem. That a one's-complement system with signed char would have -127 to -0 to 0 to 127 doesn't add anything. – Goswin von Brederlow May 26 '22 at 19:36
  • @GoswinvonBrederlow: On a two's-complement system, whether `char` is signed or unsigned, converting a `char` value to either `int` or `unsigned int`, performing a bitwise operation with a signed or unsigned value in the range 0..CHAR_MAX, and then converting it back to `char` will yield the same value whether whether some or all of the values involved were unsigned. On a one's-complement system, that would hold if `char` is unsigned, but not if it's signed. If `char` is signed, bitwise operators will behave weirdly. – supercat May 26 '22 at 19:59
  • @supercat Except it will not be in that range because char can be signed or unsigned. – Goswin von Brederlow May 26 '22 at 20:05
  • @GoswinvonBrederlow: The `char` type can only be signed on a particular platform if somebody has written a compiler for that platform where `char` is signed. Have any such compilers ever existed for ones'-complement systems, or is anyone ever going to write one? If not, then `char` can't possibly be signed on such systems. – supercat May 26 '22 at 20:14
  • gcc is such a compiler and clang. They even have command line arguments to override the platforms native signedness of char to something else if you don't like the native one. As said depending on the platform char is signed or unsigned. It's a real things that happens with two's-complement while one's-complement is so dead c++ is dropping it and stdint.h doesn't support it on C either. – Goswin von Brederlow May 26 '22 at 20:26
  • @GoswinvonBrederlow: Have gcc and clang ever supported ones'-complement platforms? – supercat May 26 '22 at 20:31
  • @supercat: GCC documents that signed integer types use 2's complement, as its choice for that C implementation-defined behaviour. (https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation). I assume this goes back pretty far; the git/svn history for that file goes back to 2004 (https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d592f1c398da37535d3c528d1a511a32d8b9b3ba), when the text was moved from extend.texi. Following that back, it said the same thing in 2002, but GCC was new in the 1980s. Still, I wouldn't be surprised if GCC has only ever had 2's complement. – Peter Cordes Aug 06 '22 at 00:19
  • @PeterCordes: That returns then to my original question: has anyone ever written a compiler for any ones'-complement system where `char` was a signed type? Not all combinations of things that would be allowable under the Standard make sense, and the Standard generally doesn't bother to forbid nonsensical combinations of things that could make sense individually. The fact that C17 and before allow `char` to be signed, and allow ones'-complement, does not imply that they should be read as demanding that portable code accommodate a ones'-complement dialects where `char` is signed. – supercat Aug 06 '22 at 15:40
  • @PeterCordes: I'm somewhat curious about what advantage there is to having C recognize specific numeric representations, versus specifying that implementations must specify whether or not their numeric representations have certain traits, such as all-bits-zero representing zero, zero always being represented by all-bits zero, positive and negative ranges being symmetric or off-by-one symmetric, having power-of-two mod wrapping, etc. I can imagine designs where it would be advantageous to specify the range of `long long` as -0x7FFFFFFEFFFFFFF to 0x7FFFFFFFFFFFFFFF... – supercat Aug 06 '22 at 15:53
  • ...so as to allow all values whose upper word is 0x80000000 to be treated as a NaN (signed overflows would yield such values, and operations using such values would yield such values as results). If a 32-bit processor had arithmetic instructions with such semantics, that would make it very easy for such an implementation to specify overflow behavior in a way that would make it easy to ensure that no undetected overflows could produce seemingly-valid results. Requiring that long long be symmetric, however, rather than merely requiring a warning macro if it isn't, would preclude such a design. – supercat Aug 06 '22 at 15:56
  • @supercat Why you could build a processor following that logic it would be far from optimal. There would be no benefit for it to offset the extra cost involved. You are reserving a huge chunk of the range of long long with only one meaning: invalid. You can do that with far fewer reserved bit patterns. – Goswin von Brederlow Aug 06 '22 at 20:50
  • @GoswinvonBrederlow: For a 32-bit processor, there would be one reserved bit pattern: 0x80000000, and having separate instructions for normal and sticky-NaN arithmetic would essentially as useful as having overflow-trapping and non-trapping instructions, but much cheaper. Having the behavior of the lower word of addition, subtraction, or multiplication depend upon the value of the upper bit would make all such operations much slower than having only the upper word processed in "special" fashion. – supercat Aug 06 '22 at 22:41
  • @supercat -0x7FFFFFF**E**FFFFFFF leaves a much larger reserved space. Or is that a typo? It sounds like you want one's-complement or sign+magnitude with the exception that `-0` stands for NaN. It's probably irrelevant in modern CPU monsters but it does add a lot of gates to adders. A lot more than connecting the overflow bit coming from the adder to the trap bit with an AND gate. It's just in comparison for multiplication or, even more so, division units that the extra cost becomes relatively small. It's on par with saturating add/sub and some CPUs do have that. – Goswin von Brederlow Aug 07 '22 at 17:30
  • @GoswinvonBrederlow: For applications that would require that integer overflow not occur silently, the logic required to have an integer overflow, or any computation with an operand of 0x80000000, force a result of 0x80000000 would be comparable to the cost of saturating addition, and less than the cost required to perform an in-order trap while offering more useful semantics (e.g. a control system which is supposed to compute a value several ways using different inputs could treat computations from inputs that overflow as invalid, using other computations, more easily than if overflows trap. – supercat Aug 07 '22 at 20:46
  • @supercat An overflow trap is really the simplest thing to generate in hardware. It's horrible when you want to catch it in software and for example turn it info the next carry for a long addition. But generating it is way way cheaper (in gate count) than saturated addition or the sticky 0x80000000. – Goswin von Brederlow Aug 07 '22 at 20:52
  • @GoswinvonBrederlow: Overflow traps are simple in a sequential-execution machine. They get much more complicated if one adds in out-of-order execution, and more complicated still if one adds in speculative execution. Forcing the output of an operation to 0x80000000 if an overflow occurs or either input is 0x80000000 could be done in a manner completely agnostic to execution order. – supercat Aug 08 '22 at 14:59
  • @GoswinvonBrederlow: I see no reason that having overflow force the bit pattern to 0x80000000 would be any harder than saturating arithmetic--something that some platforms implementations already support, since there's only one bit pattern to force rather than two, and having all operations involving an 0x80000000 operand treated as overflowing would simply require detecting that pattern in parallel with performing the rest of the arithmetic. There's a chicken-and-egg problem between language and hardware semantics, but having an integer NaN concept analogous to floating-point NaN would... – supercat Aug 08 '22 at 15:30
  • ...offer essentially the same advantages. Requiring that every integer NaN have the same bit pattern would mean that computations involving two-word types would need to be performed in full before writing either word of the result, rather than allowing e..g someLongLong++ to be performed by incrementing the bottom word, testing for zero, and writing it, and then only bothering to even load the top word if a carry was generated from the bottom word. – supercat Aug 08 '22 at 15:37
  • @supercat The difference between saturating math and NaN is that saturating math isn't sticky. `0x40000000 + 0x40000000 - 0x40000000` gives `0x40000000` with saturating math but should give `NaN`. You have to check both inputs for `NaN` and hen force the output. In gate counts that's roughly 3 times the cost. You see the same problem with your someLongLong example. Suddenly you have to check all words of the someLongLong before doing ++. For big integer or arbitrary precision math you probably want to use a control header containing the sign bit, a NaN bit, size and whatever and a magnitude. – Goswin von Brederlow Aug 09 '22 at 06:58
  • @GoswinvonBrederlow: If any value whose upper word is 0x80000000 is a NaN, then there is no need to check all the words before doing a ++, since performing an increment on values 0x80000000:00000000 through 0x80000000:FFFFFFFE without looking at the upper word would naturally yield results where the high word was 0x80000000. Incrementing 0x80000000:FFFFFFFF would cause the bottom half to be reset to zero while the upper half was "sticky-NaN" incremented yielding 0x80000000. – supercat Aug 09 '22 at 15:19
2

To understand how the "Strict Aliasing Rule" applies in any particular situation, one must define two concepts which are referenced in N1570 6.5p7 but not actually defined within the Standard:

  1. For purposes of N1570 6.5p7, under what circumstances is a region of storage considered to contain an object of any particular type? In particular for your use case, what does it mean for something to be 'copied as an array of character type'?

  2. What does it mean for an object to be accessed "by" an lvalue of a particular type?

There has never been a consensus as to how those concepts should be specified, thus making it impossible to for anyone to know the rules "mean"(*). The Standard seems to be intended to unambiguously support scenarios where a region of storage is created via malloc() or other such means, then written exclusively using character types, and then accessed via one other type, or those in which storage is written exclusively using one non-character type and then read exclusively via character types, but other scenarios are a bit murkier.

More significantly, while clang and gcc support those scenarios using character types, the sets of scenarios accommodated by clang and gcc omit some corner cases where the Standard is unambiguous, but which don't fit the abstraction model used by clang and gcc. Regardless of what the rules say, programmers should expect that the -fstrict-aliasing dialects of clang and gcc do not accommodate the possibility that storage which has ever been accessed via any non-character type might be accessed by any other within its lifetime, even if storage is always read using the last type with which it was written.

(*) In fairness to the authors of the Standard, a construct like:

unsigned test(float *fp) { return *(unsigned*)fp; }

would be equally usable on an implementation that ignores the possibility that the access via the pointer might affect something of type float but is agnostic as to how the pointer's target storage might be used outside the function, or on an implementation that does more detailed flow analysis but notices that the pointer value being dereferenced is derived from a float*. Unfortunately, if the Standard were to recognize that quality implementations should answer the second question at least as broadly as the first, that might be seen as implying that the authors of clang and gcc have been demanding the right to produce poor quality implementations.

supercat
  • 77,689
  • 9
  • 166
  • 211
  • "Under what circumstances is a region of storage considered to contain an object of any particular type?" After looking at GCC's assembly output for several cases of aliasing, I think GCC considers that an object of a particular type exists at a certain location accessible through a valid pointer, after an assignment *by* an lvalue expression of that type (`A`), and this can be overwritten by another assignment by an lvalue expression of a different type (`B`), after which one cannot safely assume that an object of type `A` still exist in the same area. – xiver77 May 26 '22 at 08:02
  • @xiver77: So far as I can tell, both clang and gcc operate under a model where an action that can be shown to cause a region of storage to hold a bit pattern it has held at some earlier moment in time may cause the Effective Type of the storage to revert to the earlier type, *even if the bit pattern was written by a different type*. – supercat May 26 '22 at 14:35