3

This is a strict aliasing question: will the compiler cause any optimization-order problems with this?

Say that I have a struct XMFLOAT3 with three public floats (not unlike this one), and I want to cast it to a float*. Will this land me in optimization trouble?

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f};
auto bar = &foo.x;

bar[2] += 5.0f;
foo.z += 5.0f;
cout << foo.z;

I assume this will always print "13". But what about this code:

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f};
auto bar = reinterpret_cast<float*>(&foo);

bar[2] += 5.0f;
foo.z += 5.0f;
cout << foo.z;

I believe this is legal because, according to http://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing:

T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.

Is my understanding of this correct?

Obviously this will become implementation dependent on the declaration of XMFLOAT3.

Jonathan Mee
  • What makes you think this code violates strict aliasing rules? – Lightness Races in Orbit Mar 26 '15 at 12:04
  • This is hopefully the last question helping me clear up aliasing from the series: http://stackoverflow.com/q/29121176/2642059 and http://stackoverflow.com/q/28697626/2642059 – Jonathan Mee Mar 26 '15 at 12:05
  • @LightnessRacesinOrbit I believe that it does not violate them. I would like confirmation. – Jonathan Mee Mar 26 '15 at 12:06
  • Isn't the problem that the struct may contain padding? It would be a devious move from the compiler to put in actual padding, but the optimizer may work on the assumption that such padding is undetectable by correct code. Also, `(&foo.x)[2]` looks like a plain out-of-bounds array access, which is obvious to the compiler. – MSalters Mar 26 '15 at 12:16
  • @MSalters `(&foo.x)[2]` should access `foo.z`, which is not an out-of-bounds access. As far as padding... I dunno, I'd love to learn more though if this is going to be a problem. Could you post a link or something? – Jonathan Mee Mar 26 '15 at 12:25
  • @MSalters, yes, there could be padding, although in theory and in practice padding would only be added for alignment purposes, and three adjacent float members would be aligned just the same as the three elements of a `float[3]`, so it would be devious indeed. The `(&foo.x)[2]` is equivalent to `*(&foo.x + 2)` and 3.9.2/3 makes that well-formed as long as there really is a `float` at that address, which comes back to padding and alignment again. – Jonathan Wakely Mar 26 '15 at 12:27
  • @JonathanMee: `z` is not a member of `x`. The bounds of `foo.x` are as if `x` is a `float[1]`. That means `(&foo.x)[index]` is an out-of-bounds access for `index > 0`. Now on a typical x86 with a typical compiler, you indeed expect it to access `z` but that's not guaranteed. – MSalters Mar 26 '15 at 12:29
  • @JonathanWakely: In theory a compiler may add padding for any reason whatsoever. There is no restriction in the Standard. – MSalters Mar 26 '15 at 12:40
  • @MSalters It seems that the `static_assert` in [Jonathan Wakely](http://stackoverflow.com/users/981959/jonathan-wakely)'s answer could be used to protect against hare-brained compilers. (No offense to rodents intended.) – Jonathan Mee Mar 26 '15 at 12:48
  • @MSalters, [class.mem]/13 is a normative statement about adding padding between adjacent data members, which says it might be done for alignment, and we know that objects of the same type can be located adjacent to each other without padding because that is required for arrays. If two adjacent floats needed padding for alignment that padding would be included in `sizeof(float)` already (by [expr.sizeof]/2). In practice platform ABIs provide stronger, more explicit guarantees. – Jonathan Wakely Mar 26 '15 at 13:00
  • @JonathanWakely: "It might be done for alignment" is not an exhaustive list. If the intent was to allow padding only for alignment, the statement would have been roughly "There shall be no initial padding. There shall be no padding anywhere else except to satisfy the alignment requirement of the member directly following such padding". – MSalters Mar 26 '15 at 13:08
  • @BaummitAugen This is definitely not a duplicate as I'm asking if I can use array indexing from the first element. Please reopen. – Jonathan Mee Aug 27 '17 at 15:33
  • @JonathanMee Added a dupe for that index from first member thing. – Baum mit Augen Aug 27 '17 at 15:54
  • @BaummitAugen Yeah, the first dupe asks if it's legal to cast a `struct` to a pointer to its first member. Note that in the question I provide a citation saying that in this case that is legal. But yeah, this is a duplicate of the second question, thanks. – Jonathan Mee Aug 27 '17 at 18:59

3 Answers

5

The reinterpret_cast from XMFLOAT3* to float* is OK, due to:

9.2 [class.mem] paragraph 20:

If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member. Otherwise, its address is the same as the address of its first base class subobject (if any). [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]

That means the address of the first member is the address of the struct, and there's no aliasing involved when you access *bar because you're accessing a float through an lvalue of type float, which is fine.
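The first-member guarantee quoted above can be checked directly. This is only a minimal sketch: `Float3` is a hypothetical stand-in for XMFLOAT3 (assumed to be three public floats and nothing else), and `cast_points_at_first_member` is an illustrative helper, not part of any library.

```cpp
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for XMFLOAT3: three public floats, nothing else.
struct Float3 {
    float x;
    float y;
    float z;
};

// The first-member guarantee only applies to standard-layout classes,
// so fail the build if the stand-in ever stops being one.
static_assert(std::is_standard_layout<Float3>::value,
              "Float3 must be a standard-layout class");

// Illustrative helper: true when the cast pointer and &v.x hold the same address.
inline bool cast_points_at_first_member(Float3& v) {
    float* bar = reinterpret_cast<float*>(&v);
    return bar == &v.x;
}
```

Note that this only covers `*bar` (that is, `bar[0]`); whether `bar[2]` is acceptable is a separate question about layout.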

But the cast is also unnecessary; it's equivalent to the first version:

auto bar = &foo.x;

The expression bar[2] is only OK if there is no padding between the members of the struct, or more precisely, if the layout of the data members is the same as an array float[3], in which case 3.9.2 [basic.compound] paragraph 3 says it is OK:

A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer (4.10). If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.

In practice there is no reason that three adjacent non-static data members of the same type would not be laid out identically to an array (and I think the Itanium ABI guarantees it), but to be safe you could add:

 static_assert(sizeof(XMFLOAT3)==sizeof(float[3]),
     "XMFLOAT3 layout must be compatible with float[3]");

Or, to be paranoid, or if there are additional members after z:

 static_assert(offsetof(XMFLOAT3, y)==sizeof(float)
               && offsetof(XMFLOAT3, z)==sizeof(float)*2,
     "XMFLOAT3 layout must be compatible with float[3]");

Obviously this will become implementation dependent on the declaration of XMFLOAT3.

Yes, it relies on it being a standard-layout class type, and on the order and type of its data members.
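For reference, the layout checks suggested in this answer need `<cstddef>` for `offsetof` and a complete type to compile. A self-contained sketch, assuming a hypothetical stand-in `Float3` in place of the real XMFLOAT3 (the helper name `float3_matches_array` is also just illustrative):

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for XMFLOAT3: three public floats, nothing else.
struct Float3 {
    float x;
    float y;
    float z;
};

// offsetof is only unconditionally supported for standard-layout classes.
static_assert(std::is_standard_layout<Float3>::value,
              "Float3 must be a standard-layout class");

// Same size as float[3]: no padding between or after the members.
static_assert(sizeof(Float3) == sizeof(float[3]),
              "Float3 layout must be compatible with float[3]");

// Stricter per-member check; still meaningful if members are added after z.
static_assert(offsetof(Float3, y) == sizeof(float)
              && offsetof(Float3, z) == sizeof(float) * 2,
              "Float3 layout must be compatible with float[3]");

// The same conditions as a value, for use where a bool is more convenient.
constexpr bool float3_matches_array() {
    return sizeof(Float3) == sizeof(float[3])
        && offsetof(Float3, y) == sizeof(float)
        && offsetof(Float3, z) == sizeof(float) * 2;
}
```

If any assertion fails, the build stops, so code that depends on the layout never runs on a platform where the assumption does not hold.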

Jonathan Wakely
  • Excellent suggestion on the `static_assert`. I appreciate the heads up. – Jonathan Mee Mar 26 '15 at 12:45
  • On Meta, the OP [expressed concern](https://meta.stackoverflow.com/q/387008/1709587) that this answer would get less visibility than the answers on the similar questions this question has been closed as a duplicate of, despite being (in his opinion) far superior to the answers on the duplicates. With that in mind, you might wish to post a similar answer on one or both of the dupes. – Mark Amery May 05 '21 at 20:34
0

It's completely valid; this has nothing to do with strict aliasing whatsoever.

Strict aliasing rules require that pointers aliasing each other have compatible types;
clearly, float* is compatible with float*.

Lightness Races in Orbit
-1

Consider a reasonably smart compiler:

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f}; 
auto bar = &foo.x;

bar[2] += 5.0f;
foo.z += 5.0f; // Since no previous expression referenced .z, I know .z==8.0
cout << foo.z; // So optimize this to a hardcoded cout << 8.0f

Replacing variable accesses and operations by known results is a common optimization. Here the optimizer sees three uses of .z: the initial assignment, the increment and the final use. It can trivially determine the values at these three points, and substitute those.

Because struct members cannot overlap (unlike unions), bar, which is derived from .x, cannot overlap .z, so bar[2] cannot affect .z.

As you see, a perfectly normal optimizer can produce the "wrong" result.
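If the out-of-bounds argument above is a concern, one way to sidestep the dispute is to avoid pointer arithmetic from `&foo.x` altogether. The sketch below is not taken from the question or from any of the answers: `Float3` is a hypothetical stand-in for XMFLOAT3, and `component` is an illustrative helper that selects a member through a table of pointers to members instead of indexing past `x`.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Hypothetical stand-in for XMFLOAT3: three public floats, nothing else.
struct Float3 {
    float x;
    float y;
    float z;
};

// Illustrative helper: returns a reference to component i (0 = x, 1 = y, 2 = z).
// Each access names a real member, so no pointer ever steps beyond `x`.
inline float& component(Float3& v, std::size_t i) {
    static constexpr float Float3::* members[] = {
        &Float3::x, &Float3::y, &Float3::z
    };
    assert(i < 3 && "component index out of range");
    return v.*members[i];
}
```

With this helper, `component(foo, 2) += 5.0f;` modifies `foo.z` through an ordinary member access, so the optimizer reasoning described in this answer has nothing to exploit.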

MSalters
  • What you're describing is [Strict Aliasing](http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html) That's what the question is about, but I believe that the compiler is not allowed to reorder `bar[2] += 5.0f;` after the `cout` because of the quote that I have in my question. – Jonathan Mee Mar 26 '15 at 12:44
  • No, what I'm describing is out-of-bounds array access. Strict aliasing is when `union { int x; float z; } foo` lets `(&foo.x)[0]` and `foo.z` overlap. This no longer is an out-of-bounds access, but it now becomes a strict aliasing violation. For strict aliasing to occur, you first need a **valid** expression to refer to memory holding an object of another incompatible type. `bar[2]` simply is not valid. – MSalters Mar 26 '15 at 13:00
  • @JonathanMee: And as for the "reordering", I was supposing that the compiler optimized out the entire assignment to `bar[2]` because no further code depends on the value of `bar` or `x`. – MSalters Mar 26 '15 at 13:01
  • Why doesn't 3.9.2 [basic.compound] paragraph 3 apply to `bar[2]`? The note seems pretty relevant: _"[ Note: For instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address. ...]"_ Are you suggesting it's OK to go out-of-bounds by one element but not two? – Jonathan Wakely Mar 26 '15 at 13:05
  • @JonathanWakely: Forming an address directly after an object is allowed. But you may not read from or write to that address, that would still be an out-of-bounds access. +2 is right out. – MSalters Mar 26 '15 at 14:00
  • Nope, 3.9.2 says nothing about reading or writing. It says `&foo.x+2` can point to another object of the same type that happens to be at that location. So if there is no padding (which can be proven with a static assertion) and `foo.z` is at that address, then `*(&foo.x+2)` accesses it, and so `bar[2]` is fine. 3.9.2 does not talk about forming an address that can't be dereferenced, it says the pointer **points to** an object **regardless of how the value was obtained.** – Jonathan Wakely Mar 26 '15 at 14:09
  • See also http://open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#73 and [expr.eq]/2. If `&foo.z == &foo.x+2` then the pointers are equal, and both point to the same object, and it is wrong to say you can dereference one and not the other. Unlike relational operators there is no restriction on equality comparisons for pointers to objects that are not subobjects of the same object, and in this case they _are_ subobjects of the same object anyway. – Jonathan Wakely Mar 26 '15 at 14:16
  • @JonathanWakely: You've already entered UB territory by `&foo.z == &foo.x+2` (although as noted +1 is OK, because `&foo.x+1` is a pointer directly after x) – MSalters Mar 26 '15 at 14:54
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/73860/discussion-between-jonathan-wakely-and-msalters). – Jonathan Wakely Mar 26 '15 at 14:55