Background
Discussions of the mostly undefined or implementation-defined nature of type-punning via a union typically quote the following passages, here via @ecatmur ( https://stackoverflow.com/a/31557852/2757035 ), on an exemption for standard-layout structs that have a "common initial sequence" of member types:
C11 (6.5.2.3 Structure and union members; Semantics):
[...] if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
C++03 ([class.mem]/16):
If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
Other versions of the two standards have similar language; since C++11 the terminology used is standard-layout rather than POD.
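To make the quoted rule concrete, here is a minimal C sketch of a union whose two struct members share a common initial sequence. The names (IntNode, DoubleNode, Node, node_type, make_double_node) are hypothetical and exist only for illustration; they do not come from either standard or from the linked answers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: both structs start with an int tag, so that single
   member is their common initial sequence. */
struct IntNode    { int type; int    value; };
struct DoubleNode { int type; double value; };

/* The completed union type is declared before any code that inspects it. */
union Node {
    struct IntNode    i;
    struct DoubleNode d;
};

/* Reads the tag through member `i`, whichever member was stored last. */
int node_type(const union Node *n) {
    assert(n != NULL);
    return n->i.type;
}

/* Stores through member `d`; the tag value 2 is an arbitrary choice. */
union Node make_double_node(double v) {
    union Node n;
    n.d.type = 2;
    n.d.value = v;
    return n;
}
```

Calling node_type on the result of make_double_node reads the tag through i after it was written through d, which is the kind of inspection the quoted passages describe.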
Since no reinterpretation is required, this isn't really type-punning, just name substitution applied to union member accesses. A proposal for C++17 (the infamous P0137R1) makes this explicit using language like 'the access is as if the other struct member was nominated'.
But please note the phrase "anywhere that a declaration of the completed type of the union is visible": this clause exists in C11 but nowhere in the C++ drafts for 2003, 2011, or 2014 (all nearly identical, except that later versions replace "POD" with the new term "standard-layout"). In any case, the 'visible declaration of union type' bit is totally absent from the corresponding section of any C++ standard.
@loop and @Mints97, here - https://stackoverflow.com/a/28528989/2757035 - show that this line was also absent in C89, first appearing in C99 and remaining in C since then (though, again, never filtering through to C++).
Standards discussions around this
[snipped - see my answer]
Questions
From this, then, my questions were:
What does this mean? What is classed as a 'visible declaration'? Was this clause intended to narrow, or to widen, the range of contexts in which such 'punning' has defined behaviour?
Are we to assume that this omission in C++ is very deliberate?
What is the reason for C++ differing from C? Did C++ just 'inherit' this from C89 and then either decide - or worse, forget - to update alongside C99?
If the difference is intentional, then what benefits or drawbacks are there to the 2 different treatments in C vs C++?
What, if any, interesting ramifications does it have at compile- or runtime? For example, @ecatmur, in a comment replying to my pointing this out on his original answer (link as above), speculated as follows.
I'd imagine it permits more aggressive optimization; C can assume that function arguments S* s and T* t do not alias even if they share a common initial sequence as long as no union { S; T; } is in view, while C++ can make that assumption only at link time. Might be worth asking a separate question about that difference.
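To picture the situation that comment describes, here is a small C sketch modelled on the example that C99 and C11 themselves give in 6.5.2.3, which they describe as not a valid fragment because the union type is not visible within the function. The names (S, T, ST, negate_if_negative) are hypothetical, and whether any compiler actually exploits the clause is exactly what is being asked, so nothing here should be read as a statement about real compiler behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structs sharing a common initial sequence of one int. */
struct S { int m; };
struct T { int m; };

/* At this point no union containing both S and T has been declared, so
   under the C wording the common-initial-sequence permission would not
   apply inside this function. */
int negate_if_negative(struct S *s, struct T *t) {
    assert(s != NULL && t != NULL);
    if (s->m < 0)
        t->m = -t->m;
    /* A compiler assuming that s and t do not alias could return the
       value it already read for the comparison above. */
    return s->m;
}

/* The union only becomes visible after the function definition. */
union ST {
    struct S s;
    struct T t;
};
```

With two distinct objects the call is unproblematic. The interesting case is passing the addresses of the s and t members of a single union ST object, where the result would depend on whether the compiler makes the no-alias assumption.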
Well, here I am, asking! I'm very interested in any thoughts about this, especially: other relevant parts of either Standard, quotes from committee members or other esteemed commentators, insights from developers who might have noticed a practical difference due to this (assuming any compiler even bothers to enforce C's added clause), and so on. The aim is to generate a useful catalogue of relevant facts about this C clause and its (intentional or not) omission from C++. So, let's go!