union A{
  int a;
  int b;
};
int main(){
  A u = {.a = 0};
  int r = u.b; // #1 Is this UB?
}

[class.union] says

In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended ([basic.life]). At most one of the non-static data members of an object of union type can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time.

In this example, only A::a is active. [basic.life] p7 then says

The program has undefined behavior if:

  • the glvalue is used to access the object, or

#1 tries to access an object whose lifetime has not begun. Does this access cause UB? If so, is this requirement too restrictive?

BTW, does the C standard impose the same requirement in this example? Or does C have a looser requirement in this case?

Update

In the C standard, I found a note that says

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called type punning). This might be a non-value representation.

This means it is permitted in C. However, [diff.iso] (https://eel.is/c++draft/diff.iso), which says

Subclause [diff.iso] lists the differences between C++ and ISO C, in addition to those listed above, by the chapters of this document.

does not point out the difference.

xmh0511
  • [cppref](https://en.cppreference.com/w/cpp/language/union) has this note: "If two union members are standard-layout types, it's well-defined to examine their common subsequence on any compiler." Though I didn't bother to check where to find the corresponding section in the standard – 463035818_is_not_an_ai May 24 '23 at 12:28
  • @463035818_is_not_a_number No, they are not the same thing. The formal wording is [[class.mem.general] p26](https://eel.is/c++draft/class.mem#general-26). The precondition is that both variants be of struct type. Moreover, the concept "common initial sequence" is defined in terms of "standard-layout struct". – xmh0511 May 24 '23 at 12:32
  • OK. Seems like cppref is not quite accurate then, or the note is supposed to refer to the previous section, which is about union members of class type – 463035818_is_not_an_ai May 24 '23 at 12:36
  • C allows type punning through a union, so I suspect it's legal C code. AFAIK this is UB per the standard but I've not seen a C++ compiler do the wrong thing when working with fundamental types. This gets more muddled in C++20 since `int` is an implicit lifetime type meaning `b` should be alive as storage exists for it. – NathanOliver May 24 '23 at 12:39
  • @NathanOliver See https://eel.is/c++draft/basic.life#1; you will notice how the lifetime of a variant member of a union begins (note the emphasized "only" in the quoted text). – xmh0511 May 24 '23 at 12:48
  • [diff.iso] is informative, not normative, so omission of a difference between C and C++ is presumptively an oversight, not an authoritative statement. Even if it were normative, it would be controlling only for C++ and would have no authority over the C standard. – Eric Postpischil May 24 '23 at 13:14
  • I think this is mostly correct still https://stackoverflow.com/questions/25664848/unions-and-type-punning, so maybe a duplicate. – Lundin May 24 '23 at 13:36
  • Btw the annex supposedly containing all differences between C and C++ is _laughable_. At some point I tried to compile a [list of C99-unique features also present in C++](https://stackoverflow.com/questions/47524553/are-all-of-the-features-of-c99-also-in-c/47526708#47526708) and that list alone has more bullets with various language quirks than the C++ annex. Why ISO C++ contains laughable, terribly outdated annexes is a different topic however... – Lundin May 24 '23 at 13:43
  • OK, since you have essentially answered the question yourself, I'll go ahead and close as dupe unless someone has new info as per the C23/C++23 releases...? Normative text in C++ is the quoted "At most one of the non-static data members of an object of union type can be active at any time"; normative text in C is 6.5.2.3 "A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member" (the quoted footnote points at this text). And [diff.iso] is to be dismissed entirely. – Lundin May 24 '23 at 14:21
  • @NathanOliver fwiw the Qt project did have a bug where they had a union containing a pointer and a `uintptr_t`. They toggled the lower bit of the `uintptr_t`, then called a method through the pointer. On some platforms, the method would be called with `this` keeping its old value. – spectras May 24 '23 at 14:32
  • At least Clang considers your code UB, and disallows it in constant expressions: https://gcc.godbolt.org/z/TrWvMf1Ph – Fedor May 24 '23 at 16:10
  • @Fedor However, GCC and MSVC do not report any error, as shown at https://gcc.godbolt.org/z/q41rMrxbE, which means there is some room for debate here. – xmh0511 May 25 '23 at 02:44
  • @xmh0511, GCC and MSVC render this error as well: https://gcc.godbolt.org/z/We4zYzdaa – Fedor May 25 '23 at 07:19
  • Although limited to struct members of a union, whilst OP's example uses fundamental types, [\[class.mem.general\]/25](https://timsong-cpp.github.io/cppwp/n4868/class.mem.general#25) arguably adds some confusion to this topic with its glvalue-allowed read with the effect _"as if the corresponding member [...] were nominated"_. – dfrib May 25 '23 at 16:40
  • @dfrib Yep, [class.mem.general]/25 only applies to two variants that are of struct types, see https://github.com/cplusplus/CWG/issues/135 – xmh0511 May 26 '23 at 07:15
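
For contrast with the question's `int` members, the common-initial-sequence rule mentioned in these comments can be sketched roughly as follows. The names `S1`, `S2`, `U`, and `read_common_initial_sequence` are made up for illustration, and the sketch assumes both structs are standard-layout; it says nothing about constant evaluation.

```cpp
// Two standard-layout structs whose first members have the same type,
// so `x` and `y` form a common initial sequence.
struct S1 { int x; };
struct S2 { int y; };

union U {
  S1 s1;
  S2 s2;
};

// Hypothetical helper, not from the original post.
int read_common_initial_sequence() {
  U u{};          // initializes the first member, so s1 is the active member
  u.s1.x = 5;
  return u.s2.y;  // s2 is inactive, but reading through the common initial sequence is permitted
}
```

Replacing `S1` and `S2` with plain `int` members, as in the question, takes the code outside this rule, because the rule is worded in terms of standard-layout structs.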

2 Answers


I think in C++ this is undefined behavior per the first standard quotation in the question itself: it makes no exception for data members of the same type.

And major compilers seem to agree with this reading, disallowing such code in constant expressions:

union A{
  int a;
  int b;
};

constexpr int f() {
  A u = {.a = 0};
  return u.b;
}

constexpr int x = f();

Online demo: https://gcc.godbolt.org/z/We4zYzdaa

Clang's error:

read of member 'b' of union with active member 'a' is not allowed in a constant expression

GCC's error:

accessing 'A::b' member instead of initialized 'A::a' member in constant expression

MSVC's error:

failure was caused by accessing a non-active member of a union
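
One way to keep such a read well-defined is to make `b` the active member before reading it. The following is only a sketch, assuming the union `A` from the question; `read_b_after_activation` is a hypothetical helper name, and the function is deliberately not `constexpr` so that it does not depend on a particular language version.

```cpp
union A {
  int a;
  int b;
};

// Hypothetical helper: switches the active member before reading it.
int read_b_after_activation() {
  A u = {0};   // aggregate initialization: `a` is the active member
  u.b = 7;     // assigning through the member access makes `b` the active member
  return u.b;  // reads the active member
}
```

As far as I can tell, C++20 also allows changing the active member during constant evaluation, so a `constexpr` version of this function should be accepted there, but treat that as an assumption to check against your compiler.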
Fedor
  • Interestingly, all compilers also reject the very similar but explicitly allowed example of [\[class.mem.general\]/25](https://timsong-cpp.github.io/cppwp/n4868/class.mem.general#25). [DEMO](https://gcc.godbolt.org/z/von1seb4q). – dfrib May 25 '23 at 16:46
  • @dfrib the common initial sequence rule is not allowed in a constant expression context b/c it [does not change the active member](http://eel.is/c++draft/expr.const#5.10) – Shafik Yaghmour May 25 '23 at 21:21
  • @ShafikYaghmour That would explain it, thanks. Would the wording of class.mem.general, "[...] **the behavior is as if** the corresponding member of T1 were nominated." need some rewriting or clarification, or is it clear from somewhere that "as if" does not mean a special case that excludes that of expr.const/5.10? – dfrib May 26 '23 at 08:44

Does reading to another variant member of a union that has the same type as the active variant cause UB?

The rule is clear. One member is active. Reading an inactive member is UB.

Does this access cause UB

Yes.

If so, is this requirement too restrictive?

No idea; this is subjective. Not for me; it is fine. I see no value in having two members of the same type in a union and then accessing one or the other.

Does the C standard impose the same requirement in this example?

No. This is one of the cases where C and C++ differ.

Does C have a looser requirement in this case?

Yes. In C, you can read any member of a union you want. The requirement is that the accessed value must not be a trap representation.
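
If the intent in C++ is to get at the stored bytes rather than to name an inactive member, one commonly used alternative is to copy the object representation with `std::memcpy`. A minimal sketch, assuming the union `A` from the question; `read_int_bytes` is a hypothetical helper, not something taken from the standard or from this answer.

```cpp
#include <cstring>

union A {
  int a;
  int b;
};

// Hypothetical helper: copies the bytes of the union object into an int
// instead of reading a possibly inactive member directly.
int read_int_bytes(const A& u) {
  static_assert(sizeof(A) == sizeof(int),
                "sketch assumes the union is exactly one int wide");
  int r;
  std::memcpy(&r, &u, sizeof r);
  return r;
}
```

Because the helper never names `u.a` or `u.b`, it does not matter which of the two members is currently active.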

Related: Unions and type-punning

KamilCuk