I was wondering if the following was valid C++:

union id {
    struct {
        std::uint32_t generation : 8;
        std::uint32_t index : 24;
    };
    std::uint32_t value;
};

I want this so I can access both generation and index separately, while keeping access to the whole number. Since they are all std::uint32_t, this shouldn't be UB, right?

I plan to use it like this:

auto my_id = id{
    .generation = 1,
    .index = 4,
};

auto my_id_value = std::uint32_t{my_id.value};

If it is UB, is there another way to make this work and valid according to the C++ standard?

timrau
Guillaume Racicot
  • _"...Two standard-layout non-union class types may have a common initial sequence of non-static data members and __bit-fields__,..."_ https://en.cppreference.com/w/cpp/language/data_members – Richard Critten Jun 21 '22 at 14:38
  • Did [this](https://stackoverflow.com/questions/2253878/why-does-c-disallow-anonymous-structs) change with C++17/20, by the way? Didn't notice at least... – Aconcagua Jun 21 '22 at 14:45
  • @Aconcagua hmm, I totally forgot this was a language extension. The same question applies though. – Guillaume Racicot Jun 21 '22 at 14:47
  • @RichardCritten find me the quote in the C++ standard and I'll be satisfied! – Guillaume Racicot Jun 21 '22 at 14:47
  • *'In a standard-layout union with an active member of non-union class type T1, it is permitted to read a non-static data member m of another union member of non-union class type T2 provided m is part of the common initial sequence of T1 and T2 (except that reading a volatile member through non-volatile glvalue is undefined).'* – *this* seems to be the relevant phrase... – Aconcagua Jun 21 '22 at 14:48
  • @Aconcagua yeah it is! But does it apply to bitfields in the way I posted in the question? – Guillaume Racicot Jun 21 '22 at 14:49
  • I cannot imagine differently – there needs to be *at least* one underlying `uint32_t` within the struct to cover the bitfield, and this one would be the common initial sequence... If I recall correctly (again) the compiler is *not* allowed to condense the bitfield to a smaller type if there are only that many bits actually used so that it would fit. – Aconcagua Jun 21 '22 at 14:54
  • One thing to note (see [cppreference](https://en.cppreference.com/w/cpp/language/bit_field)): *'Multiple adjacent bit-fields are usually packed together (although this behavior is implementation-defined)'* – at least in theory this is not fully portable, even though I'm not aware of any compiler not doing so. In case of serialisation we might have yet another issue as the order in which bitfields are placed into is again implementation defined. – Aconcagua Jun 21 '22 at 15:00
  • Current draft - [class.mem.general#23](https://eel.is/c++draft/class.mem.general#23) - _"The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and __bit-fields__ in declaration order..."_ – Richard Critten Jun 21 '22 at 15:04
  • It's (at a minimum) implementation-defined behavior, because bit-field layout isn't specified by the standard. Even with the expected packing of the bit-fields, you don't know whether generation ends up in the least or most significant bits of the uint32_t. If you already rely on the IDB of the bit-field layout, you can also depend on the IDB of how the union works for types without a common initial sequence. – Goswin von Brederlow Jun 21 '22 at 15:51
  • Note that the two members of the union have no *common initial sequence*. That being said, Goswin's comment about IDB is pragmatic, unless portability or strict compliance with the standard is a concern. – Eljay Jun 21 '22 at 16:00
  • Use [`std::bit_cast`](https://en.cppreference.com/w/cpp/numeric/bit_cast) instead of a union if you are able to compile with C++20 – Patrick Roberts Jan 03 '23 at 14:02
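
A minimal sketch of the `std::bit_cast` suggestion in the last comment. The names `id_fields`, `to_value` and `from_value` are hypothetical and not part of the question. The `std::memcpy` fallback for pre-C++20 compilers is an addition of this sketch, not something the comment proposes. Reinterpreting the bits this way avoids reading an inactive union member, but the position of `generation` and `index` inside the resulting integer still depends on the implementation's bit-field layout, so the numeric value is not portable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>
#  endif
#endif

// Hypothetical stand-alone struct replacing the anonymous struct in the union.
struct id_fields {
    std::uint32_t generation : 8;
    std::uint32_t index : 24;
};

// Assumption: the implementation packs both bit-fields into a single 32-bit unit.
static_assert(sizeof(id_fields) == sizeof(std::uint32_t),
              "bit-fields are expected to fill exactly 32 bits");

// Copy the object representation of the fields into an integer.
inline std::uint32_t to_value(id_fields fields) noexcept {
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<std::uint32_t>(fields);
#else
    std::uint32_t value;
    std::memcpy(&value, &fields, sizeof value);  // pre-C++20 fallback
    return value;
#endif
}

// Inverse direction: rebuild the fields from a whole 32-bit value.
inline id_fields from_value(std::uint32_t value) noexcept {
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<id_fields>(value);
#else
    id_fields fields;
    std::memcpy(&fields, &value, sizeof fields);  // pre-C++20 fallback
    return fields;
#endif
}
```

With this, something like `to_value(id_fields{1, 4})` would take the place of reading `value` from the union. Round trips through `to_value` and `from_value` preserve the fields, but which bits hold which field is still up to the compiler.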

1 Answer

By the resolution of CWG 645 (for C++11; not sure whether it is supposed to apply to C++98 as a DR), the common initial sequence requires corresponding non-static data members or bit-fields in the two classes (in declaration order) to either both be bit-fields (of the same width) or both not be bit-fields.

The wording for that can still be found in [class.mem.general]/23 in the current draft, including an example stating clearly that a bit-field of the same type as a non-static data member will not be part of the common initial sequence.

Therefore the exceptional rule in [class.mem.general]/25, which allows access to inactive members in the common initial sequence of standard-layout class members of a union, doesn't apply in your case, and reading the inactive member `value` when initializing `my_id_value` has undefined behavior.
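
As for the "is there another way" part of the question, one option that stays within the standard is to store a single `std::uint32_t` and extract the parts with shifts and masks. The following is only a sketch: the class name `packed_id` and the choice of putting `generation` in the low 8 bits and `index` in the high 24 bits are assumptions made for illustration, not something taken from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical replacement for the union: one 32-bit value plus accessors.
// Assumed layout: generation in bits 0..7, index in bits 8..31.
class packed_id {
public:
    constexpr packed_id(std::uint32_t generation, std::uint32_t index) noexcept
        : value_{(generation & generation_mask) |
                 ((index & index_mask) << generation_bits)} {}

    // Low 8 bits.
    constexpr std::uint32_t generation() const noexcept {
        return value_ & generation_mask;
    }

    // High 24 bits, shifted down.
    constexpr std::uint32_t index() const noexcept {
        return (value_ >> generation_bits) & index_mask;
    }

    // The whole number, with a layout fixed by the code rather than the compiler.
    constexpr std::uint32_t value() const noexcept { return value_; }

private:
    static constexpr unsigned generation_bits = 8;
    static constexpr std::uint32_t generation_mask = 0xFFu;
    static constexpr std::uint32_t index_mask = 0xFFFFFFu;

    std::uint32_t value_;
};

// generation 1 | (index 4 << 8) == 0x401
static_assert(packed_id{1, 4}.value() == 0x401u, "layout check");
```

The designated-initializer syntax from the question no longer applies, so construction becomes `packed_id{1, 4}`. In exchange, the bit positions are the same on every compiler, which also matters if the value is ever serialized.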

user17732522