Practical use of anonymous unions in real-world C++ programming

Question

I know that we can access the members of an anonymous union without creating an object of it (that is, without the dot), but could anybody please explain what anonymous unions are used for in real-world C++ programming?

@FrançoisAndrieux: That's a different language with **very** different rules for unions :-( — Kerrek SB, Jul 26 '17 at 13:59
@StoryTeller: This is a maddening curiosity: I don't think there exists a real world example in C++ which makes it exceedingly narrow. — Bathsheba, Jul 26 '17 at 14:22
@Bathsheba - I concur, which is why I didn't vote to close. But still I have a nagging feeling this is off-topic. — StoryTeller - Unslander Monica, Jul 26 '17 at 14:23
@StoryTeller: Damn, there's a good answer now - why didn't I think of that? — Bathsheba, Jul 26 '17 at 14:32
@Bathsheba - Actually had another example come back to me after a good night's sleep. What do you think? — StoryTeller - Unslander Monica, Jul 27 '17 at 07:29

score 6 · Answer 1 · answered Jul 26 '17 at 14:22

I have mostly used unions to store elements of several different types in the same contiguous storage without resorting to dynamic polymorphism. Every member of my union is a struct describing the data for the corresponding node type. The anonymous union mostly gives a more convenient notation: instead of object.union_member.struct_member, I can just write object.struct_member, since there is no other member of that name anyway.

A recent example where I used them would be a rooted (mostly binary) tree which has different kinds of nodes:

struct multitree_node {
    multitree_node_type type;
    ...
    union {
        node_type_1 n1;
        node_type_2 n2;
        ...
    };
};

Using the type tag, I can determine which member of the union to use. All of the structs node_type_x have roughly the same size, which is why I used the union in the first place (no unused storage).

With C++17, you would be able to do this using std::variant, but for now, anonymous unions are a convenient way of implementing such 'polymorphic' types without virtual functions.

Of course. This is an excellent example. Why did I not think of this one? — Bathsheba, Jul 26 '17 at 14:32
For the record, this is normally called a [*discriminated* or *tagged* union](https://en.wikipedia.org/wiki/Tagged_union) (with the `type` field being the "tag"). — Matteo Italia, Jul 26 '17 at 17:17

score 4 · Answer 2 · geza · 2017-07-26T16:22:45.873

Here's a real-world example:

struct Point3D {
    union {
        struct {
            float x, y, z;
        };
        struct {
            float c[3];
        };
    };
};
Point3D p;

You can access p's x/y/z coordinates with p.x, p.y, p.z. This is convenient.

But sometimes you want to access the point as a float[3] array. You can use p.c for that.

Note: According to the standard, using this construct is undefined behavior. It has worked on every compiler I have met so far, but if you want to use such a construct, be aware that it may break some day.

edited Jul 26 '17 at 16:22 · answered Jul 26 '17 at 14:12 · geza

This is UB, I'm afraid. The standard doesn't promise there won't be padding between `x`, `y` and `z` as it does for the elements of `c`, for one. – StoryTeller - Unslander Monica Jul 26 '17 at 14:13
*"it works on all compilers I've met so far"* - Is highly misleading as a re-assurance. The same compiler that accepts it now can produce garbage code when you build on a new system. – StoryTeller - Unslander Monica Jul 26 '17 at 14:16
@StoryTeller: how should I edit the post to not be misleading? – geza Jul 26 '17 at 14:17
@geza include a link to the documentation for the compiler extension that allows this behaviour. _"Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union._" see: http://en.cppreference.com/w/cpp/language/union – Richard Critten Jul 26 '17 at 14:20
Personally, I don't think posts that exhibit UB as something one can use should be posted at all... But if you change it to be dependent on a compiler extension that guarantees correctness, and specify it as such... – StoryTeller - Unslander Monica Jul 26 '17 at 14:20
This is not real world. This is cat-eating demon-sneezing undefined behaviour world. – Bathsheba Jul 26 '17 at 14:21
@StoryTeller "it works on all compilers I've met so far" - not misleading at all. "it works on all compilers" - misleading. – nicomp Jul 26 '17 at 14:21
@nicomp - It's highly misleading. Let's not play coy here, it was meant as reassurance to this being okay. – StoryTeller - Unslander Monica Jul 26 '17 at 14:22
@StoryTeller: But I'd like to post it, because it is an actually working code example. Yes, I understand that it is UB, etc., but it works. And to be honest, I cannot think of anything which could make this code break. Why would a compiler ever add padding between floats? – geza Jul 26 '17 at 14:23
Very well. I've just made a change to the gcc trunk, and this version is a C++ standards compliant compiler that will also eat your cat if it sees code like this. Do you want a copy, and a new kitten? – Bathsheba Jul 26 '17 at 14:23
@StoryTeller It is OK if it works and the target compiler never changes. Much production code is written for one and only one compiler. – nicomp Jul 26 '17 at 14:23
@RichardCritten: no. it doesn't just appear to work. It does work. For all compilers I've met. – geza Jul 26 '17 at 14:24
@nicomp - Much production code is written once for various platforms. I already told geza how he could fix the post. Don't defend the indefensible, in purely standard C++, this is bad code – StoryTeller - Unslander Monica Jul 26 '17 at 14:24
@nicomp: Assuming that is very, very, very naughty. How many nights have I had utterly ruined by flashy junior quants writing truck loads of UB that promptly fails on the production gcc build?! – Bathsheba Jul 26 '17 at 14:24
@RichardCritten Everything that works **appears** to work. – nicomp Jul 26 '17 at 14:24
The standard says "*The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.*". Which means this code is *implementation-defined*. No? – rustyx Jul 26 '17 at 14:24
@geza the problem you have is without documentation from the compiler vendor, how can you prove it will work for __ALL__ programs. – Richard Critten Jul 26 '17 at 14:25
@Bathsheba That's an operational problem. It's not the compiler's fault if your newbies can't/won't test their code. – nicomp Jul 26 '17 at 14:25
@RustyX - The standard also forbids reading from a union member that wasn't last written to, it's explicitly undefined unless a "common initial sequence" of fields is present. This isn't the case. It's UB. – StoryTeller - Unslander Monica Jul 26 '17 at 14:26
@nicomp: They did. But it had undefined constructs. Which gcc on the most aggressive optimisations settings mashed up. I can hardly say to trading "we need to increase our grid by 50% since we can't optimise our builds anymore" can I? – Bathsheba Jul 26 '17 at 14:26
@Bathsheba: then I won't use your version of GCC, which behaves against the common sense. – geza Jul 26 '17 at 14:26
Be assured, neither will I: My two Birmans Bathsheba and Don Juan will be thankful. – Bathsheba Jul 26 '17 at 14:27
@RichardCritten that's less of a problem when you *have* such documentation from all the compiler vendors *that you use*, and *assurances* from said vendors that it will stay that way – Caleth Jul 26 '17 at 14:28
@geza - Compiler writers are always on the lookout to produce the most efficient code possible. UB is the standards way of giving them a narrow contract to exploit towards that end. Mine or your "common sense" plays no part in that. – StoryTeller - Unslander Monica Jul 26 '17 at 14:29
@geza: for what it's worth I think your code will start failing on 128 bit CPUs, due to padding between the `x`, `y`, and `z` members, although you might find that by then such a compiler will use the same number of bits for a `float` as they do a `double`. – Bathsheba Jul 26 '17 at 14:30
Reading from a union member that wasn't last written to is a common and *only* "solution" to the aliasing issue and is widely tolerated by compilers. The first comment in any case is incoherent/misleading. – rustyx Jul 26 '17 at 14:33
@Bathsheba: so you say that the array would be 12 bytes, but the struct 24? – geza Jul 26 '17 at 14:33
@geza: Quite possibly yes. I imagine compilers will give up aligning on 4 byte boundaries. – Bathsheba Jul 26 '17 at 14:34
@Bathsheba: yes, I understand, but why would it give up on just in the struct? If float "becomes" double, then the array should be 24 bytes as well, shouldn't it? – geza Jul 26 '17 at 14:35
@StoryTeller: yes, that's true, and usually I don't write any code which has UB. But this is an exception. GCC has an exception rule for unions. MSVC doesn't really use strict aliasing at all. So, in the current state of the world, my example is perfectly fine. And I cannot think of any reason why this code would break. On the contrary, I think that maybe, some day, there will be guarantees in the standard for cases like this, because there is no further optimizing potential in this kind of usage at all. – geza Jul 26 '17 at 14:40
@StoryTeller: I've edited the code little bit. What do you think, is it still UB? "If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them". Related question: https://stackoverflow.com/questions/45332326/are-these-2-structs-layout-compatible – geza Jul 26 '17 at 16:25
@geza - Afraid it is. One structure has 3 distinct members of type `float`. The other only one member of type `float[3]`. The sequences do not match. Not a bad attempt, however. – StoryTeller - Unslander Monica Jul 26 '17 at 16:29

score 2 · Answer 3 · answered Jul 27 '17 at 07:28

I actually remembered a use case I came across a while back. You know bit-fields? The standard makes very few guarantees about their layout in memory. If you want to pack binary data into an integer of a specific size, you are usually better off doing the bit-wise arithmetic yourself.

However, with unions and the common initial sequence guarantee, you can put all the boilerplate behind member access syntax. So your code will look like it's using a bit-field, but will in fact just be packing bits into a predictable memory location.

Here's a complete example:

#include <cstddef>      // std::size_t
#include <cstdint>
#include <type_traits>
#include <climits>
#include <iostream>

template<typename UInt, std::size_t Pos, std::size_t Width>
struct BitField {
    static_assert(std::is_integral<UInt>::value && std::is_unsigned<UInt>::value,
                  "To avoid UB, only unsigned integral types are supported");
    static_assert(Width > 0 && Pos < sizeof(UInt) * CHAR_BIT &&  Width < sizeof(UInt) * CHAR_BIT - Pos,
                  "Position and/or width cannot be supported");

    UInt mem;

    BitField& operator=(UInt val) {
        if((val & ((UInt(1) << Width) - 1)) == val) {
            mem &= ~(((UInt(1) << Width) - 1) << Pos);
            mem |= val << Pos;
        }
        // Should probably handle the error somehow
        return *this;
    }

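    operator UInt() const {
        // Shift the field down, then mask with Width one-bits (not with Width itself).
        return (mem >> Pos) & ((UInt(1) << Width) - 1);
    }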
};

struct MyColor {
    union {
        std::uint32_t raw;
        BitField<std::uint32_t, 0,  8> r;
        BitField<std::uint32_t, 8,  8> g;
        BitField<std::uint32_t, 16, 8> b;
    };
    MyColor() : raw(0) {}
};


int main() {
    MyColor c;
    c.r = 0xF;
    c.g = 0xA;
    c.b = 0xD;

    std::cout << std::hex << c.raw;
}

Yup, that's portable C++. Nice. — Bathsheba, Jul 27 '17 at 07:31
