swap non-active std::unique_ptr data members for union

Question

Given a union:

#include <iostream>
#include <memory>
#include <type_traits>
#include <vector>

#include <cassert>
#include <cstdlib>

struct A { int a; };
struct B { int b; };

template< typename X >
struct S
{

    std::size_t tag;

    std::unique_ptr< X > x;

};

union U
{
    S< A > a;
    S< B > b;

    U(A x) : a{0, std::make_unique< A >(x)} { ; }
    U(B x) : b{1, std::make_unique< B >(x)} { ; }

    std::size_t tag() { return a.tag; }

    ~U()
    {
        switch (tag()) {
        case 0 : {
            a.~S< A >();
            break;
        }
        case 1 : {
            b.~S< B >();
            break;
        }
        default : assert(false);
        }
    }

    void
    swap(U & u) noexcept
    {
        a.x.swap(u.a.x);
        std::swap(a.tag, u.a.tag);
    }

};

static_assert(std::is_standard_layout< U >{});

int
main()
{
    U a{A{ 0}};
    U b{B{~0}};
    assert((a.tag() == 0) && (a.a.x->a ==  0));
    assert((b.tag() == 1) && (b.b.x->b == ~0));
    a.swap(b);
    assert((a.tag() == 1) && (a.b.x->b == ~0));
    assert((b.tag() == 0) && (b.a.x->a ==  0));
    return EXIT_SUCCESS;
}

U::tag() is correct, because it is permitted to inspect the common initial subsequence of the alternative data members in U-like unions.

U::swap() works, but is it legal for std::unique_ptr? Is it allowed to swap the std::unique_ptr data members of non-active alternatives in U-like unions?

It seems to be permitted because of the simple nature of std::unique_ptr< X >: it is just a wrapper over X *, and for any A and B I am sure that static_assert((sizeof(A *) == sizeof(B *)) && (alignof(A *) == alignof(B *))); holds and that the pointer representation is identical for all types (except pointers to data members and to member functions of classes). Is that true?

The example code works fine, but according to the standard it is very likely UB.

Are you sure that `std::unique_ptr` is standard-layout? Could you add a static assertion? — Kerrek SB, Sep 23 '15 at 11:40
@CoffeeandCode [Here is description](http://talesofcpp.fusionfenix.com/post-20/eggs.variant---part-ii-the-constexpr-experience) of some backgrounds. — Tomilov Anatoliy, Sep 23 '15 at 11:55

score 1 · Answer 1 · edited May 23 '17 at 12:14

IMHO, you have formal undefined behaviour, because you always access the a member of the union, even when the last member written was b.

Of course it works, because apart from its management logic, a unique_ptr just contains a raw pointer and a stored deleter. In practice, pointers to any object type have the same representation, and apart from alignment questions, it is safe to convert a pointer to X to a pointer to Y and back. So at a low level it is safe to swap the raw pointers. It could be more implementation-dependent, but I assume it is also safe to swap the stored deleters, because what is actually stored is normally an address. And anyway, for the types struct A and struct B, the destructors are simply no-ops.
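The round-trip claim above can be illustrated with a small sketch. The helper name round_trip_preserves_pointer is hypothetical and not part of the question's code; the sketch assumes, and checks, that B is no more strictly aligned than A, which is the condition under which the conversion back is required to yield the original pointer. The intermediate B * is never dereferenced.

```cpp
struct A { int a; };
struct B { int b; };

// Hypothetical helper: converts A* -> B* -> A* and checks that the original
// pointer value comes back. The intermediate B* is only stored, never used
// to access the object.
inline bool round_trip_preserves_pointer()
{
    // The round-trip guarantee needs B's alignment to be no stricter than A's.
    static_assert(alignof(B) <= alignof(A), "round trip is not guaranteed");

    A object{42};
    A * original = &object;
    B * detour = reinterpret_cast< B * >(original);
    A * restored = reinterpret_cast< A * >(detour);
    return (restored == original) && (restored->a == 42);
}
```

This only shows that the pointer value survives the conversion. It says nothing about whether reading an inactive std::unique_ptr member of a union is allowed.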

The only thing that could cause your code to fail would be a compiler enforcing the rule that only the last-written member of a union can be accessed, except for the common initial subsequence. I am pretty sure that no current compiler enforces that, so it should work.

But in a question that I once asked about another possible UB case, Hans Passant gave a link to research work on advanced compilers able to detect buffer overflows. I really think that the same techniques could be used to enforce the rules on access to union members, so such compilers could raise an exception at run time with your code.

TL;DR: this code should work with all currently known compilers, but as it is not strictly standard-conformant, future compilers could trap on it. That is why I call this formal undefined behaviour.
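For comparison, here is a minimal sketch of a swap that only ever touches the active member. The names V, construct_from and destroy are hypothetical and not from the question. As an assumption that departs from the question's layout, the tag is stored outside the union, so reading it never relies on the common initial subsequence rule. It needs C++14 for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

struct A { int a; };
struct B { int b; };

// Hypothetical tagged union: tag_ selects which anonymous-union member is alive.
class V
{
    std::size_t tag_;
    union
    {
        std::unique_ptr< A > a_;
        std::unique_ptr< B > b_;
    };

    // Start the lifetime of the alternative that is active in v and take over
    // its pointer. Precondition: no member of *this is currently alive.
    void construct_from(V & v) noexcept
    {
        tag_ = v.tag_;
        if (tag_ == 0) ::new (static_cast< void * >(&a_)) std::unique_ptr< A >(std::move(v.a_));
        else           ::new (static_cast< void * >(&b_)) std::unique_ptr< B >(std::move(v.b_));
    }

    // End the lifetime of the active member.
    void destroy() noexcept
    {
        if (tag_ == 0) a_.~unique_ptr();
        else           b_.~unique_ptr();
    }

public:
    V(A x) : tag_(0), a_(std::make_unique< A >(x)) {}
    V(B x) : tag_(1), b_(std::make_unique< B >(x)) {}
    V(V && v) noexcept { construct_from(v); }
    ~V() { destroy(); }

    std::size_t tag() const noexcept { return tag_; }
    A const & a() const { assert(tag_ == 0); return *a_; }
    B const & b() const { assert(tag_ == 1); return *b_; }

    void swap(V & v) noexcept
    {
        V tmp(std::move(v));       // tmp now owns what v owned
        v.destroy();               // v's member is empty; end its lifetime
        v.construct_from(*this);   // v takes over this object's alternative
        destroy();                 // this object's member is empty; end its lifetime
        construct_from(tmp);       // this object takes over v's former alternative
    }
};
```

Used like the question's main(): after V a{A{0}}; V b{B{~0}}; a.swap(b); the object a holds the B and b holds the A. The cost compared to the question's version is a few extra moves and destructor calls on empty pointers.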

score 1 · Accepted Answer · answered Sep 23 '15 at 12:40

from § 9.5 Unions

specifically the note about standard layout types:

... One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (9.2), and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members ...

So the common initial sequence is allowed to be used for either union member.

In your case the common initial sequence is definitely std::size_t tag. Then we need to know if std::unique_ptr<T> will be the same for all T so it can also be treated as part of the common initial sequence:

§ 20.8.1 Class template unique_ptr
[1] A unique pointer is an object that owns another object and manages that other object through a pointer. More precisely, a unique pointer is an object u that stores a pointer to a second object p ...

Yep. But how do we know all pointers will be represented the same? Well, in your case:

§ 3.9.2 Compound types
[ 3 ] ... The value representation of pointer types is implementation-defined. Pointers to cv-qualified and cv-unqualified versions (3.9.3) of layout-compatible types shall have the same value representation and alignment requirements ...

So we can rely on the value of the pointer stored in std::unique_ptr being value representable in the other member of the union.

So no, no undefined behaviour here.

The common initial sequence rule applies to _standard-layout_ unions only. `U` is not guaranteed to be _standard-layout_, because `std::unique_ptr` is not required to be. Furthermore, even if the simplistic depiction of `unique_ptr` representation would hold, pointers to different arbitrary object types are not _layout-compatible_, which is a requirement for determining the common initial sequence. — K-ballo, Sep 25 '15 at 12:58
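Whether the standard-layout precondition holds can be probed on a given implementation. The sketch below is only an illustration: layout_report and probe_layout are hypothetical names, S is copied from the question, and the results are implementation-specific, which is why nothing is asserted at compile time.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

struct A { int a; };
struct B { int b; };

// Same wrapper as in the question.
template< typename X >
struct S
{
    std::size_t tag;
    std::unique_ptr< X > x;
};

// Hypothetical helper type collecting the properties under discussion.
struct layout_report
{
    bool unique_ptr_standard_layout;       // not required by the standard
    bool wrapper_standard_layout;          // S< A > and S< B >
    bool same_pointer_size_and_alignment;  // A * versus B *
};

// Evaluates the traits for the current standard library implementation.
constexpr layout_report probe_layout() noexcept
{
    return {
        std::is_standard_layout< std::unique_ptr< A > >::value,
        std::is_standard_layout< S< A > >::value && std::is_standard_layout< S< B > >::value,
        (sizeof(A *) == sizeof(B *)) && (alignof(A *) == alignof(B *))
    };
}
```

Even if every field comes out true, that only covers the standard-layout precondition. As the comment above notes, it does not make the pointer members layout-compatible, so it does not extend the common initial sequence past tag.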
