
The code below is an attempt at implementing a constexpr version of offsetof in C++11. It compiles in both gcc 7.2.0 and clang 5.0.0.

It depends on applying std::addressof to members of non-active members of a union.

Is this well-defined C++11? If not, please explain why, including quotes or citations to relevant sections of the standard.

#include <iostream>
#include <cstdint>
#include <memory>

// based on the gist at: https://gist.github.com/graphitemaster/494f21190bb2c63c5516
// original version by graphitemaster

template <typename T1, typename T2>
struct offset_of_impl {
    union U {
        char c;
        T1 m; // instance of type of member
        T2 object;
        constexpr U() : c(0) {} // make c the active member
    };
    static constexpr U u = {};

    static constexpr std::ptrdiff_t offset(T1 T2::*member) {
        // The following avoids use of reinterpret_cast, so is constexpr.
        // The subtraction gives the correct offset because the union layout rules guarantee that all
        // union members have the same starting address.
        // On the other hand, it will break if object.*member is not aligned.
        // Possible problem: it uses std::addressof on non-active union members.
        // Please let us know at the gist if this is defined or undefined behavior.
        return (std::addressof(offset_of_impl<T1, T2>::u.object.*member) - 
            std::addressof(offset_of_impl<T1, T2>::u.m)) * sizeof(T1);
    }
};

template <typename T1, typename T2>
constexpr typename offset_of_impl<T1, T2>::U offset_of_impl<T1, T2>::u;

template <typename T1, typename T2>
inline constexpr std::ptrdiff_t offset_of(T1 T2::*member) {
    return offset_of_impl<T1, T2>::offset(member);
}

struct S {
    S(int a_, int b_, int c_) : a(a_), b(b_), c(c_) {}
    S() = delete;
    int a;
    int b;
    int c;
};

int main()
{
    std::cout << offset_of(&S::b);   
}

For reference, here is a sandbox version to play with: https://wandbox.org/permlink/rKQXopsltQ51VtEm

And here is the original version by graphitemaster: https://gist.github.com/graphitemaster/494f21190bb2c63c5516

Ross Bencina
  • All members have the same address, active or not... and even if that address is typed such that it belongs to an inactive member, I can't see why it would be a problem so long as that address were not dereferenced, for reasons given at my suggested dupe around `[basic.life]`. And in case anyone is worried about `std::addressof()` specifically, it's just a bunch of cast acrobatics and doesn't dereference or otherwise read the object. – underscore_d Apr 11 '18 at 13:18
  • `[class.union]` states *Each non-static data member is allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.* but I'm not sure if that actually makes this okay. – NathanOliver Apr 11 '18 at 13:20
  • @underscore_d I agree https://stackoverflow.com/questions/48188737/is-pointer-arithmetic-on-inactive-member-of-a-union-ub might be a partial answer, if the answer there was asserted with confidence, which it isn't. Note that in this question I'm navigating the union member contents with `.` – Ross Bencina Apr 11 '18 at 13:22
  • @RossBencina but only to get the address of its member, which also isn't alive, but is subject to the same limited allowances for things like taking the address as the union itself is. – underscore_d Apr 11 '18 at 13:23
  • The thing that worries me is that the code appears to presume the member is aligned in multiples of its own size. This may be true for primitive types in general, but I can imagine lots of cases where T1 is a struct for which this rule could easily be broken. – Gem Taylor Apr 11 '18 at 13:42
  • I can see that the union is being used here to avoid constructing the object, and that is "clever", I guess, though it wastes/reserves static space for a whole U. For the rest, I think I prefer the traditional subtraction using a size_t or char* cast: `return (size_t)(addressof(instT2.*memberT1)) - (size_t)(addressof(instT2));` – Gem Taylor Apr 11 '18 at 13:59
  • @GemTaylor: the original goal was to also make the expression constexpr by avoiding `reinterpret_cast` (your `(size_t)` cast) however that failed without me realising it at the time. As far as wasting space, since U is not used I am assuming that it will be optimised away. – Ross Bencina Apr 11 '18 at 14:42
  • @RossBencina I don't think the space can be optimised away, as you have taken the address of it. Indeed without the declaration of this unique space, you couldn't definition-legally take the address of the offsetted member. – Gem Taylor Apr 11 '18 at 15:36
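
Gem Taylor's alignment concern can be illustrated with a small sketch. The types `Inner` and `Outer` and both helper functions below are hypothetical and not part of the original post. The sketch assumes a typical platform where `Inner` has no padding, so that `in` sits at byte offset `sizeof(int)` while `sizeof(Inner)` is three times that. In that case the byte offset is not a whole multiple of `sizeof(T1)`, so a difference of two `T1*` values multiplied by `sizeof(T1)` cannot represent it.

```cpp
#include <cassert>   // only needed for the usage line in the note below
#include <cstddef>

// Hypothetical types for illustration; not from the original post.
struct Inner { int x[3]; };            // typically 3 * sizeof(int) bytes
struct Outer { int pad; Inner in; };   // 'in' typically starts at sizeof(int)

// Byte offset of Outer::in, obtained with the standard offsetof macro.
constexpr std::size_t inner_offset() {
    return offsetof(Outer, in);
}

// True only when the byte offset is a whole multiple of sizeof(Inner),
// which is what the T1*-based subtraction in the question silently assumes.
constexpr bool offset_is_multiple_of_size() {
    return inner_offset() % sizeof(Inner) == 0;
}
```

Under the stated assumption, `assert(!offset_is_multiple_of_size());` holds, which is the case the pointer-difference approach cannot handle.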

1 Answer

union U { int a; int b; };
U u;
u.a = 0; // (1)
int* pub = &u.b;

Yes, this is well defined, but there are restrictions on how one can use pub. Note: taking the address of an object with the operator & or with std::addressof is similar (see footnote 1), unless a custom operator & is defined for that object's type.

[class.union]/1
In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended.

So on the line marked (1), the lifetime of u.b has not yet started, but the storage that object will occupy has already been allocated. The following therefore applies:

[basic.life]/6
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways.


1) Except that std::addressof would, as user Quentin noted, bind a reference to u.b. That is also OK, as per [basic.life]/7:

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a glvalue refers to allocated storage ([basic.stc.dynamic.allocation]), and using the properties of the glvalue that do not depend on its value is well-defined.
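
As a minimal sketch of the "limited ways" described above: the union `U` mirrors the one at the top of this answer, and the function name is hypothetical. It takes the address of the inactive member and only compares it, never reading through it. This assumes the reading of [basic.life] and [class.union] given here; it does not show that the pointer subtraction in the question is defined.

```cpp
#include <cassert>   // only needed for the usage line in the note below
#include <memory>

// Same shape as the union in the answer above.
union U {
    int a;
    int b;
};

// Hypothetical helper: returns true if the inactive member b has the same
// address as the active member a. b is never read or written.
inline bool inactive_member_shares_address() {
    U u;
    u.a = 0;                          // (1) a becomes the active member
    int* pa = std::addressof(u.a);
    int* pb = std::addressof(u.b);    // address only; b's lifetime has not begun
    // Compare as void* so nothing depends on the pointed-to object.
    return static_cast<void*>(pa) == static_cast<void*>(pb);
}
```

A call such as `assert(inactive_member_shares_address());` is expected to pass, since [class.union] gives all non-static data members of a union object the same address.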

YSC
  • Would you mind explicitly commenting on whether the way the results of `std::addressof` are used in the code example are well defined? – Ross Bencina Apr 11 '18 at 13:28
  • There's a subtle difference in that `std::addressof` binds a reference to that inactive member, and a reference must be initialized to refer to an actual object. But I think there is special-casing for this as well. – Quentin Apr 11 '18 at 14:15
  • @Quentin I didn't know, thank you. – YSC Apr 11 '18 at 18:48
  • sh*t I'm the one who answered the duplicate and I have no memory of it. – YSC Apr 11 '18 at 18:50
  • Don't worry yet. One time I tried to downvote one of my own answers... – Quentin Apr 11 '18 at 19:20