
I have a C++ union which contains an array, and I would like to access each element of the array using a set of unique identifiers. (This may seem like a strange thing to want. In my application I have a union which contains pointers to cells in 8 directions, which represent how some object can move between cells. Sometimes it is convenient to write algorithms which work with array indices; however, that is not convenient for an end user, who would prefer to work with named identifiers rather than less obvious indices.)

Example:

union vector
{
    double x;
    double y;
    double data[2];
};

I believe that x and y "are the same thing" here (all members of a union share the same storage), so really one would have to write:

struct v
{
    double x, y;
};

union vector
{
    v data_v_format;
    double data_arr_format[2];
};

You would then use it like this:

vector v1;
v1.data_arr_format[0] = v1.data_v_format.y; // copy y component to x

Unfortunately this adds an ugly layer of syntax to the union. Is there any way to accomplish the original task as specified by the syntax:

union vector
{
    double x;
    double y;
    double data[2];
};

Where x is equivalent to data[0] and y is equivalent to data[1]?

I could write a class to do this, where the logically named identifiers become functions, each returning a single component of the array, but is there a better way?

timrau
FreelanceConsultant
  • My C++ is *extremely* rusty. However, what happens if you try making a union of `double x, y` and `double data[2]`? – Eric Galluzzo Dec 14 '15 at 14:00
  • @EricGalluzzo I have no idea what happens with GCC 5 or what is supposed to happen... – FreelanceConsultant Dec 14 '15 at 14:11
  • This question has a lot of good information related to this: http://stackoverflow.com/questions/2253878/why-does-c-disallow-anonymous-structs-and-unions – Rob K May 02 '16 at 17:16

2 Answers


Even if you find a way, reading from an inactive union member (that is, any member other than the one most recently written) is undefined behavior (UB) in C++. This means that the often-seen example of converting an IP address between 4 octets and an int through a union is not valid C++ either.
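
For the octet case, a well-defined alternative is to copy the bytes rather than pun through a union. The following is only a minimal sketch and not part of the original answer: the names `octets_to_u32`, `u32_to_octets` and `demo_roundtrip` are made up for illustration, and the integer value produced depends on the platform's byte order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: view 4 octets as a 32-bit integer by copying bytes.
// The numeric result depends on the machine's byte order.
inline std::uint32_t octets_to_u32(const std::array<std::uint8_t, 4>& octets)
{
    std::uint32_t value;
    static_assert(sizeof value == 4, "expects a 4-byte integer");
    std::memcpy(&value, octets.data(), sizeof value);
    return value;
}

// Hypothetical helper: the inverse copy, from integer back to 4 octets.
inline std::array<std::uint8_t, 4> u32_to_octets(std::uint32_t value)
{
    std::array<std::uint8_t, 4> octets;
    std::memcpy(octets.data(), &value, sizeof value);
    return octets;
}

// Usage: converting there and back yields the original octets.
inline void demo_roundtrip()
{
    const std::array<std::uint8_t, 4> ip = {192, 168, 0, 1};
    assert(u32_to_octets(octets_to_u32(ip)) == ip);
}
```

Compilers usually turn such small fixed-size copies into plain register moves, so this should cost no more than the union version.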

You can use accessors:

struct vec
{
    double data[2];
    double& x() {return data[0];}
    double& y() {return data[1];}
};
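
As a hedged extension of the accessor idea, the sketch below adds `const` overloads so that the named accessors also work on read-only objects, and shows an index-based algorithm living alongside them. The `const` overloads and the helpers `sum_components` and `demo_accessors` are assumptions added for illustration, not part of the answer above.

```cpp
#include <cassert>

struct vec
{
    double data[2];

    // Named access for users who prefer identifiers.
    double&       x()       { return data[0]; }
    double&       y()       { return data[1]; }
    // Read-only overloads (an addition for illustration).
    const double& x() const { return data[0]; }
    const double& y() const { return data[1]; }
};

// Hypothetical index-based algorithm that works directly on the array.
inline double sum_components(const vec& v)
{
    double total = 0.0;
    for (double component : v.data)
        total += component;
    return total;
}

// Usage: copy the y component to x, as in the question.
inline void demo_accessors()
{
    vec v{{1.0, 2.0}};
    v.x() = v.y();
    assert(v.data[0] == 2.0);
    assert(sum_components(v) == 4.0);
}
```

Because there is only one array and no union, every read goes through the same object, so no inactive-member access is involved.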

Alternatively, you can look into property implementations in C++. These create a proxy object, and accesses to it are redirected to a specific array element.
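
One possible shape for such a property is sketched below. It is only an illustration under stated assumptions: the names `component`, `pvec` and `demo_property` are hypothetical, the proxy stores a pointer (so the struct grows by one pointer per property), and the copy operations are written by hand so that a copied object's properties refer to its own array.

```cpp
#include <cassert>

// Hypothetical proxy: forwards reads and writes to a double stored elsewhere.
class component
{
    double* target_;
public:
    explicit component(double& target) : target_(&target) {}
    component(const component&) = delete;           // never alias another object's storage
    component& operator=(const component& other)    // assign the value, not the pointer
    {
        *target_ = *other.target_;
        return *this;
    }
    component& operator=(double value) { *target_ = value; return *this; }
    operator double() const { return *target_; }    // read access
};

struct pvec
{
    double data[2] = {0.0, 0.0};
    component x{data[0]};
    component y{data[1]};

    pvec() = default;
    // x and y are not listed here, so they use their default member
    // initializers and bind to this object's own array.
    pvec(const pvec& other) : data{other.data[0], other.data[1]} {}
    pvec& operator=(const pvec& other)
    {
        data[0] = other.data[0];
        data[1] = other.data[1];
        return *this;
    }
};

// Usage: the properties read like plain members.
inline void demo_property()
{
    pvec v;
    v.y = 2.5;
    v.x = v.y;                 // copy y component to x
    assert(v.data[0] == 2.5);
}
```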

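Yet another way is to use references, but this increases the size of your struct (typically by one pointer per reference). It also deletes the implicit copy assignment operator, and an implicitly generated copy constructor would bind the copy's references to the original object's array: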

struct vec
{
    double data[2];
    double& x = data[0];
    double& y = data[1];
};
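
If the reference variant is used, copying needs some care, because a memberwise copy of a reference member refers to the source object's storage. The sketch below is an assumption-laden illustration (the names `rvec` and `demo_references` are hypothetical) of hand-written copy operations that copy only the array, so that each object's references stay bound to its own elements.

```cpp
#include <cassert>

struct rvec
{
    double data[2] = {0.0, 0.0};
    double& x = data[0];
    double& y = data[1];

    rvec() = default;
    // x and y are omitted from the initializer list on purpose: they then
    // use their default member initializers and bind to this->data.
    rvec(const rvec& other) : data{other.data[0], other.data[1]} {}
    // References cannot be reseated, so assignment copies the values only.
    rvec& operator=(const rvec& other)
    {
        data[0] = other.data[0];
        data[1] = other.data[1];
        return *this;
    }
};

// Usage: modifying a copy must not affect the original.
inline void demo_references()
{
    rvec a;
    a.x = 1.0;
    a.y = 2.0;
    rvec b = a;
    b.x = 5.0;
    assert(a.data[0] == 1.0);
    assert(b.data[0] == 5.0);
}
```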
Revolver_Ocelot
  • Upvoted: it's quite common to use unions for type puns, but such use in fact produces undefined behavior. – Pete Becker Dec 14 '15 at 14:12
  • @PeteBecker, it is declared UB by the Standard, but such type punning is often guaranteed as a compiler extension. UB means anything can happen, even guaranteed, well-defined and reproducible behavior. Maybe such compiler-dependent extensions should be mentioned. – Revolver_Ocelot Dec 14 '15 at 14:15
  • Note: [`std::bit_cast`](https://en.cppreference.com/w/cpp/numeric/bit_cast) provides a defined way of doing that sort of conversion now. Prior to C++20 you can achieve the same effect with a memcpy (which compilers will typically optimise out, producing the same effect). – Chris Kitching May 15 '19 at 20:56

Although not allowed in (standard) C++, in C (since C11) you can use an anonymous struct:

// not standard C++
union vector {
    struct {
        double x;
        double y;
    };
    double arr[2];
};

Anonymous structs are also supported by some C++ compilers (including GCC, MSVC and Clang) as a language extension. In standard C++, you'll need to settle for an unnamed struct type with a named member:

union vector {
    struct {
        double x;
        double y;
    } data;
    double arr[2];
};

This is essentially the same as your example, so you still need the ugly layer of syntax (v.data.x and so on). It is just slightly simpler, since you don't need to name the inner struct type; you only need to name the member that is an instance of it.

A word about your comment:

v1.data_arr_format[0] = v1.data_v_format.y; // copy y component to x

Your comment says that you copy y to x. Do realize that reading v1.data_v_format.x after writing to v1.data_arr_format technically has undefined behaviour.

I grant that the struct probably has no padding at all, since double probably doesn't have a stricter alignment requirement than its size, and it therefore probably has the same representation as the array. So on most implementations this type punning would probably work as intended, even though that is not guaranteed by the standard.
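
The layout assumption can at least be checked at compile time. The sketch below mirrors the named-struct version from the question (with the union renamed to the hypothetical `vector2`, and `demo_layout` added for illustration). Note that passing these checks only confirms the absence of padding on a given implementation; it does not make reading an inactive member defined by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct v
{
    double x, y;
};

union vector2
{
    v data_v_format;
    double data_arr_format[2];
};

// Compilation fails if the struct is not laid out like the array.
static_assert(std::is_standard_layout<v>::value, "v must be standard-layout");
static_assert(sizeof(v) == sizeof(double[2]), "unexpected padding in v");
static_assert(offsetof(v, x) == 0, "x is not at the start of v");
static_assert(offsetof(v, y) == sizeof(double), "y does not line up with element 1");

// Usage: writing and reading through the same member is always fine.
inline void demo_layout()
{
    vector2 u;
    u.data_arr_format[0] = 1.0;
    u.data_arr_format[1] = 2.0;
    assert(u.data_arr_format[0] + u.data_arr_format[1] == 3.0);
}
```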

eerorika