
Agner Fog's "Optimizing software in C++" states that a union forces a variable to be stored in memory even in cases where it could otherwise have been kept in a register, which might have performance implications (e.g. page 148).

I often see code that looks like this:

struct Vector {
    union {
        struct {
            float x, y, z, w;
        };
        float v[4];
    };
};

This can be quite convenient, but now I'm wondering whether there might be a performance hit. I wrote a small benchmark that compares Vector implementations with and without a union, and there were cases where the Vector without a union apparently performed better, although I don't know how trustworthy my benchmark is. (I compared three implementations: union; x, y, z, w; v[4]. For example, v[4] seemed to be slower when passed by value, although the structs all have the same size.)

My question now is, whether this is something that people consider when writing actual production code? Do you know of cases where it was decided against unions specifically for this reason?

B_old
  • I won't consider it unless I've got a serious performance issue, profiled my code, corrected everything else, and finally pinpointed this realllllly specific point by examining the assembly produced by my compiler. – YSC Mar 03 '17 at 14:52
  • There is no reason why a `union` cannot be stored in a register besides compilers being bad at optimizing, but that rarely happens in such trivial cases. Also the union you are showing is useless without invoking UB. I would consider getting a different [book](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list/388282). – nwp Mar 03 '17 at 14:53
  • From [cppreference.com](http://en.cppreference.com/w/cpp/language/union#Explanation): *"It's undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union."* – François Andrieux Mar 03 '17 at 14:54
  • In Agner Fog's "Optimizing software in C++", the mentioned content is not on page 148 but on page 153. Maybe he updated the book? – jg6 Jun 11 '20 at 18:50

2 Answers


It appears the goal is to provide friendly names for the elements of a vector type, and a union is not the best way to do that. Comments have already pointed out the undefined behavior, and even if it works, it's a form of aliasing, which limits optimization opportunities.

Instead, avoid the whole mess and just add accessors that name the elements.

struct quaternion
{
    float vec[4];
    float &x() { return vec[0]; }
    float &y() { return vec[1]; }
    float &z() { return vec[2]; }
    float &w() { return vec[3]; }
    const float &x() const { return vec[0]; }
    const float &y() const { return vec[1]; }
    const float &z() const { return vec[2]; }
    const float &w() const { return vec[3]; }
};

In fact, this is much like what Eigen does for its quaternion implementation: https://eigen.tuxfamily.org/dox/Quaternion_8h_source.html
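As a rough sketch of how the accessor approach above might be used, the following shows named access and index-based access operating on the same storage. The names `Quat`, `make_example`, and `sum` are hypothetical and chosen only for illustration; this is not taken from Eigen or any other library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type mirroring the accessor approach: one array, named views.
struct Quat {
    float vec[4];
    float &x() { return vec[0]; }
    float &y() { return vec[1]; }
    float &z() { return vec[2]; }
    float &w() { return vec[3]; }
    const float &x() const { return vec[0]; }
    const float &y() const { return vec[1]; }
    const float &z() const { return vec[2]; }
    const float &w() const { return vec[3]; }
};

// No union is involved, so the struct is just the array
// (this holds on mainstream ABIs).
static_assert(sizeof(Quat) == 4 * sizeof(float), "unexpected padding");

// Writes through the named accessors land in the underlying array.
inline Quat make_example() {
    Quat q{};  // value-initialized: all elements start at 0
    q.x() = 1.0f;
    q.y() = 2.0f;
    q.z() = 3.0f;
    q.w() = 4.0f;
    assert(&q.x() == &q.vec[0]);  // the accessor refers to the array element
    return q;
}

// Index-based access works on the same data; only the const overloads
// would be usable here if we went through the accessors instead.
inline float sum(const Quat &q) {
    float s = 0.0f;
    for (std::size_t i = 0; i < 4; ++i) s += q.vec[i];
    return s;
}
```

Because there is only one active object (the array), neither style of access reads an inactive union member, so the undefined-behavior concern from the comments does not arise.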

Peter
  • Does it really limit optimization opportunities, or is it just a matter of confusing "bad" compilers, as nwp suggests? – B_old Mar 03 '17 at 15:22
  • Those `void`s confused me for a while. – nwp Mar 03 '17 at 15:50
  • Those `void` should not be there. It’s the C way to indicate a function with no parameters (in contrast to an arbitrary number of arguments). In C++ this syntax is valid, but redundant and unusual. A function declaration with an empty parameter list means exactly that: no parameters. Also I strongly suggest adding const qualified overloads. As is the convenience accessors are unusable for a `const quaternion`. – besc Mar 03 '17 at 16:38
  • Fixing the old C syntax and adding const overloads, as suggested in the comments. – Peter May 14 '20 at 16:42

My question now is, whether this is something that people consider when writing actual production code?

No. That's premature optimization (as is the union construct itself). Once the code is written in a reasonably clean and reliable way, it can be profiled and the true bottlenecks addressed. There is no need to reason about some union for 5 minutes to guess whether it will affect performance somewhere in the future. It either will or will not, and only profiling can tell.
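For a first look before reaching for a real profiler, a timing harness along these lines can be enough. This is only a minimal sketch: `VecNamed`, `VecArray`, `component_sum`, `time_sum_ms`, `Timings`, and `compare` are hypothetical names, the workload is artificial, and results from such micro-benchmarks depend heavily on compiler flags and should be checked against the generated assembly.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Two of the layouts from the question, without a union (hypothetical names).
struct VecNamed { float x, y, z, w; };
struct VecArray { float v[4]; };

// Both overloads take the vector by value and add in the same order.
inline float component_sum(VecNamed a) { return a.x + a.y + a.z + a.w; }
inline float component_sum(VecArray a) { return a.v[0] + a.v[1] + a.v[2] + a.v[3]; }

// Times one pass over the data; the total is returned through `out`
// so the compiler cannot discard the loop as dead code.
template <typename V>
double time_sum_ms(const std::vector<V> &data, float &out) {
    const auto start = std::chrono::steady_clock::now();
    float total = 0.0f;
    for (const V &e : data) total += component_sum(e);
    const auto stop = std::chrono::steady_clock::now();
    out = total;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timings {
    double named_ms;
    double array_ms;
    float total;
};

// Runs both variants on equivalent data so the timings compare like with like.
inline Timings compare(std::size_t n) {
    std::vector<VecNamed> named(n, VecNamed{1.0f, 2.0f, 3.0f, 4.0f});
    std::vector<VecArray> arrays(n, VecArray{{1.0f, 2.0f, 3.0f, 4.0f}});
    float a = 0.0f, b = 0.0f;
    Timings t{};
    t.named_ms = time_sum_ms(named, a);
    t.array_ms = time_sum_ms(arrays, b);
    assert(a == b);  // both layouts must compute the same total
    t.total = a;
    return t;
}
```

A single run like this is noisy; repeating the measurement and building with the same optimization flags as the production code gives a more meaningful comparison.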

Ped7g
  • Why do you consider the union construct to be premature optimization? I always viewed the x, y, z access as a convenience thing. – B_old Mar 04 '17 at 17:24
  • @B_old It depends on how you look at it. If x, y, z is convenient, then why not keep just that? But then sometimes it's convenient to have an array. You could have a local temporary array of references/pointers, but that would be slow (if it compiled into a true temporary array with everything indirected; actually the compiler might work it out), so you put the array directly into the definition via a union = optimization. (For me personally, v[0] is more convenient than x, y, z, as I'm used from the stone age to memory not having names, just offsets.) – Ped7g Mar 04 '17 at 18:16