How does a union determine the maximum size from a list of objects?

Question

I am not sure the question is well put: I understand what was done, but I don't know how to phrase the part I don't understand. Here it is:

I have some classes:

class Animal {};
class Rabbit : public Animal {};
class Horse : public Animal {};
class Mouse : public Animal {};
class Pony : public Horse {};

My goal was to find the maximum size among these classes in order to use it for memory allocation afterwards. I stored the sizeof of each class in an array and then took the maximum of the array. My superior (to whom I sent the code for review) suggested that I use a union in order to find the maximum size at compile time. The idea seemed very nice to me, so I did it like this:

typedef union
{
  Rabbit rabbitObject;
  Horse horseObject;
  Mouse mouseObject;
  Pony ponyObject;
} Size;

... because a union allocates memory according to its largest member. The next suggestion was to do it like this:

typedef union
{
   unsigned char RabbitObject[sizeof(Rabbit)];
   unsigned char HorseObject[sizeof(Horse)];
   unsigned char MouseObject[sizeof(Mouse)];
   unsigned char PonyObject[sizeof(Pony)];
} Interesting;

My question is:

How does the Interesting union get the maximum object size? To me, it makes no sense to put arrays of unsigned char, each of length sizeof(class), inside it. Why would the second option solve the problem when the previous union does not? What is happening behind the scenes that I am missing?

PS: Circumstances are such that I cannot ask him in person.

Thank you in advance

It seems to me like an `unsigned char RabbitObject[sizeof(Rabbit)];` would have the same size as `Rabbit`. I'm not clear on why you feel the arrays "make no sense". — François Andrieux, Mar 14 '18 at 14:37
I simply don't get it. I need the max size of all objects, so to me it's logical to create a union with one instance of each object inside, then take "sizeof(union)" in order to get the max. Why would I need an array of unsigned char of sizeof(class)? — lukuss, Mar 14 '18 at 14:41
If you can use c++17 it might be clearer to go back to a `constexpr` version of your first implementation. Something like `constexpr std::array sizes = {sizeof(Rabbit), sizeof(Horse), ...}; constexpr std::size_t maxSize = *std::max_element(std::begin(sizes), std::end(sizes));` — 0x5453, Mar 14 '18 at 14:43

SergeyA · Answer 1 · 2018-03-14T15:05:14.633

6

The assumptions are incorrect, and the question is moot. The Standard does not require the size of a union to equal the size of its largest member. Instead, it requires the union to be large enough to hold the largest member, which is not the same thing at all. Both solutions are flawed if the size of the largest class needs to be known exactly.

Instead, something like this should be used:

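#include <algorithm>    // std::max (constexpr since C++14)
#include <cstddef>      // std::size_t
#include <type_traits>  // std::integral_constant

template<class L, class Y, class... T> struct max_size
        : std::integral_constant<std::size_t, std::max(sizeof (L), max_size<Y, T...>::value)> { };
template<class L, class Y> struct max_size<L, Y>
        : std::integral_constant<std::size_t, std::max(sizeof (L), sizeof (Y))> { };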

As @Caleth suggested below, it could be shortened using the initializer list version of std::max (and a variable template):

template<class... Ts>
constexpr size_t max_size_v = std::max({sizeof(Ts)...});
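
A minimal sketch of how the variable template above might be applied to the classes from the question. The data members are hypothetical and exist only so that the sizes differ; the classes are written as structs for brevity, and kMaxAnimalSize is an illustrative name, not something from the original code.

```cpp
#include <algorithm>  // std::max (initializer_list overload, constexpr since C++14)
#include <cstddef>    // std::size_t

// Variable template from the answer (requires C++14).
template<class... Ts>
constexpr std::size_t max_size_v = std::max({sizeof(Ts)...});

// Hypothetical members, added only to give the types different sizes.
struct Animal { int legs = 4; };
struct Rabbit : Animal { char ears[2] = {}; };
struct Horse  : Animal { double speed = 0.0; };
struct Mouse  : Animal { };
struct Pony   : Horse  { short height = 0; };

// Exactly the largest sizeof among the listed types, known at compile time.
constexpr std::size_t kMaxAnimalSize = max_size_v<Rabbit, Horse, Mouse, Pony>;

// Every listed type fits in kMaxAnimalSize bytes.
static_assert(kMaxAnimalSize >= sizeof(Rabbit), "Rabbit must fit");
static_assert(kMaxAnimalSize >= sizeof(Horse),  "Horse must fit");
static_assert(kMaxAnimalSize >= sizeof(Mouse),  "Mouse must fit");
static_assert(kMaxAnimalSize >= sizeof(Pony),   "Pony must fit");
```

Unlike sizeof applied to a union, the value here is one of the listed sizeof results and cannot include extra padding introduced by a wrapper type.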

edited Mar 14 '18 at 15:05

answered Mar 14 '18 at 14:38

SergeyA

which assumption? about the union? Correct me please, I really want to understand this, even if it's a simple thingy. – lukuss Mar 14 '18 at 14:43
@lukuss, yes. An assumption that union size is equal to the size of the latest element is incorrect. – SergeyA Mar 14 '18 at 14:45
latest? No, greatest. I meant that if I have a union with an "unsigned char" and an "int", the size of the union would be the size of the int, right? – lukuss Mar 14 '18 at 14:48
improvement: `template<class... Ts> struct max_size : std::integral_constant<size_t, std::max({sizeof(Ts)...})> {};` – Caleth Mar 14 '18 at 14:59
@Caleth, good point. Always forget about initializer list version of it. Will add. – SergeyA Mar 14 '18 at 15:02
Just curious, why do you think that using `sizeof()` on the various types and getting the maximum is any different of a result than the `union` approach? Admittedly the `union` approach is a C style tactic. – Richard Chambers Mar 15 '18 at 16:03
@RichardChambers Because union is not guaranteed to give you exact size. – SergeyA Mar 15 '18 at 16:13
What do you mean by "exact size"? – Richard Chambers Mar 15 '18 at 16:31
@RichardChambers those words have a defined meaning. Exact size means exact size of the largest class. Union is only going to give you the size sufficient to hold the largest class, which is not the same thing – SergeyA Mar 15 '18 at 17:29
What is your interpretation of the following. "The size of a union is sufficient to contain the largest of its non-static data members. **Each non-static data member is allocated as if it were the sole member of a struct.** All non-static data members of a union object have the same address." Does this mean that the size of a `union` is the size of the largest member of the union? (section 9.5 of the standard). – Richard Chambers Mar 15 '18 at 18:24
@RichardChambers, I do not see how it could mean that. The fact that each non-static member is allocated as if... tells us **nothing** about the size of the union. – SergeyA Mar 15 '18 at 18:28
You might have one member that’s, say, 64 bits in size, and needs 64 bit ***alignment***, and another, say 72 bits in size, that only needs 16 bit alignment, but the union size then needs to be rounded up to 128 bits to satisfy the alignment requirement of the 64-bit-sized member. – Will Crawford Mar 17 '18 at 17:44

NathanOliver · Answer 2 · 2018-03-14T14:49:10.950

0

Your superior suggested you use the array version because a union could have padding. For instance, if you have

union padding {
    char arr[sizeof (double) + 1];
    double d;
};

Then this could either be of size sizeof(double) + 1 or it could be sizeof(double) * 2, as the union could be padded to keep it aligned for doubles (Live example).

However if you have

union padding {
    char arr[sizeof(double) + 1];
    char d[sizeof(double)];
};

Then the union need not be double aligned, and it most likely has a size of sizeof(double) + 1 (Live example). This is not guaranteed, though, and the size can be greater than that of its largest element.
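
The two unions above can be checked at compile time with a sketch like the following. The names WithDouble, BytesOnly and kLargestMember are illustrative only; the assertions state what is guaranteed, and the comments note what is merely typical.

```cpp
#include <cstddef>  // std::size_t

// Same layout as the first example: one member needs double alignment.
union WithDouble {
    char arr[sizeof(double) + 1];
    double d;
};

// Same layout as the second example: only char arrays, so alignment is 1 in practice.
union BytesOnly {
    char arr[sizeof(double) + 1];
    char d[sizeof(double)];
};

// The largest member of both unions.
constexpr std::size_t kLargestMember = sizeof(double) + 1;

// Guaranteed: the union is large enough to hold its largest member.
static_assert(sizeof(WithDouble) >= kLargestMember, "must hold the char array");
static_assert(sizeof(BytesOnly)  >= kLargestMember, "must hold the char array");

// Guaranteed: the size is a multiple of the alignment, so a union containing a
// double may be rounded up past its largest member.
static_assert(sizeof(WithDouble) % alignof(WithDouble) == 0, "size is a multiple of alignment");
static_assert(alignof(WithDouble) >= alignof(double), "at least as strict as double");

// Not guaranteed: on a typical x86-64 ABI sizeof(WithDouble) is 16 and
// sizeof(BytesOnly) is 9, but neither value is promised by the Standard.
```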

If you want to be sure to get the largest size, I would suggest using

auto max_size = std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});
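
Since the question mentions using the result for memory allocation, here is a rough sketch of one way the std::max value could size a raw buffer. It assumes alignment also matters, which the question does not state; AnimalSlot, construct_in, destroy, kMaxSize and kMaxAlign are hypothetical names, and the data members are invented so the sizes differ.

```cpp
#include <algorithm>  // std::max (initializer_list overload)
#include <cstddef>    // std::size_t
#include <new>        // placement new

// Hypothetical members, added only to give the types different sizes.
struct Animal { int legs = 4; };
struct Rabbit : Animal { char ears[2] = {}; };
struct Horse  : Animal { double speed = 0.0; };
struct Mouse  : Animal { };
struct Pony   : Horse  { short height = 0; };

// Largest size and strictest alignment among the listed types.
constexpr std::size_t kMaxSize =
    std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});
constexpr std::size_t kMaxAlign =
    std::max({alignof(Rabbit), alignof(Horse), alignof(Mouse), alignof(Pony)});

// Raw storage able to hold any one of the listed types.
struct AnimalSlot {
    alignas(kMaxAlign) unsigned char bytes[kMaxSize];
};

// Constructs a T inside the slot; the caller must later call destroy().
template<class T>
T* construct_in(AnimalSlot& slot) {
    static_assert(sizeof(T) <= kMaxSize, "T does not fit in the slot");
    static_assert(alignof(T) <= kMaxAlign, "slot is under-aligned for T");
    return ::new (static_cast<void*>(slot.bytes)) T();
}

// Ends the lifetime of an object created by construct_in().
template<class T>
void destroy(T* object) {
    object->~T();
}
```

A caller would write something like `AnimalSlot slot; Pony* p = construct_in<Pony>(slot); destroy(p);`. Note that sizeof(AnimalSlot) itself may exceed kMaxSize because of alignment, which is the same effect the unions above exhibit.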

edited Mar 14 '18 at 14:49

answered Mar 14 '18 at 14:44

NathanOliver

Both union approaches are wrong. A union has no obligation to have its size equal to that of its largest element. – SergeyA Mar 14 '18 at 14:46
@SergeyA I know that. I'm explaining why the superior said what they did – NathanOliver Mar 14 '18 at 14:47
But it was a mistake anyway. Both approaches (array and no array) yield the same result as far as the Standard is concerned: a size **sufficient** to hold the largest member, but not necessarily exactly equal to it. – SergeyA Mar 14 '18 at 14:48
@NathanOliver The superior(architect) suggested the union(not the array, the array was my solution), so I cannot do anything about it. I just wanted to understand why my union was wrong and his was right. And now I see that both are wrong. – lukuss Mar 14 '18 at 14:55
@lukuss That's sad. Try to talk them out of it if you can as they are relying on behavior that could change in the future. – NathanOliver Mar 14 '18 at 14:57
@SergeyA Nathan’s answered the question (correctly), why are you waling on him? :o) – Will Crawford Mar 17 '18 at 17:57

Richard Chambers · Accepted Answer · 2018-03-16T12:02:17.213

The two approaches provide a way to find a maximum size that all of the objects of the union will fit within. I would prefer the first as it is clearer as to what is being done and the second provides nothing that the first does not for your needs.

And the first, a union composed of the various classes, offers the ability to access a specific member of the union as well.

See also "Is a struct's address the same as its first member's address?" as well as "sizeof a union in C/C++" and "Anonymous union and struct [duplicate]".

For some discussions on memory layout of classes see the following postings:

Since the compiler is free to add to the sizes of the various components in order to align variables on particular memory address boundaries, the size of the union may be larger than the actual size of the data. Some compilers offer a pragma or other type of directive to instruct the compiler as to whether packing of the class, struct, or union members should be done or not.

The size as reported by sizeof() will be the size of the variable or type specified; however, this may again include additional unused memory to pad the variable to the next desirable memory address alignment. See "Why isn't sizeof for a struct equal to the sum of sizeof of each member?".

Typically a class, struct, or union is sized so that if an array of the type is created then each element of the array will begin on the most useful memory alignment such as a double word memory alignment for an Intel x86 architecture. This padding is typically on the end of the variable.
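
As a small illustration of the trailing padding described above, consider the sketch below. Sample is a hypothetical type; the assertions only state what holds on any conforming implementation, and the concrete number in the comment is typical rather than guaranteed.

```cpp
#include <cstddef>  // std::size_t

// Hypothetical type: a double followed by a single char.
struct Sample {
    double value;
    char tag;
};

// The members need sizeof(double) + 1 bytes, but sizeof(Sample) may be larger.
static_assert(sizeof(Sample) >= sizeof(double) + sizeof(char), "holds both members");

// The size is a multiple of the alignment, so every element of an array of
// Sample starts on a suitably aligned address.
static_assert(sizeof(Sample) % alignof(Sample) == 0, "size is a multiple of alignment");

// Array elements are contiguous: there is no padding between them beyond what
// is already included in sizeof(Sample).
static_assert(sizeof(Sample[4]) == 4 * sizeof(Sample), "array has no extra padding");

// On a typical x86-64 ABI sizeof(Sample) is 16: 8 bytes for the double,
// 1 for the char, and 7 bytes of trailing padding.
```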

The classes are empty now. But what if the classes have methods and attributes? (They surely will.) What if we have specific constructors inside the classes? Does my first method still stand? (This is one of the reasons my superior recommended the second union.) — lukuss, Mar 16 '18 at 07:26
@lukuss see the following postings and discussions: [Structure of a C++ Object in Memory Vs a Struct](https://stackoverflow.com/questions/422830/structure-of-a-c-object-in-memory-vs-a-struct), [What does an object look like in memory? duplicate ](https://stackoverflow.com/questions/12378271/what-does-an-object-look-like-in-memory), [memory layout C++ objects closed](https://stackoverflow.com/questions/1632600/memory-layout-c-objects) as well as [The memory model in C++ - Rainer Grimm - Meeting C++ 2016 video 54 mins](https://www.youtube.com/watch?v=e0DsVqZLMzU). — Richard Chambers, Mar 16 '18 at 11:48
@lukuss I really don't see any benefit to the array version because `sizeof()` will provide the same size as the `class` in the union. Class methods are not part of the class memory allocation, though the pointer to the vtable, added if the class has virtual functions, is. The basic rule of thumb is that invariant parts of a class such as methods are not part of the `class` memory allocation, while variant parts of a class such as data and object-specific management data such as the vtable pointer are part of the `class` memory allocation used to instantiate an object of that class. — Richard Chambers, Mar 16 '18 at 12:10
The alignment is the maximum of what’s most useful for each member, for the architecture (ABI, really) being compiled for. Padding may appear anywhere *but* the beginning (because inherited C behaviour means that in a `struct` you can cast a pointer to the struct, to one to its first member; it's one way inheritance is done “by hand” in C). — Will Crawford, Mar 17 '18 at 17:51
