Unexpected Union behaviour

Question

The code below outputs different numbers each time it runs.
apples.num prints 2, which is correct, but apples.weight prints a different number each time (it once even printed "nan"), and I don't know why this is happening.
The really strange thing is that the double (apples.volume) prints 2.0.

Can anybody explain this to me?

#include <stdio.h>

typedef union {
    short num;
    float weight;
    double volume;
} Count;

int main(int argc, char const *argv[])
{
    Count apples;
    apples.num = 2;

    printf("Num: %d\nWeight: %f\nVolume: %f\n", apples.num, apples.weight, apples.volume);
    return 0;
}

You only set the `short` part of that union to some value. The rest will contain random data on each re-run. — Jongware, Jul 18 '14 at 22:07
So what is a boy supposed to do? Just pick the weight of your apple out of the air — Ed Heal, Jul 18 '14 at 22:07
Hmm, it actually seems C is much more lax on this: *If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6.* and *When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.* — chris, Jul 18 '14 at 22:13
No conversion happens; your title is deceiving. A union merely defines a type that has the size of its largest member and holds only one of them at a time. You may think of it as a field that you fill using some encoding. You fill it with `2` using the `short` encoding; you may decode it as `short` again, but don't expect it to make sense when you use something else to decode it. — Utkan Gezer, Jul 18 '14 at 22:14
@Jongware Sooo .. a union has only one member that is useful for each variable, so for apples, I'd use apple.num, for drinks, drink.volume, etc. Then, how do I set the wanted representation for each variable, and not use the others, or at least generate a warning if I try to ? — Amr Ayman, Jul 18 '14 at 22:36
@AmrAyman, no, on the contrary: members of `union`s share the representation and interpret it differently. — Jens Gustedt, Jul 18 '14 at 22:40
There is no standard data type in C that contains "the same meaning in other representations". You cannot express all possible float values as integer, and neither the reverse. (If you could, one of the two types would be obsolete.) — Jongware, Jul 18 '14 at 22:42
@JensGustedt So, how do I define the **main** representation for a single union variable ? — Amr Ayman, Jul 18 '14 at 22:43
@AmrAyman : No. A union simply allows different types to be stored in the same variable. 2.0f has a different bit pattern to 2 - the union does not magically change the bit pattern depending on which member you access. Moreover short, float and double each have a different number of bits. — Clifford, Jul 18 '14 at 22:44
@Jongware Thanks, but I still can't really differentiate between structs and unions, please explain that .. — Amr Ayman, Jul 18 '14 at 22:45
@MattMcNabb, I figured there was still some problem like that. I'm just surprised because I'm pretty sure C++ has a lot of undefined behaviour around not reading the last written member. — chris, Jul 19 '14 at 00:15
@AmrAyman : Perhaps then you should be asking another question. The comments section is not a discussion forum. A `struct` contains all its members all the time in separate memory spaces. A union contains one of its members at a time in a single shared memory space. — Clifford, Jul 19 '14 at 08:07

Rudy Velthuis · Accepted Answer · 2014-07-19T08:46:02.810

It seems to me you don't quite understand what a union is. The members of a union are overlapping values (in other words, the three members of a Count union share the same space).

Assuming, just for the sake of demonstration, a short is 16 bits (2 bytes), a float is 32 bits (4 bytes) and a double is 64 bits (8 bytes), then the union is 8 bytes in size. In little-endian format, the num member refers to the first 2 bytes, the weight member refers to the first 4 bytes (including the 2 bytes of num) and the volume member refers to the full 8 bytes (including the 2 bytes of num and the four bytes of weight).
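
The sizes assumed above are typical but not guaranteed, so it can help to check them on your own platform. The following is a minimal sketch; the helper names `members_start_at_offset_zero` and `print_count_layout` are made up for illustration and are not part of the question's code.

```c
#include <stddef.h>
#include <stdio.h>

typedef union {
    short num;
    float weight;
    double volume;
} Count;

/* Returns 1 if every member starts at the very first byte of the union. */
int members_start_at_offset_zero(void)
{
    return offsetof(Count, num) == 0
        && offsetof(Count, weight) == 0
        && offsetof(Count, volume) == 0;
}

/* Prints the size of each member type and of the union itself. */
void print_count_layout(void)
{
    printf("short:  %zu bytes\n", sizeof(short));
    printf("float:  %zu bytes\n", sizeof(float));
    printf("double: %zu bytes\n", sizeof(double));
    printf("Count:  %zu bytes\n", sizeof(Count));
}
```

Calling `print_count_layout()` from `main` shows the actual sizes on your system; the union is always at least as large as its largest member.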

Initially, your union contains garbage, i.e. some unknown bit pattern, let's display it like this (in hex):

GG GG GG GG GG GG GG GG   // GG stands for garbage, i.e. unknown bit patterns

If you set num to 2, then the first two bytes are 0x02 0x00, but the other bytes are still garbage:

02 00 GG GG GG GG GG GG

If you read weight, you are simply reading the first four bytes, interpreted as a float, so the float contains the bytes

02 00 GG GG

Since floating point values have a totally different format from integral types like short, you can't predict what those bytes (i.e. that particular bit pattern) represent. They do not represent the floating point value 2.0f, which is what you probably want. Actually, the "more significant" part of a float is stored in the upper bytes, i.e. in the "garbage" part of weight, so it can be almost anything, including a NaN, +infinity, -infinity, etc.
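
To see how different the two formats are, here is a small sketch that copies raw bits in and out of a float. It assumes that `float` is a 32-bit IEEE 754 value, which is true on most platforms but not required by the C standard; the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copies the raw bytes of a float into a 32-bit integer (no conversion). */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Copies a 32-bit pattern into a float (again, no conversion). */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Prints a few patterns; assumes 32-bit IEEE 754 floats. */
void print_float_patterns(void)
{
    printf("2.0f as bits:        0x%08lX\n",
           (unsigned long)float_to_bits(2.0f));
    printf("0x00000002 as float: %g\n", bits_to_float(0x00000002u));
    printf("0x7FC00000 as float: %f\n", bits_to_float(0x7FC00000u));
}
```

Under that assumption, 2.0f is stored as the pattern 0x40000000, which has nothing in common with the integer pattern for 2, and a pattern such as 0x7FC00000 decodes to a NaN.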

Similarly, if you read volume, you have a double that consists of the bytes

02 00 GG GG GG GG GG GG

and that does not necessarily represent 2.0 either (although, by chance, it MAY come very close, if by coincidence the right bits are set at the right places, and if the low bits are rounded away when you display such a value).
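
The byte diagrams above can be reproduced with a sketch like the following. Real garbage is unpredictable, so this version fills the union with a known byte first as a stand-in for "GG". The function names are invented for illustration. Note that the C standard leaves the bytes outside `num` unspecified after the store, so seeing the fill byte survive is typical behaviour rather than a guarantee.

```c
#include <stdio.h>
#include <string.h>

typedef union {
    short num;
    float weight;
    double volume;
} Count;

/* Fills a Count with a known byte, stores 2 in num, and copies
   the raw bytes of the union to out (sizeof(Count) bytes). */
void bytes_after_setting_num(unsigned char fill, unsigned char *out)
{
    Count apples;
    memset(&apples, fill, sizeof apples);  /* stand-in for "garbage" */
    apples.num = 2;
    memcpy(out, &apples, sizeof apples);
}

/* Reads a short back out of the first bytes of a raw buffer. */
short num_from_bytes(const unsigned char *raw)
{
    short n;
    memcpy(&n, raw, sizeof n);
    return n;
}

/* Prints the bytes in hex, in the same style as the diagrams. */
void print_bytes_after_setting_num(unsigned char fill)
{
    unsigned char raw[sizeof(Count)];
    size_t i;

    bytes_after_setting_num(fill, raw);
    for (i = 0; i < sizeof raw; i++)
        printf("%02X ", raw[i]);
    printf("\n");
}
```

Calling `print_bytes_after_setting_num(0xAA)` prints the bytes of `num` followed by whatever the remaining bytes hold; the order of the two `num` bytes depends on the platform's endianness.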

Unions are not meant to do a proper conversion from int to float or double. They are merely meant to let a single type store different kinds of values, and reading from a member other than the one you set simply means you are reinterpreting the bits that are present in the union as something completely different. You are not converting.

So how do you convert? It is quite simple and does not require a union:

short num = 2;
float weight = num; // the compiler inserts code that performs a conversion to float
double volume = num; // the compiler inserts code that performs a conversion to double
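
If the goal is instead to let each variable hold exactly one kind of value and to remember which one, a common pattern is a "tagged union": a struct that pairs the union with an enum recording the member last written. The sketch below is one possible shape, not something from the question; `TaggedCount`, `CountKind` and the helper functions are hypothetical names.

```c
#include <stdio.h>

typedef enum { COUNT_NUM, COUNT_WEIGHT, COUNT_VOLUME } CountKind;

typedef struct {
    CountKind kind;          /* which member of "as" is currently valid */
    union {
        short num;
        float weight;
        double volume;
    } as;
} TaggedCount;

TaggedCount count_from_num(short n)
{
    TaggedCount c;
    c.kind = COUNT_NUM;
    c.as.num = n;
    return c;
}

TaggedCount count_from_weight(float w)
{
    TaggedCount c;
    c.kind = COUNT_WEIGHT;
    c.as.weight = w;
    return c;
}

TaggedCount count_from_volume(double v)
{
    TaggedCount c;
    c.kind = COUNT_VOLUME;
    c.as.volume = v;
    return c;
}

/* Reads only the member that was last written, then converts it properly. */
double count_as_double(TaggedCount c)
{
    switch (c.kind) {
    case COUNT_NUM:    return (double)c.as.num;
    case COUNT_WEIGHT: return (double)c.as.weight;
    case COUNT_VOLUME: return c.as.volume;
    }
    return 0.0;  /* not reached for valid tags */
}
```

With this layout, `count_as_double(count_from_num(2))` performs a real conversion from the stored short, because the tag tells the code which member is safe to read.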

It may be worth noting that the bits that have been set are the least significant 16 bits of the significand. The bits that control whether you have a NaN, Infinity, large finite, tiny finite, positive, negative etc. remain garbage. — Patricia Shanahan, Jul 18 '14 at 23:29
I could do that, but I am not sure if mentioning the terms `exponent`, `significand`, etc. would not confuse the reader more than it helps him. He just has to know that floating point values have a different format. — Rudy Velthuis, Jul 18 '14 at 23:32
I phrased my comment in terms I thought you would understand. You are doing a great job of expressing information in more basic terms. Meanwhile, has anyone actually said how to correctly convert an integer value to the corresponding float or double value? — Patricia Shanahan, Jul 18 '14 at 23:38
@RudyVelthuis so, when I do this: `apples = {0};`, that defaults the whole 8 bytes to 0 ? — Amr Ayman, Jul 18 '14 at 23:52
@AmrAyman: note that if you set one of the members of a union, you are overwriting (or partly overwriting) the other members, i.e. you can only use one of the members and should not access the others. If you need to store `num`, `weight` and `volume` separately, you should use a `struct` to keep them together. — Rudy Velthuis, Jul 19 '14 at 00:07

score 2 · Answer 2 · answered Jul 19 '14 at 08:20

If you access a union via the "wrong" member (i.e. a member other than the one it was assigned through), the result will depend on the semantics of the particular bit pattern for that type. Where the assigned type has a smaller bit-width than the accessed type, some of those bits will be undefined.

answered Jul 19 '14 at 08:20

Clifford

@AmrAyman : I am not sure what you mean. I suspect that what you actually need is a type cast. – Clifford Jul 19 '14 at 19:38

score 1 · Answer 3 · edited May 23 '17 at 12:32

You are accessing uninitialized data. This results in undefined behavior (i.e. unknown values in this case). You also likely mean to use a struct instead of a union.

#include <stdio.h>

typedef union {
    short num;
    float weight;
    double volume;
} Count;

int main(int argc, char const *argv[])
{
    Count apples = { 0 };
    apples.num = 2;

    printf("Num: %d\nWeight: %f\nVolume: %f\n", apples.num, apples.weight, apples.volume);
    return 0;
}

Initialize the union by either zeroing it out or setting the largest member to a value. Even if you set the largest member, the other members might not hold values that make sense. Unions are commonly used to create a byte/word/nibble/long-word data type and to make its individual bits accessible.
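
As an illustration of that last use, here is a minimal sketch of a union that exposes the individual bytes of a 32-bit word. The type `Word32` and the helpers are hypothetical names. Reading the `unsigned char` array after writing `word` is permitted in C, but the order in which the bytes appear depends on the platform's endianness.

```c
#include <stdint.h>
#include <stdio.h>

typedef union {
    uint32_t word;                          /* the value as a whole */
    unsigned char bytes[sizeof(uint32_t)];  /* the same storage, byte by byte */
} Word32;

/* Adds up the individual bytes of a 32-bit value. */
unsigned byte_sum(uint32_t value)
{
    Word32 w;
    unsigned sum = 0;
    size_t i;

    w.word = value;
    for (i = 0; i < sizeof w.bytes; i++)
        sum += w.bytes[i];
    return sum;
}

/* Prints the bytes in memory order (which depends on endianness). */
void print_word_bytes(uint32_t value)
{
    Word32 w;
    size_t i;

    w.word = value;
    for (i = 0; i < sizeof w.bytes; i++)
        printf("%02X ", w.bytes[i]);
    printf("\n");
}
```

Here the reinterpretation is intentional: both members describe the same bits, and no conversion is expected.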

edited May 23 '17 at 12:32

Community

answered Jul 18 '14 at 22:07

Cloud

In all fairness, this is still not an *int to float conversion*, as the OP appears to think it is. – Jongware Jul 18 '14 at 22:10
He could always do something like `apples.weight=1.0f` and check for validity. – Cloud Jul 18 '14 at 22:10
