In the following code, can the value of `u.i` be predicted (and if so, how), or is it just garbage?

#include <iostream>

union a
{
    int i;
    char ch[2];
};

int main()
{
    a u;
    u.ch[0] = 0;
    u.ch[1] = 0;
    std::cout << u.i;  // reads a member other than the one last written
}
timrau
cirronimbo
  • possible duplicate of [Set all bytes of int to (unsigned char)0, guaranteed to represent zero?](http://stackoverflow.com/questions/11138188/set-all-bytes-of-int-to-unsigned-char0-guaranteed-to-represent-zero) – R. Martinho Fernandes Aug 10 '12 at 07:34
  • It's undefined behavior, but it might still work. – askmish Aug 10 '12 at 07:35
  • @R.MartinhoFernandes: Somewhat related, but not a duplicate at all, in my opinion. – Gorpik Aug 10 '12 at 07:39
  • @Gorpik what's the difference? – R. Martinho Fernandes Aug 10 '12 at 07:40
  • @R.MartinhoFernandes: For one, this deals with a `union`, the other question with a plain `int`. Additionally, the `char`s in the `union` don't necessarily cover the same memory as the `int` (in most implementations they actually don't). As Luchian and askmish correctly state, this is UB, while the other is not. – Gorpik Aug 10 '12 at 07:43
  • Oh, you're right. I missed that this only sets two bytes. But I don't think this is UB, because I don't see a difference between `std::memset (reinterpret_cast<char*> (&u.i), (unsigned char)0, 2);` and `char* p = &u.ch; std::memset (p, (unsigned char)0, 2);`. – R. Martinho Fernandes Aug 10 '12 at 07:45
  • @LuchianGrigore: Its not coming UB, but less. – cirronimbo Aug 10 '12 at 07:45
  • @R.MartinhoFernandes: They're talking about setting all 4 bytes to zero, which will surely result in a zero integer value. I know that; even if I do u.ch[] = { 0,0,0,0 }, I get zero. But the problem here is that only two bytes are involved. – cirronimbo Aug 10 '12 at 07:46
  • That's not how you use a union. But GCC, for example, explicitly allows this kind of type-punning as an extension. Anyway, if `sizeof(int)>2`, one still cannot say what value `u.i` would have. – sellibitze Aug 10 '12 at 07:46
  • @cirronimbo - It will likely, but not *surely* result in a zero `int`. There are no guarantees in the language that all bits in an object are part of the value. The `int` *could* have some type bits, telling the machine that it is an `int`. – Bo Persson Aug 10 '12 at 07:58
  • possible duplicate of [A question about union in C](http://stackoverflow.com/questions/1812348/a-question-about-union-in-c) – Griwes Aug 11 '12 at 14:43

1 Answer

That depends on the sizes of int and char. A union is as large as its largest member. If int is 4 bytes and char[2] is 2 bytes, the int occupies more memory than the char array, so setting every element of the array does not set all of the int's bytes to 0. The remaining 2 bytes hold unspecified values (in practice, whatever happened to be in that memory), so the value of the int will appear to be random.
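The effect can be reproduced without a union at all. The following is only a sketch: it assumes `unsigned int` is at least 2 bytes wide, the helper name `zero_first_two_bytes` is made up for illustration, and it uses `std::memcpy` on a byte buffer so that nothing is read through an inactive union member.

```cpp
#include <cstring>

static_assert(sizeof(unsigned int) >= 2, "sketch assumes unsigned int has at least 2 bytes");

// Hypothetical helper: zero only the first two bytes (lowest addresses)
// of an unsigned int and return the resulting value.
unsigned int zero_first_two_bytes(unsigned int start)
{
    unsigned char bytes[sizeof start];
    std::memcpy(bytes, &start, sizeof start);    // copy out the object representation
    bytes[0] = 0;                                // same two writes as u.ch[0] = 0
    bytes[1] = 0;                                // and u.ch[1] = 0
    unsigned int result;
    std::memcpy(&result, bytes, sizeof result);  // copy the modified bytes back
    return result;
}
```

If `unsigned int` is wider than 2 bytes, `zero_first_two_bytes(~0u)` is non-zero, because the untouched bytes keep their previous contents. Which half of the value survives depends on the byte order of the machine.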

Besides, writing one member of a union and reading another is exactly what makes unions unsafe, in my opinion.

If you are sure that int is the largest datatype, you can initialize the whole union by writing:

#include <iostream>

union a
{
    int i;
    char ch[2];
};

void foo()
{
    a u = { 0 };  // Initializes the first field in the union
    std::cout << u.i;
}

Therefore it may be a good idea to place the largest type at the beginning of the union, although that doesn't guarantee that every datatype can be considered zero or empty when all of its bits are set to 0.
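The "if you are sure" part can be checked by the compiler. This is a minimal sketch, assuming the same union as above; the function name `zero_initialized_value` is invented for the example.

```cpp
union a
{
    int i;        // largest member, listed first so that { 0 } initializes it
    char ch[2];
};

// Compilation fails if int does not span the whole union.
static_assert(sizeof(a) == sizeof(int), "int is expected to span the whole union");

// Hypothetical helper: initialize through the first member and read it back.
int zero_initialized_value()
{
    a u = { 0 };
    return u.i;   // i is the member that was initialized, so this read is fine
}
```

If a larger member is added later, the `static_assert` stops the build instead of silently leaving part of the union uninitialized.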

Excelcius
  • And it's coincidentally one of the two primary uses of `union`. (The other being simple polymorphism, which is irrelevant in C++.) – Stefan Majewsky Aug 10 '12 at 07:52
  • In C++, that use of a `union` is quite useless, since it can be replaced by `reinterpret_cast`. Really useful in C though. – Morwenn Aug 10 '12 at 07:55
  • The result also depends on the processor's endianness. For example, different results will be observed when the code is compiled for MIPS (where ch[0] will be the most significant byte) and for x86 processors (which are little-endian, so ch[0] will be the least significant byte). – Maksim Skurydzin Aug 10 '12 at 07:57
  • Accessing a member of the union other than the last one stored into does not cause undefined behavior. Per C 1999 6.2.6.1 7, the bytes of a member that do not correspond to bytes of the member stored into take unspecified values. Thus, the program must behave as if the member has **some** value; which is different from undefined behavior because the latter permits any behavior. (There can be additional complications from trap representations, which I will not detail here, and they generally do not apply to simple integer types.) – Eric Postpischil Aug 10 '12 at 10:55
  • You are right, thanks, it's not really undefined behavior. I edited my answer. I just thought of it as undefined behavior because the program might expect the int to be initialized to 0, while the value can be anything, really. – Excelcius Aug 10 '12 at 11:37