Is this interpretation of union behavior accurate?

Question

^^^ THIS QUESTION IS NOT ABOUT TYPE PUNNING ^^^

It is my understanding that an object contained in a union can only be used if it is active, and that it is active iff it was the last member to have a value stored to it. This suggests that the following code should be undefined at the points I mark.

My question is whether I am correct in my understanding of when it is defined to access a member of a union, particularly in the following situations.

#include <stddef.h>
#include <stdio.h>

void work_with_ints(int* p, size_t k)
{
   size_t i = 1;
   for(;i<k;++i) p[i]=p[i-1];
}

void work_with_floats(float* p, size_t k)
{
   size_t i = 1;
   for(;i<k;++i) p[i]=p[i-1];
}

int main(void)
{

   union{ int I[4]; float F[4]; } u;

   // this is undefined because no member of the union was previously
   // selected by storing a value to the union object
   work_with_ints(u.I,4);
   printf("%d %d %d %d\n",u.I[0],u.I[1],u.I[2],u.I[3]);

   u.I[0]=1; u.I[1]=2; u.I[2]=3; u.I[3]=4;

   // this is undefined because u currently stores an object of type int[4]
   work_with_floats(u.F,4);
   printf("%f %f %f %f\n",u.F[0],u.F[1],u.F[2],u.F[3]);

   // this is defined because the assignment makes u store an object of
   // type float[4], which is subsequently accessed
   u.F[0]=42.0;
   work_with_floats(u.F,4);
   printf("%f %f %f %f\n",u.F[0],u.F[1],u.F[2],u.F[3]);

   return 0;
}

Am I correct in the three items I have noted?

My actual example is too large to post here, but a comment suggested that I extend this example into something compilable. I compiled and ran the above with both clang (-Weverything -std=c11) and gcc (-pedantic -std=c11). Each gave the following:

0 0 0 0
0.000000 0.000000 0.000000 0.000000
42.000000 42.000000 42.000000 42.000000

That seems appropriate, but that does not mean the code is compliant.

EDIT:

To clarify what the code is doing, I will point out the exact instances where the property I mention in the first paragraph is applied.

First, the contents of an uninitialized union are read and modified. This is undefined behavior, rather than unspecified with a potential for UB with traps, if the principle I mention in the first paragraph is true.

Second, the contents of a union are used with the type of an inactive union member. Again, this is undefined behavior, rather than unspecified with a potential for UB with traps, if the principle I mention in the first paragraph is true.

Third, the item just mentioned as "second" produces unspecified behavior with a potential for UB with traps, if first one element of the array contained in the inactive member is modified. This makes the whole array the active member, hence the change in definedness.

I am demonstrating the consequences of the principle in the first paragraph of this question, to show how that principle, if correct, affects the nature of the C standard. Because of its significant effect on the nature of the standard in some circumstances, I am looking for help in determining whether the principle I have stated is a correct understanding of the standard.

EDIT:

I think it may help to describe how I get from the standard the principle in the first paragraph above, and how one might disagree. Not much is said on the matter in the standard, so there has to be some filling in of the gaps no matter what.

The standard describes a union as holding one object at a time. This seems to suggest treating it like a structure containing one element. It seems that anything deviating from that interpretation deserves mention. That is how I get to the principle I have stated.

On the other hand, the discussion of effective type does not define the term "declared type". If that term is understood such that union members do not have a declared type, then it could be argued that each subobject of a union need be interpreted as another member recursively. So, in the last example in my code, all floating point array members would need to be initialized, not just the first.
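Under the stricter reading just described, a conservative version of the last case would store to every float element before handing the array to the function. The following is only a sketch of that idea; `conservative_demo` is a hypothetical name, and `work_with_floats` is repeated from the code above so the fragment stands alone.

```c
#include <stddef.h>

/* Same loop as in the question: copies each element from its predecessor. */
static void work_with_floats(float *p, size_t k)
{
    for (size_t i = 1; i < k; ++i) p[i] = p[i - 1];
}

/* Hypothetical helper: stores to every element of F before any read,
   so no float element is read without having been written as a float. */
static float conservative_demo(void)
{
    union { int I[4]; float F[4]; } u;

    for (size_t i = 0; i < 4; ++i) u.F[i] = 0.0f;  /* touch all four elements */
    u.F[0] = 42.0f;
    work_with_floats(u.F, 4);
    return u.F[3];
}
```

Calling `conservative_demo()` from `main` should yield 42.0, the same value the original third case prints, without relying on a single store activating the whole array.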

The two examples I give of undefined behavior are important to me to resolve. However, the last example, which relates to the above paragraph, seems most crucial. I could really see an argument either way there.

EDIT:

This is not a type punning question. First, I am talking about writing to unions, not reading from them. Second, I am talking about the validity of doing these writes with a pointer rather than with the union type. This is very different from type punning issues.

This question is more closely related to strict aliasing than to type punning. Because of strict aliasing, you cannot access memory however you want. This question deals with exactly how unions ease the constraints of strict aliasing on their members. The standard never says that they do, but if they did not, you could never do something like the following.

union{int i;} u; u.i=0; function_working_with_an_int_pointer(&u.i);

So, clearly, unions affect the application of strict aliasing rules in some cases. My question is to confirm that the line I have drawn according to my reading of the standard is correct.
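The one-line example above can be expanded into a compilable sketch. Here `increment_through_pointer` is a hypothetical stand-in for `function_working_with_an_int_pointer`, and `single_member_demo` is likewise an invented name.

```c
/* Hypothetical stand-in for function_working_with_an_int_pointer:
   reads and writes the int through a plain pointer. */
static void increment_through_pointer(int *p)
{
    *p += 1;
}

/* Stores to the union's only member, then lets a function that knows
   nothing about the union modify it through an int pointer. */
static int single_member_demo(void)
{
    union { int i; } u;

    u.i = 0;
    increment_through_pointer(&u.i);
    return u.i;
}
```

Here the member reached through the pointer is the one that was last stored, so this is the uncontroversial case; the question is how far beyond it the same allowance extends.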

Having your questions as a code comment makes it hard for visitors to see what you're asking about; and it makes it harder for people helping you to follow what's going on. Move it from there to an actual question outside your code and remove that last paragraph. The fact that you had to type that was a smell that should have told you you were doing it wrong. — George Stocker, Jul 21 '16 at 04:50
@GeorgeStocker, I have made changes consistent with what you have said. Can you please take it off hold? — Kyle, Jul 21 '16 at 04:58
Edit of the question with additional code has turned things around completely, because the functions read in addition to writing. I deleted my answer, because reading from unassigned members is indeed UB. — Sergey Kalinichenko, Jul 21 '16 at 09:11
C11 draft standard n1570: *6.5.2.3 Structure and union members 3 A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member[...] 95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.* — EOF, Jul 21 '16 at 12:59
@dasblinkenlight, is it UB by omission only, or is there something more direct in the standard? — Kyle, Jul 21 '16 at 13:27
@EOF, I don't see how that footnote affects things. The punning is through pointers taken from the union, and not through the `.` or `->` operators. Also, in the first situation, unspecified bits are punned, so to speak, and that is not covered by that footnote either. — Kyle, Jul 21 '16 at 13:29
@Kyle: `F` and `I` in your union are not pointers, they're arrays. — EOF, Jul 21 '16 at 13:32
@EOF, they are pointers by the time the functions do anything with them. — Kyle, Jul 21 '16 at 13:59
In the latest edit, you mention *"the first paragraph from the standard"*. Please edit the question to specify the section/paragraph that you are referring to. — user3386109, Jul 21 '16 at 17:34
I meant that with the following emphasis. "I get the principle in the first paragraph **from** the standard". I will reword it to make that clear. — Kyle, Jul 21 '16 at 17:44
Possible duplicate of [Is type-punning through a union unspecified in C99, and has it become specified in C11?](http://stackoverflow.com/questions/11639947/is-type-punning-through-a-union-unspecified-in-c99-and-has-it-become-specified) — Gilles 'SO- stop being evil', Jul 21 '16 at 17:50
@Gilles, the behavior I am discussing is not type punning. I am talking about writing to a union member through a pointer, not reading from it. It is also significant that I am using a pointer rather than directly using the union type. — Kyle, Jul 21 '16 at 17:52
There is no sensible reason why the above shouldn't work, and the authors of the Standard intended that quality implementations would behave sensibly when practical even in cases where the Standard didn't mandate it. Unfortunately, a contrary mentality has become fashionable. — supercat, Jul 21 '16 at 19:12

score 0 · Answer 1 · answered Jul 21 '16 at 17:36

an object contained in a union can only be used if it is active, and that it is active iff it was the last member to have a value stored to it.

The statement is false. The behavior is reliable and defined.

  union {
    unsigned char  c [4];
    long           d;
  } v;

  v .d = 0xaabbccddL;

  printf ("%x\n", v .c [2]);

It is completely acceptable to access the c member even though it was not the last one assigned. On a little-endian machine it will show bb; on a big-endian machine with a 4-byte long, it will show cc.
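A variant of the snippet that does not depend on the width of `long` might look like the sketch below. Using `uint32_t` in place of `long` is an assumption made for this illustration, and `third_byte` is a hypothetical name.

```c
#include <stdint.h>

/* Returns the byte at index 2 of the object representation of 0xaabbccdd.
   Reading through an unsigned char member inspects the raw bytes. */
static unsigned third_byte(void)
{
    union {
        unsigned char c[4];
        uint32_t      d;   /* assumption: replaces long so the width is exactly 4 bytes */
    } v;

    v.d = UINT32_C(0xaabbccdd);
    return v.c[2];
}
```

With a 4-byte member the result is 0xbb on a little-endian machine and 0xcc on a big-endian one, which matches the claim above.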

wallyk

You are reading from a different member. That is defined. My example deals with writing to a different member, and doing so without directly using the union type. – Kyle Jul 21 '16 at 17:49
@Kyle: I don't understand: I write to one member and read from another. How would you alter my example to match your question? – wallyk Jul 21 '16 at 17:54
I'd change `c` to a non character type, because there are special allowances related to character types. Then I'd write to one member, then write to another through a pointer. My concern is whether or not that second write is valid. But that is an oversimplification of my question. I don't know how to give a more concise example than what I have given, that will properly demonstrate the issue. – Kyle Jul 21 '16 at 18:00
Important note: This is one of the areas where C and C++ differ. The cited statement is 100% correct in C++, the defined behavior is available only in C. – Ben Voigt Feb 21 '18 at 22:36
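The alteration Kyle describes in the comments above (a non-character member, a store to one member, then a store to the other member through a pointer) might be sketched as follows. The names `store_float` and `write_other_member_through_pointer` are hypothetical, and the sketch only shows the shape of the access pattern; it does not settle whether the Standard defines the second store.

```c
/* Hypothetical helper: stores through a plain float pointer,
   with no knowledge that the target lives inside a union. */
static void store_float(float *p, float value)
{
    *p = value;
}

/* Writes the int member first, then writes the float member
   through a pointer, and reads the float member back. */
static float write_other_member_through_pointer(void)
{
    union { int i; float f; } v;

    v.i = 1;                  /* the int member is the last one stored */
    store_float(&v.f, 2.5f);  /* the store whose validity is in question */
    return v.f;
}
```

Compilers commonly return the stored value here, but as the question itself notes, observed output does not show that the code is conforming.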
