2

Let's say that I have the following C++ code:

struct something
{
  // ...
  union { int size, length; };
  // ...
};

This would create two members of the struct which access the same value: size and length.

Would treating the two members as complete aliases (i.e. setting the size, then accessing the length, and vice versa) be undefined behaviour? Is there a "better" way to implement this type of behaviour, or is this an acceptable implementation?

timrau
Undeterminant
  • Why not just have one of the two? This will just cause confusion, especially because there can be a difference between size and length for certain containers. – Overv Mar 07 '13 at 17:29
  • It will work but it will certainly confuse anyone who's looking at code using `something`, but not looking at the struct definition. – Drew Dormann Mar 07 '13 at 17:33
  • @DrewDormann - I agree. Why make things more confusing than they have to be? – Steve Wellens Mar 07 '13 at 17:35
  • @Overv I used size and length as a simple example to demonstrate the point, but the main reason is because I was experimenting with an N-dimensional vector/point class and I wanted to be able to access values with x, y, and z while the variables are actually named something like vec<0>::val, vec<1>::val, etc. in the background. – Undeterminant Mar 07 '13 at 17:37
  • @AlexCharron Make x, y and z member functions that return those values. As Luchian points out, writing to one union member and reading from another is technically undefined behavior in C++. But I'd be very surprised to find *any* C++ compiler that behaved unexpectedly when you do that. – Praetorian Mar 07 '13 at 17:40
  • @LuchianGrigore It appears that this is answered in one of the unaccepted answers for that question, but this question is slightly different from that one, I think. – Undeterminant Mar 07 '13 at 17:46

3 Answers

5

It is not undefined behavior. Both of the aliases in the union will be accessing the same location in the memory. See below:

§9.2/18 If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.

It is undefined if the types have different initial sequences.
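Note that the quoted paragraph speaks of standard-layout structs, not bare scalars. A minimal sketch of how the question's example could be brought under that rule, assuming each `int` is wrapped in its own single-member struct (the names `as_size`, `as_length`, `s`, `l` and the helper function are hypothetical, introduced only for illustration):

```cpp
// Hypothetical wrapper types: each holds the single int under a different name.
// Both are standard-layout structs whose common initial sequence is that int.
struct as_size   { int size; };
struct as_length { int length; };

struct something
{
  // Anonymous union of two standard-layout structs.
  union { as_size s; as_length l; };
};

// Writes through one wrapper, then reads through the other.
int read_length_after_writing_size(int value)
{
  something obj;
  obj.s = as_size{value};  // s is now the active member
  return obj.l.length;     // inspects the common initial part of l
}
```

The cost is the extra level of naming (`obj.s.size` rather than `obj.size`), so this is a trade-off rather than a drop-in replacement.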

meyumer
  • It is undefined behavior - see http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-undefined – Luchian Grigore Mar 07 '13 at 17:34
  • But that states that it's undefined because it's a byte-to-byte copy, which I believe is still valid if it's from the same type to the next. This is a sort of grey area, which is why I asked. – Undeterminant Mar 07 '13 at 17:40
  • it's a bit odd that if you put both `size` and `length` inside of a struct (but both still in the union, separately), it might actually be defined...but having them naked seems not to be – Stephen Lin Mar 07 '13 at 17:43
  • It is defined behavior. It is undefined if types have different initial sequence. – meyumer Mar 07 '13 at 17:43
  • @meyumer at least according to the quotes on that page, that only applies to "standard-layout structs", unless there's another standard quote that addresses this for scalar types (do you have one?) – Stephen Lin Mar 07 '13 at 17:45
  • @LuchianGrigore: there is no type punning here because both `size` and `length` have the same type; or was it another issue that you were addressing? – Matthieu M. Mar 07 '13 at 17:53
  • @meyumer: Actually, I had mis-numbered the quote, it is */18* so you might want to correct it. – Matthieu M. Mar 08 '13 at 07:48
5

Yes, this is allowed and well-defined. According to §3.10 [basic.lval]:

10/ If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

— the dynamic type of the object

[...]

Since here we store an int and read through an int, we access the object through a glvalue of the same type as the object's dynamic type, so things are fine.


There is even a special provision in the Standard for structures that share the same prefix. Or, in standardese, standard-layout types that share a common initial sequence.

§9.2/18 If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.

That is:

struct A { unsigned size; char type; };
struct B { unsigned length; unsigned capacity; };

union { A a; B b; } x = { { 42, 'c' } };  // initialises a, making it the active member

assert(x.a.size == x.b.length);

EDIT: Given that int is not a struct (nor a class) I am afraid it's actually not formally defined (I certainly could not see anything in the Standard), but should be safe in practice... I've brought the matter to the isocpp forums; you might have found a hole.

EDIT: Following the above-mentioned discussion, I have been shown §3.10/10.

Community
Matthieu M.
  • does a scalar type count as a "standard-layout struct"? – Stephen Lin Mar 07 '13 at 17:48
  • @StephenLin: Not sure (not `struct`), they are certainly *standard-layout types* and they are *layout-compatible* though. – Matthieu M. Mar 07 '13 at 17:55
  • OK, your example in the answer is clearly defined but the example in his question might not be (strictly, it might be a defect in the standard), the exception only seems to apply if you access through members of "standard layout structs" which is not the case if they're naked, unless there's some special language that says a scalar type is equivalent to its corresponding single-member struct – Stephen Lin Mar 07 '13 at 17:58
  • @StephenLin: I think there is a guarantee on the representation of *standard-layout* struct containing a single member; but I agree I would prefer a more explicit statement regarding "naked" types. It seems obvious given §9.2/19; but explicit trumps "obvious". – Matthieu M. Mar 07 '13 at 18:01
  • I agree, if you put two and two together there doesn't seem to be any way for an implementation to satisfy the requirements and *not* have this behavior, short of manually interfering with lookup on purpose, but it would be better if there were an explicit guarantee (it seems like an oversight) – Stephen Lin Mar 07 '13 at 18:02
  • @StephenLin: or maybe I just missed it :( It's so hard to navigate the standard :( – Matthieu M. Mar 07 '13 at 18:07
  • @StephenLin I don't see any requirements that would be violated if the runtime kept track of the active member and inserted code on every read of a union member to check if that member is active and crash or zero out the member if it's not. – bames53 Mar 07 '13 at 18:58
  • @bames53 yeah, that's the "short of manually interfering with lookup on purpose" part; sure, it's legal, but it's not sane (especially since you have to support the common initial sequence case anyway) – Stephen Lin Mar 07 '13 at 18:59
  • @StephenLin Okay, I think that's sufficient to say that the behavior is not defined. I'm not sure if it quite rises to the level of 'Undefined Behavior' where the C++ standard imposes no requirements whatsoever on the entire program though. In any case, it seems clear that this answer is wrong. – bames53 Mar 07 '13 at 19:06
  • @bames53 you should take it up with MatthieuM., then – Stephen Lin Mar 07 '13 at 19:07
  • @bames53: I've brought the matter to the isocpp forums, see https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!topic/std-discussion/9zp4TbUiugw I think it's undefined in the sense that it's not defined (!) (because admittedly somewhat silly), but safe in practice because of the requirements brought by 9.2/18 – Matthieu M. Mar 07 '13 at 19:37
  • @MatthieuM. I've been reading and I think I found something that does define this behavior as expected. Look at 3.8/7. This depends on particular interpretations of 3.8/1 for when the lifetime of union members starts and end (i.e., on the definitions of 'obtain' and 'reuse') – bames53 Mar 07 '13 at 19:47
  • @bames53: actually, §3.10/10 might be the key. It speaks of accessing the stored value of an object through a glvalue, and explicitly white-list valid usecases, such as having the same type as the dynamic type of the object. – Matthieu M. Mar 08 '13 at 07:44
  • I'm not sure 3.10/10 even applies. If a non-active member's lifetime has ended (due to the storage being reused for another object, 3.8/1) then using that glvalue may not be defined to access the stored value of the other object. – bames53 Mar 08 '13 at 18:07
0

Values will be same. If you assign 5 to size then length will also be 5.
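If the goal is only to have two names for one value, the same effect can be had without a union at all, along the lines of the member-function suggestion in the comments on the question. A minimal sketch, assuming accessor functions are acceptable in place of data members (the member name `count_` and the helper function are hypothetical):

```cpp
struct something
{
  // Both accessors refer to the same int, so they are true aliases.
  int&       size()         { return count_; }
  const int& size() const   { return count_; }
  int&       length()       { return count_; }
  const int& length() const { return count_; }

private:
  int count_ = 0;  // hypothetical name for the single underlying value
};

// Sets the value through one name and reads it back through the other.
int length_after_setting_size(int value)
{
  something s;
  s.size() = value;
  return s.length();
}
```

Here assigning 5 through `size()` and then calling `length()` yields 5 because there is only one object involved, so no question about active union members arises.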

CasperGhost
  • "Values will be same" also on nonstandard compilers that allow type punning. This answer does not address the question. – underscore_d Dec 31 '15 at 15:22