Common initial sequence and alignment

Question

While thinking of a counter-example for this question, I came up with:

struct A
{
    alignas(2) char byte;
};

But if that's legal and standard-layout, is it layout-compatible to this struct B?

struct B
{
    char byte;
};
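
One way to probe this on a particular compiler is with a few compile-time checks. This is only a sketch: the names A1 and B1 are placeholders for the two structs above, and the size and alignment values are what mainstream compilers such as g++ and clang++ report, not something I can derive from the Standard.

```cpp
#include <type_traits>

// Placeholder names for the two single-member structs in the question.
struct A1 { alignas(2) char byte; };
struct B1 { char byte; };

// Both classes satisfy the standard-layout trait.
static_assert(std::is_standard_layout<A1>::value, "A1 is standard-layout");
static_assert(std::is_standard_layout<B1>::value, "B1 is standard-layout");

// The alignment-specifier on the member raises the alignment of the class.
static_assert(alignof(A1) == 2, "A1 is 2-byte aligned");
static_assert(alignof(B1) == 1, "B1 is 1-byte aligned");

// Observed on mainstream implementations: tail padding makes A1 larger.
static_assert(sizeof(A1) == 2 && sizeof(B1) == 1, "sizes differ");
```

So the two structs can differ in size and alignment even though the trait reports both as standard-layout.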

Furthermore, if we have

struct A
{
    alignas(2) char x;
    alignas(4) char y;
};
// possible alignment, - is padding
// 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
//  x  -  -  -  y  -  -  -  x  -  -  -  y  -  -  -

struct B
{
    char x;
    char y;
}; // no padding required

union U
{
    A a;
    B b;
} u;

Is there a common initial sequence for A and B? If so, does it include A::y & B::y? I.e., may we write the following w/o invoking UB?

u.a.y = 42;
std::cout << u.b.y;

(answers for C++1y / "fixed C++11" also welcome)

See [basic.align] for alignment and [dcl.align] for the alignment-specifier.
[basic.types]/11 says for fundamental types "If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types." (an underlying question is whether A::byte and B::byte have layout-compatible types)
[class.mem]/16 "Two standard-layout struct types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types."
[class.mem]/18 "Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members."
[class.mem]/18 "If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them."

Of course, on a language-lawyer level, another question is what it means that the inspection of the common initial sequence is "permitted". I guess some other paragraph might make the above u.b.y undefined behaviour (reading from an uninitialized object).

I don't think this is a good example. The structure with an `int` and a `char` has `int` alignment. That `alignas(2)` attribute for `char byte` as a first element is a no-op because that first element already has `alignas(int)` alignment. A possibly better example: `struct A {int x; alignas(double) char byte;};` — David Hammen, Feb 01 '14 at 17:12
@DavidHammen Ouch, true, I've meant to add padding *after* the byte. Fixing.. — dyp, Feb 01 '14 at 19:28
There is no padding in front of x (regarding ASCII-art where x is at 02) — , Feb 01 '14 at 20:09
@DieterLücking Hmm no that would be illegal. There can be no padding at the beginning of a standard-layout struct. But the Standard doesn't allow the "odd" alignment I had in mind either, so I've removed that line. The remaining one is the alignment g++ and clang++ seem to be using. — dyp, Feb 01 '14 at 20:39
Hmm. I thought I had an answer, but then I thought some more. The more I look at the standard, the more there appears to be a misalignment problem. Is `alignas` a part of the *type-id* or not? In some places it appears that this is the case, in others, it appears that this definitely is not the case. — David Hammen, Feb 01 '14 at 21:03
@DavidHammen Yeah.. I started wondering about the whole issue when I tried [`static_assert(std::is_same::value, "!");`](http://coliru.stacked-crooked.com/a/88465649d58ba91b) which then led to this question. — dyp, Feb 01 '14 at 21:14
Side note: If a class using `alignas` on its members is not intended to be standard-layout, then `sizeof(A)` could be four, with the second member at offset 0, and the first at offset 2. Somewhat more relevant note: the current wording of "standard-layout" already makes the literal requirements unimplementable for other reasons. [Details here.](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1672) I looked for open issues regarding alignment too, but found nothing of interest. — , Feb 22 '14 at 21:42
Nasty. But yeah, looks like the Standard doesn't address this well enough. The "obvious" intent is to define layout-compatible structs and common initial sequences as involving the same base classes with same alignments and the same member types with same alignments. — aschepler, May 02 '14 at 23:50
@Tshepang According to the tag wiki, [union] is for SQL UNION, whereas [unions] is for C, C++ etc. `union`s. — dyp, May 02 '14 at 23:55
That feels forced @dyp. We need better tags, maybe [tag:c-union]. — tshepang, May 03 '14 at 00:16
@Tshepang I agree. Maybe [union] should be replaced by [SQL-UNION] and [unions] by [c-union]. Maybe there's been some discussion on meta? Otherwise, it might be worth a new question there. Edit: just upvoted your suggestion :) — dyp, May 03 '14 at 00:23

score 2 · Answer 1 · edited Jan 27 '22 at 07:18

I can't speak for the C++11 standard, but I am a firmware/microchip programmer and have long had to use the equivalent older features (pragma pack, alignment attributes).

Using alignas cannot be considered "standard layout", thus all the implications are useless. Standard layout means one fixed alignment distribution (per architecture - usually everything is align(min(sizeof,4)), though some architectures use align(8)). The standard probably wants to say what is obvious: without using special features (align, pack), structures are compatible on the same architecture if they appear to be the same (same types in the same order). Otherwise, they may or may not be compatible, depending on the architecture (compatible on one architecture but different on another).

Consider this struct:

struct foo{ char b; short h; double d; int i; };

On one architecture (e.g. x86 32bit) it is what it seems to be, but on Itanium or ARM it actually looks like this:

struct foo{ char b, _hidden_b; short h; int _maybe_hidden_h; double d; int i; };

Note _maybe_hidden_h: it is absent under the older AEABI (alignment capped at 4) and present with 64bit/8B alignment.

x86 Standard Layout (pack(1)):

alignas(1) char b; alignas(1) short h; alignas(1) double d; alignas(1) int i;

32bit Alignment Standard Layout (pack(4) - ARM architecture, older version - EABI)

alignas(1) char b; alignas(2) short h; alignas(4) double d; alignas(4) int i;

64bit Alignment Standard Layout (pack(8) - Itanium and newer ARM/AEABI)

alignas(1) char b; alignas(2) short h; alignas(8) double d; alignas(4) int i;
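
Rather than trusting the listings above, the layout can be printed on whatever target is at hand. The following is a rough sketch; print_foo_layout is a helper name made up for this example, and the numbers it prints depend on the ABI.

```cpp
#include <cstddef>
#include <cstdio>

struct foo { char b; short h; double d; int i; };

// Prints where this implementation actually places each member.
inline void print_foo_layout()
{
    std::printf("b at %zu\n", offsetof(foo, b));
    std::printf("h at %zu\n", offsetof(foo, h));
    std::printf("d at %zu\n", offsetof(foo, d));
    std::printf("i at %zu\n", offsetof(foo, i));
    std::printf("sizeof(foo) = %zu, alignof(foo) = %zu\n",
                sizeof(foo), alignof(foo));
}

// These hold however much padding the ABI inserts between members.
static_assert(offsetof(foo, b) == 0, "no padding before the first member");
static_assert(offsetof(foo, h) >= offsetof(foo, b) + sizeof(char), "h follows b");
static_assert(offsetof(foo, d) >= offsetof(foo, h) + sizeof(short), "d follows h");
static_assert(offsetof(foo, i) >= offsetof(foo, d) + sizeof(double), "i follows d");
```

Calling print_foo_layout() from main on each target shows how much hidden padding that ABI adds.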

To your example:
offsetof(A,y) = 4 while offsetof(B,y) = 1, and the union does not change that (thus &u.a.y != &u.b.y)
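
The offsets can be checked at compile time. A minimal sketch, assuming a typical implementation (I have g++ and clang++ in mind); A2, B2, U2 and second_members_coincide are names invented for this example and stand in for the A, B and U of the question.

```cpp
#include <cstddef>

// Stand-ins for the structs and the union from the question.
struct A2 { alignas(2) char x; alignas(4) char y; };
struct B2 { char x; char y; };
union U2 { A2 a; B2 b; };

// The first members coincide, the second members do not.
static_assert(offsetof(A2, x) == 0 && offsetof(B2, x) == 0, "x at offset 0 in both");
static_assert(offsetof(A2, y) == 4, "alignas(4) pushes A2::y to offset 4");
static_assert(offsetof(B2, y) == 1, "B2::y directly follows B2::x");

// Compares addresses only; no member is read, so the active member does not matter.
inline bool second_members_coincide(const U2& u)
{
    return static_cast<const void*>(&u.a.y) == static_cast<const void*>(&u.b.y);
}
```

With these offsets second_members_coincide returns false for any U2 object, so a write to u.a.y cannot be seen through u.b.y.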

score 2 · Answer 2 · answered Jul 27 '14 at 12:10

It looks like a hole in the standard. The responsible thing would be to file a defect report.

Several things, though:

Your first example doesn't really demonstrate a problem. Adding a short after the char would also have the effect of aligning the char to a 2-byte boundary, without changing the common initial sequence.
alignas is not C++-only; it was added simultaneously to C11. Since the standard-layout property is a cross-language compatibility facility, it is probably preferable to require corresponding alignment specifiers to match than to disqualify a class with a nonstatic member alignment-specifier.
There would be no problem if the member alignment specifiers appertained to the types of the members. Other problems may result from the lack of adjustment to types, for example a function parameter ret fn( alignas(4) char ) may need to be mangled for the ABI to process it correctly, but the language might not provide for such adjustment.
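
To illustrate the first point, here is a small sketch; the struct names are made up for the example, and the values assume a platform where short is 2-byte aligned.

```cpp
#include <cstddef>

// Hypothetical structs: the first uses alignas, the second gets the same
// class alignment from an ordinary short member.
struct WithAlignas { alignas(2) char byte; };
struct WithShort   { char byte; short s; };

// Both classes end up with the same alignment on such a platform.
static_assert(alignof(WithAlignas) == 2, "alignas(2) member gives 2-byte alignment");
static_assert(alignof(WithShort) == alignof(short), "short member sets the class alignment");

// The char sits at offset 0 either way, so a one-member common initial
// sequence is unaffected by how the class alignment was obtained.
static_assert(offsetof(WithAlignas, byte) == 0, "byte at offset 0");
static_assert(offsetof(WithShort, byte) == 0, "byte at offset 0");
```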

Oops, of course. The first example was the more language-lawyery formulation of the problem that manifests in the second example. — dyp, Jul 27 '14 at 12:24

score 0 · Answer 3 · 2014-05-02T19:27:55.083


(an underlying question is whether A::byte and B::byte have layout-compatible types)

Yes. This is the essential part. The alignas-attribute appertains to the entity declared, not the type. Can be easily tested by std::is_same and decltype.
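
The test I have in mind looks roughly like this (A3 and B3 are placeholder names for the single-member structs from the question):

```cpp
#include <type_traits>

// Placeholder names for the two single-member structs.
struct A3 { alignas(2) char byte; };
struct B3 { char byte; };

// decltype of the member ignores the alignment-specifier: both are plain char.
static_assert(std::is_same<decltype(A3::byte), decltype(B3::byte)>::value,
              "both members have the same type");
static_assert(std::is_same<decltype(A3::byte), char>::value,
              "the member type is char");
```

Note that this only shows the member types are the same; it says nothing about where the members end up inside each struct.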

I.e., may we write the following w/o invoking UB?

This is therefore not UB; you have already quoted the relevant paragraphs.

EDIT: Pardon me, this can of course result in UB because the padding between members is not defined (or is implementation-defined) (§9.2/13)! I accidentally misread the example: I thought it accessed x instead of y. With x it actually always works, whereas with y it theoretically doesn't have to (though it practically always will).

edited May 02 '14 at 19:27

answered May 02 '14 at 15:44

How is that implemented, then? `u.a.y = 42;` writes to the second byte of the structure; if `u.b.y` shall contain the same value one would need either to track the active member of the union or also write to the fifth byte of the structure in `u.a.y = 42;`, right? – dyp May 02 '14 at 16:19
Oh, wait a minute - I just realized your example is completely different from what I thought. You use y instead of x. Well, now that is NOT defined, since there can be any padding between members! I'll add that to the post as well :) – May 02 '14 at 19:24
Yes, the padding between members is implementation-defined. But if there's a common initial sequence, you might access all those members of this sequence (not just the first). The question now is: What is the common initial sequence for those two structs? Does it include the second member? I'd say *no*, but where is this specified? – dyp May 02 '14 at 19:51
You gave the necessary quote yourself: If the members have layout-compatible types. And since the types are the same, you may modify them. Alignment is automatically taken from the strictest requirement among all the union members. Still, I find this weird, since it would make assumptions about the compatibility of structs... – May 02 '14 at 20:33
So you mean that `u.a.y` will also have an alignment of 4 since `u.a` is in the same union as `u.b`? Then other assignments like `A x; x = u.a;` would have to be changed. I'm not yet convinced how this is all intended to fit together. – dyp May 02 '14 at 20:46
Yes, it will have the same alignment. Otherwise this rule couldn't apply. *Or* the standard somehow does not specify precisely enough how layout-compatibility is related to the alignment of objects declared with `alignas`. – May 02 '14 at 22:02
I suspect the latter is the case ;) That's why I keep asking.. Well maybe I should ask this on the isocpp mailing list to see whether it's considered a defect or not. – dyp May 02 '14 at 23:12
