In these two examples, does accessing members of the struct by offsetting pointers from other members result in Undefined / Unspecified / Implementation Defined Behavior?
struct {
int a;
int b;
} foo1 = {0, 0};
(&foo1.a)[1] = 1;
printf("%d", foo1.b);
struct {
int arr[1];
int b;
} foo2 = {{0}, 0};
foo2.arr[1] = 1;
printf("%d", foo2.b);
Paragraph 14 of C11 § 6.7.2.1 seems to indicate that this should be implementation-defined:
Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
and later goes on to say:
There may be unnamed padding within a structure object, but not at its beginning.
However, code like the following appears to be fairly common:
union {
int arr[2];
struct {
int a;
int b;
};
} foo3 = {{0, 0}};
foo3.arr[1] = 1;
printf("%d", foo3.b);
(&foo3.a)[1] = 2; // appears to be illegal despite foo3.arr == &foo3.a
printf("%d", foo3.b);
The standard appears to guarantee that foo3.arr
is the same as &foo3.a
, and it doesn't make sense that referring to it one way is legal and the other not, but equally it doesn't make sense that adding the outer union with the array should suddenly make (&foo3.a)[1]
legal.
My reasoning for thinking the first examples must also therefore be legal:
foo3.arr
is guaranteed to be the same as&foo.a
foo3.arr + 1
and&foo3.b
point to the same memory location&foo3.a + 1
and&foo3.b
must therefore point to the same memory location (from 1 and 2)- struct layouts are required to be consistent, so
&foo1.a
and&foo1.b
should be laid out exactly the same as&foo3.a
and&foo3.b
&foo1.a + 1
and&foo1.b
must therefore point to the same memory location (from 3 and 4)
I've come across some outside sources that suggest that both the foo3.arr[1]
and (&foo3.a)[1]
examples are illegal, however I haven't been able to find a concrete statement in the standard that would make it so.
Even if they were both illegal though, it's also possible to construct the same scenario with flexible array pointers which, as far as I can tell, does have standard-defined behavior.
union {
struct {
int x;
int arr[];
};
struct {
int y;
int a;
int b;
};
} foo4;
The original application is considering whether or not a buffer overflow from one struct field into another is strictly speaking defined by the standard:
struct {
char buffer[8];
char overflow[8];
} buf;
strcpy(buf.buffer, "Hello world!");
println(buf.overflow);
I would expect this to output "rld!"
on nearly any real-world compiler, but is this behavior guaranteed by the standard, or is it an undefined or implementation-defined behavior?