Under 6.5.6 Additive operators:
Semantics
8 - [...] If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. [...] If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
If the memory is allocated by malloc then:
7.22.3 Memory management functions
1 - [...] The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation.
This does not, however, countenance the use of such memory without an appropriate cast, so for MyStruct as defined above only the declared members of the object can be used. This is why flexible array members (6.7.2.1:18) were added.
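As a minimal sketch of the flexible array member alternative, the struct and its allocation might look like the following. FlexStruct and flex_create are hypothetical names chosen for illustration; they are not the MyStruct from the question, and the sketch ignores overflow in the size computation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct for illustration; not the MyStruct from the question. */
typedef struct {
    size_t count;   /* number of elements allocated for data[] */
    int data[];     /* flexible array member: must be the last member */
} FlexStruct;

/* Allocate a FlexStruct with room for n ints; returns NULL on failure. */
FlexStruct *flex_create(size_t n)
{
    /* sizeof(FlexStruct) does not include data[], so add its storage explicitly. */
    FlexStruct *p = malloc(sizeof(FlexStruct) + n * sizeof(int));
    if (p != NULL) {
        p->count = n;
        for (size_t i = 0; i < n; i++)
            p->data[i] = 0;   /* indexing up to n - 1 is within the allocated array */
    }
    return p;
}
```

With this declaration, an access such as p->data[5] after flex_create(10) stays inside the array the allocation provides, which is the case flexible array members were designed for.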
Also note that Annex J.2 Undefined behavior calls out array access:
1 - The behavior is undefined in the following circumstances: [...]
— Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object.
— Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated.
— An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]).
So, as you note, this would be undefined behaviour:
MyStruct *foo = malloc(sizeof(MyStruct) + sizeof(int) * 10);
foo->data[5] = 1;
However, you would be allowed to do the following:
MyStruct *foo = malloc(sizeof(MyStruct) + sizeof(int) * 10);
((int *) foo)[(offsetof(MyStruct, data) / sizeof(int)) + 5] = 1;
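To make the cast-based access above easier to try out, here is a small self-contained sketch. The MyStruct layout shown is an assumption (the real definition appears earlier and may differ), and extra_slot is a hypothetical helper name. It also assumes offsetof(MyStruct, data) is a multiple of sizeof(int), which the index arithmetic relies on.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout for illustration only; the original MyStruct may differ. */
typedef struct {
    int len;
    int data[1];
} MyStruct;

/* Hypothetical helper: address of the i-th int counted from the start of
 * data, reached through an int pointer to the whole allocation rather than
 * through the one-element array member. */
int *extra_slot(MyStruct *s, size_t i)
{
    return &((int *) s)[offsetof(MyStruct, data) / sizeof(int) + i];
}

/* Allocate a MyStruct followed by room for extra additional ints. */
MyStruct *my_struct_create(size_t extra)
{
    return malloc(sizeof(MyStruct) + sizeof(int) * extra);
}
```

A caller would write something like *extra_slot(foo, 5) = 1; after checking that my_struct_create(10) did not return NULL.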
C++ is laxer in this regard; 3.9.2 Compound types [basic.compound] has:
3 - [...] If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.
This makes sense in light of C's more aggressive optimisation opportunities for pointers, e.g. with the restrict qualifier.
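For readers unfamiliar with restrict, a minimal sketch of the kind of promise it expresses follows. add_into is a hypothetical function written for this illustration; the qualifier tells the compiler that dst and src do not refer to overlapping storage, so it need not reload src[i] after each store to dst[i].

```c
#include <stddef.h>

/* Hypothetical helper: restrict is the caller's promise that dst and src
 * do not overlap. Calling it with overlapping arrays is undefined. */
void add_into(size_t n, int *restrict dst, const int *restrict src)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```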