5

Pointer arithmetic between consecutive members of the same type in a struct used to be common practice, even though pointer arithmetic is only valid inside an array. In C++ it would be explicitly Undefined Behaviour, because an array can only be created by a declaration or a new expression. But the C language defines an array as a contiguously allocated nonempty set of objects with a particular member object type, called the element type (n1570 draft for C11, 6.2.5 Types §20). So provided we can make sure that the members are consecutive (meaning no padding between them), it could be legal to treat them as an array.

Here is a simplified example, that compiles without a warning and gives expected results at run time:

#include <stdio.h>
#include <stddef.h>
#include <assert.h>

struct quad {
    int x;
    int y;
    int z;
    int t;
};

int main() {
    // ensure members are consecutive (note 1)
    static_assert(offsetof(struct quad, t) == 3 * sizeof(int),
        "unexpected padding in quad struct");
    struct quad q;
    int *ix = &q.x;
    for(int i=0; i<4; i++) {
        ix[i] = i;
    }
    printf("Quad: %d %d %d %d\n", q.x, q.y, q.z, q.t);
    return 0;
}

It does not really make sense here, but I have seen real-world examples where iterating over the members of a struct allows simpler code with less risk of typos.

Question:

In the above example, is the static_assert enough to make it legal to alias the struct with an array?


(note 1) As a struct describes a sequentially allocated nonempty set of member objects, later members must have increasing addresses. The compiler could simply include padding between them. So the offset of the last member (here t) is 3 times sizeof(int) plus the total padding before it. If the offset is exactly 3 * sizeof(int), then there is no padding between the members.
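The reasoning in note 1 can be spelled out member by member. The following is only a sketch: the helper name quad_is_packed is hypothetical, and the extra sizeof comparison (which also rules out trailing padding) is an addition that the question itself does not rely on.

```c
#include <assert.h>
#include <stddef.h>

struct quad {
    int x;
    int y;
    int z;
    int t;
};

// Hypothetical helper: returns 1 when every member sits exactly where a
// padding-free layout would put it, and the struct has no trailing padding.
static int quad_is_packed(void) {
    return offsetof(struct quad, x) == 0 * sizeof(int)
        && offsetof(struct quad, y) == 1 * sizeof(int)
        && offsetof(struct quad, z) == 2 * sizeof(int)
        && offsetof(struct quad, t) == 3 * sizeof(int)
        && sizeof(struct quad) == 4 * sizeof(int);
}

// The same condition on the last member, checked at compile time as in the question.
static_assert(offsetof(struct quad, t) == 3 * sizeof(int),
              "unexpected padding in quad struct");
```

Because member offsets must increase and each int needs sizeof(int) bytes, checking only the offset of t is enough to exclude padding between the members; the per-member comparisons are there for readability.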


The question proposed as a duplicate contains both an accepted answer that suggests it would be UB, and a +1 answer that suggests it could be legal because I could ensure that no padding exists.

Serge Ballesta
  • No, it's undefined behavior. `int x;` is an array of size 1. Use a union with an array and your struct; that should work... maybe. – Stargateur Jan 08 '18 at 09:35
  • A few days ago I learned about this and **it's UB** (if I remember correctly). – user2736738 Jan 08 '18 at 09:36
  • @P.P: The duplicate indeed addresses a similar problem, but here I could prove (at compile time) that the struct contains no padding. And an upvoted answer (at only +1) in the duplicate says that the padding was the problem. So I accept the question to be closed, but I would like to know whether or not it is UB *when the struct contains no padding* – Serge Ballesta Jan 08 '18 at 09:59
  • In the duplicate I commented [this](https://stackoverflow.com/questions/47224138/is-it-ok-to-access-past-the-size-of-a-structure-via-member-address-with-enough#comment81399627_47224596). That's the answer for "I could prove (at compile time) that the struct contains no padding" too. – P.P Jan 08 '18 at 10:02
  • @P.P.: The problem certainly lies there. What I mean is that if an implementation adds no padding, then members should be consecutively allocated, and if it adds padding, then the static_assert should raise an error. Or should I ask a different question to ask whether the static_assert can detect padding in a struct? – Serge Ballesta Jan 08 '18 at 10:08
  • The answer with only one +1 says that in the case of that question, where the struct has **only** one member and there is no padding, this will not be UB... This is not your case and not the same context as your code. The point is that the compiler is allowed to assume that you will never use more than one element of `&q.x` because it's "an array of size one", so your code triggers an out-of-bounds access. Example with a union that should not be UB: http://rextester.com/OLCE62527. – Stargateur Jan 08 '18 at 10:12
  • @Stargateur: would you be saying that my question is not exactly a duplicate? Just joking ;-) ... My question was certainly not clear enough at first. I tried to ask a more precise one [there](https://stackoverflow.com/q/48148477/3545273) – Serge Ballesta Jan 08 '18 at 10:37
  • @SergeBallesta The question is not, but the answer to that question will answer yours. This is the same problem. And an answer to your question will quote the same sentence of the standard. – Stargateur Jan 08 '18 at 10:40

5 Answers

4

No, it isn't legal to alias a struct and array like this, it violates strict aliasing. The work-around is to wrap the struct in a union, which contains both an array and the individual members:

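union something {
  // anonymous struct (C11): it must have no tag, otherwise the
  // declaration adds no member to the union
  struct {
    int x;
    int y;
    int z;
    int t;
  };

  int array [4];
};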

This dodges the strict aliasing violation, but you may still have padding bytes, which you can detect with the static assert.

Another issue remains, and that is that you can't use pointer arithmetic on an int* pointing at the first member of the struct, for various obscure reasons outlined in the specified behavior of the additive operators - they require that the pointer points at an array type.

The best way to dodge all of this is to simply use the array member of the union above. This together with a static assert results in well-defined, rugged and portable code.
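A minimal sketch of that combination, assuming a C11 compiler with anonymous struct support. The names quad_u and quad_fill are made up for illustration. The size check works because any padding inside the struct would make the union larger than four ints.

```c
#include <assert.h>
#include <stddef.h>

union quad_u {              // hypothetical name
    struct {                // anonymous struct (C11): x, y, z, t are reachable as q.x, q.y, ...
        int x;
        int y;
        int z;
        int t;
    };
    int array[4];
};

// Padding inside the struct would push the union beyond four ints.
static_assert(sizeof(union quad_u) == 4 * sizeof(int),
              "unexpected padding in quad_u");

// Hypothetical helper: iterates over the array member only, so the
// pointer arithmetic stays inside a genuine int[4] object.
static void quad_fill(union quad_u *q) {
    for (int i = 0; i < 4; i++) {
        q->array[i] = i;
    }
}
```

After calling quad_fill, reading q.x, q.y, q.z and q.t relies on the union type punning that C permits (6.5.2.3 and its footnote), not on arithmetic from &q.x.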

(In theory, you could also use a pointer to character type to iterate through the struct - unlike int* this would be allowed as per 6.3.2.3/7. But this is a more messy solution if you have no interest in the individual bytes.)

Lundin
2

The problem here is your definition of contiguously allocated: "we can make sure that the members are consecutive (meaning no padding between them)".

Although that is a corollary of being contiguously allocated, it does not define the property.

Your structure members are separate variables with automatic storage duration, in a particular order with or without padding depending on how you are able to control your compiler, that's all. As such you can't use pointer arithmetic to reach one member given the address of another, and the behaviour on doing so is undefined.

Bathsheba
  • I tried to address the *with or without padding* part in my edit: if there was padding, the assert should raise an error. I think that your sentence *that is a corollary ...[but] does not define the property* is closer to what I need, but my English is not good enough to allow me to fully understand what you meant. – Serge Ballesta Jan 08 '18 at 10:02
1

I'm gonna argue UB. First and foremost, the mandatory quote from 6.5.6 Additive operators:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

I emphasized what I consider the crux of the matter. You are right when you say that an array object is "a contiguously allocated nonempty set of objects with a particular member object type, called the element type". But is the converse true? Does a consecutively allocated set of objects constitute an array object?

I'm going to say no. Objects need to be explicitly created.

So for your example, there is no array object. There are generally two ways to create objects in C. Declare them with automatic, static or thread local duration. Or allocate them and give the storage an effective type. You did neither to create an array. That makes the arithmetic officially undefined.
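For contrast, here is a rough sketch of the two ways mentioned above that do give you something to do int arithmetic on. The function names are hypothetical. Treating written-to malloc storage as an array of int is the usual reading of the effective type rules, stated here as an assumption and not as a quotation from the standard.

```c
#include <assert.h>
#include <stdlib.h>

// Declared array: a[0]..a[3] are elements of one array object, so
// stepping a pointer from a to a + 4 is well defined.
static int sum_declared(void) {
    int a[4] = {0, 1, 2, 3};
    int sum = 0;
    for (int *p = a; p != a + 4; ++p) {
        sum += *p;
    }
    return sum;
}

// Allocated storage: it has no declared type, and each write through an
// int lvalue gives the written object the effective type int.
// The caller owns the result and must free() it.
static int *make_int_array(size_t n) {
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = (int) i;
    }
    return a;
}
```

Neither route is what the question's code does: struct quad q declares a struct object, and no array object is ever created on top of its members.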

StoryTeller - Unslander Monica
  • But an array is only defined at 6.2.5.20. And it is perfectly legal to get a pointer to dynamic storage and use it as a pointer to an array to create objects in that array. That is my point: C has no `new` instruction to create a dynamic array – Serge Ballesta Jan 08 '18 at 10:45
  • @SergeBallesta - Had you been playing with allocated objects, the question would be harder to answer, IMO. But you declared an automatic object that is quite clearly not an array. Hence the problem, I'd say. – StoryTeller - Unslander Monica Jan 08 '18 at 10:47
  • @StoryTeller I think he did create an array (an array of size one i.e., see the second point in my answer). The issue here is that, that array is too small. – Ajay Brahmakshatriya Jan 08 '18 at 11:27
  • Also, can I ask how allocated memory is given an effective type? The allocator takes only the size, not any type. Say I allocated `void* ptr = malloc(sizeof(struct quad));`, can I use it as an `int[4]` (assuming of course that the assert holds)? – Ajay Brahmakshatriya Jan 08 '18 at 11:31
  • @AjayBrahmakshatriya - I don't think it is. If you look closely at 6.2.4 and 6.2.5, I think it's hinted quite strongly that an object can be created only in certain ways. None of those ways was used to create an array. – StoryTeller - Unslander Monica Jan 08 '18 at 11:31
  • @AjayBrahmakshatriya - [Effective type is given via a write access](http://port70.net/~nsz/c/c11/n1570.html#6.5p6) to objects without a declared type. – StoryTeller - Unslander Monica Jan 08 '18 at 11:32
  • @StoryTeller In this case an array is not created but the `[]` can be used with a pointer too, and here the pointer acts like it is pointing to 1 element. – Ajay Brahmakshatriya Jan 08 '18 at 11:33
  • @AjayBrahmakshatriya - Yes, but the question is broader. Can that pointer be incremented to access each element of the structure. The answer is no. There is no array super-object which would make the arithmetic valid. – StoryTeller - Unslander Monica Jan 08 '18 at 11:34
  • @StoryTeller that I totally agree with and I have mentioned the same in my answer. Also thanks for the link to the effective type section. – Ajay Brahmakshatriya Jan 08 '18 at 11:37
  • @StoryTeller: I accepted your answer, because your comment on allocated objects helped me a lot in understanding the real problem. – Serge Ballesta Jan 08 '18 at 12:36
1

To start with -

Quoting C11, chapter §6.5.2.1p2

A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). ...

Which means ix[i] evaluates to *(ix + i). A subexpression here is ix + i. ix has type pointer to int.

Now,

Quoting C11, chapter §6.5.6p7

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

We thus know that ix behaves as a pointer to the first element of an array of size one. Even constructing a pointer beyond that length (except one past the end) is Undefined Behavior, let alone dereferencing it.

Which leads me to conclude that this is indeed not allowed.
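To make the "array of length one" rule concrete, here is a small sketch of what is still allowed on a pointer to a single member. The helper set_first is hypothetical and only illustrates the boundary described above.

```c
#include <assert.h>
#include <stddef.h>

struct quad {
    int x;
    int y;
    int z;
    int t;
};

// Hypothetical helper: walks the one-element "array" that 6.5.6p7
// associates with the single object q->x.
static void set_first(struct quad *q, int value) {
    assert(q != NULL);
    int *begin = &q->x;
    int *end = begin + 1;   // one past the end: may be formed and compared
    for (int *p = begin; p != end; ++p) {
        *p = value;         // runs exactly once, for q->x
    }
    // Forming begin + 2, or dereferencing end, would be undefined behaviour.
}
```

The loop never touches y, z or t: the only valid positions are begin and the one-past-the-end pointer, and the latter must not be dereferenced.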

Ajay Brahmakshatriya
1

It would be UB. As established in that other question, the static_assert can test for possible padding in a conformant way. So yes the 4 members of the struct are indeed consecutively allocated.

But the real problem is that consecutive allocation is necessary but not sufficient to constitute an array. Even if I could not find a clear reference for it in the C standard, objects cannot overlap during their lifetime - this is stated more explicitly in the C++ standard. They can be members of an aggregate (struct or array) but aggregates are not allowed to overlap. This is consistent with the response to Defect Report #017 dated 10 Dec 1992 to C89 cited by Antti Haapala in his answer to the proposed duplicate.

Even if C has no new statement, allocated storage has the particular property of having no declared type. That allows objects to be created dynamically in that storage, but the lifetime of an allocated object ends when an object of a different type is created at its address. So even in allocated memory we cannot have both an array and a struct at the same time.

According to Lundin's answer, type punning through a union between an array and a struct should work, because a (non normative) note says

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type

and both types will have the same representation: 4 consecutive integers.

Without unions, a way to iterate through the members of a struct would be at the byte level, because 6.3.2.3 Conversions/Pointers says:

7 ... When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

char *p = (char *) &q;  // byte pointer to the start of the struct q
for (int i = 0; i < 4; i++) {
    int *ix = (int *) (p + i * sizeof(int));  // Ok: points to the expected int member
    *ix = i;
}

But pointer arithmetic on non-character types to iterate over members of a struct is UB, simply because individual members of a struct cannot be at the same time members of an array.
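A variant of the byte-level approach, sketched under the assumption that copying each value with memcpy is acceptable. The table quad_offsets and the helpers quad_set and quad_get are hypothetical names. Because each offset comes from offsetof, this version does not depend on the absence of padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct quad {
    int x;
    int y;
    int z;
    int t;
};

// Byte offset of each member, in declaration order.
static const size_t quad_offsets[4] = {
    offsetof(struct quad, x),
    offsetof(struct quad, y),
    offsetof(struct quad, z),
    offsetof(struct quad, t),
};

// Store value into member number i (0 = x ... 3 = t) through a byte pointer.
static void quad_set(struct quad *q, size_t i, int value) {
    assert(i < 4);
    unsigned char *bytes = (unsigned char *) q;
    memcpy(bytes + quad_offsets[i], &value, sizeof value);
}

// Read member number i back the same way.
static int quad_get(const struct quad *q, size_t i) {
    assert(i < 4);
    const unsigned char *bytes = (const unsigned char *) q;
    int value;
    memcpy(&value, bytes + quad_offsets[i], sizeof value);
    return value;
}
```

The only pointer arithmetic happens on unsigned char pointers within the bytes of the struct object, which is what 6.3.2.3/7 permits.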

Serge Ballesta
  • Many thanks to other answerers that gave me hints to that. – Serge Ballesta Jan 08 '18 at 11:40
  • The C standard explicitly says that accessing a member of a union other than the last one stored reinterprets the storage as the new type. So even if there is no “object” of the new type there in some formal sense, the bytes are interpreted as if there were. That is in a non-normative footnote, but it is backed by long-standing history and supporting material. The normative text says the value of a **.** or **->** expression using the member is the value of the member—no qualification requiring it be the last stored member. – Eric Postpischil Jan 08 '18 at 12:05
  • "And even a union between a struct and an array would not allow to alias a struct and an array because one single member of an union can be active at the same time." This isn't really true, that's C++. The C language allows "type punning" between different union types, as described in 6.5.2.3/3 plus (non-normative) foot note 95. This is why my proposed solution with a union will work. – Lundin Jan 08 '18 at 12:09
  • @Lundin: Post edited, and thanks for the feedback. I had already upvoted your answer, and your comment helped me to better understand the rationale behind it. – Serge Ballesta Jan 08 '18 at 12:27