Aliasing struct and array the conformant way

Question

In the old days of pre-ISO C, the following code would have surprised nobody:

#include <math.h>

struct Point {
    double x;
    double y;
    double z;
};

double dist(struct Point *p1, struct Point *p2) {
    double d2 = 0;
    double *coord1 = &p1->x;
    double *coord2 = &p2->x;
    int i;
    for (i=0; i<3; i++) {
        double d = coord2[i] - coord1[i];    // THE problem
        d2 += d * d;
    }
    return sqrt(d2);
}

At that time, we all knew that the alignment of double allowed the compiler to add no padding in struct Point, and we just assumed that pointer arithmetic would do the job.

Unfortunately, this problematic line uses pointer arithmetic (p[i] being by definition *(p + i)) outside of any array, which is explicitly not allowed by the standard. Draft n1570 for C11 says in 6.5.6 Additive operators §8:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression...

As nothing is said about the case where the two elements do not belong to the same array, the standard does not define the behaviour, which makes it Undefined Behaviour (even if all common compilers are happy with it...)

Question:

This idiom avoided code replication, that is, repeating the same code with x changed to y and then to z, which is quite error prone. What would be a conformant way to iterate over the members of a struct as if they were elements of the same array?

Disclaimer: It obviously only applies to members of the same type, and padding can be detected with a simple static_assert as shown in that other question of mine, so padding, alignment and mixed types are not my problem here.
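The other question is not reproduced here, so the following is only a sketch of what such a compile-time padding check might look like, assuming a C11 compiler. The three assertions and their messages are illustrative choices, not the original code. Such a check only rules out padding; it does not by itself make pointer arithmetic across members conformant.

```c
#include <assert.h>   /* static_assert macro (C11) */
#include <stddef.h>   /* offsetof */

struct Point {
    double x;
    double y;
    double z;
};

/* Compilation fails if the compiler inserted padding between the members... */
static_assert(offsetof(struct Point, y) == 1 * sizeof(double),
              "padding between x and y");
static_assert(offsetof(struct Point, z) == 2 * sizeof(double),
              "padding between y and z");
/* ...or after the last one. */
static_assert(sizeof(struct Point) == 3 * sizeof(double),
              "trailing padding in struct Point");
```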

Note well that even in pre-ISO C days, "alignment of double allowed the compiler to add no padding" is not at all the same thing as "alignment of double required the compiler to avoid adding any padding". Even at that time, code that relied on the latter was dependent on compiler implementation. I'm sure it worked reliably, and it probably *still* works reliably, but the reason it was not formally undefined behavior was that C's behavior had not been fully formalized in the first place. — John Bollinger, Jan 22 '18 at 15:05
@JohnBollinger: I am aware of that, that's the reason why I said *it would have surprised nobody* and not *it used to be legal code*... — Serge Ballesta, Jan 22 '18 at 15:22
A pre-ISO C guy here. No I would not have been surprised. I would have asked the perpetrator to use an appropriate *indexed variable* in lieu of the unmaintainable ad-hoc collection of x, y and z. Having said that, I would not hesitate to add things like `#define x coord[0]` to the fixed codebase back then (but not these days). Perhaps `enum { x = 0, y = 1, z = 2 }` and `point.coord[x]` etc — n. m. could be an AI, Jan 22 '18 at 15:53

score 6 · Accepted Answer · answered Jan 22 '18 at 15:29

C does not define any way to specify that the compiler must not add padding between the named members of struct Point, but many compilers have an extension that would provide for that. If you use such an extension, or if you're just willing to assume that there will be no padding, then you could use a union with an anonymous inner struct, like so:

#include <math.h>
#include <stdio.h>

union Point {
    struct {
        double x;
        double y;
        double z;
    };
    double coords[3];
};

You can then access the coordinates by their individual names or via the coords array:

double dist(union Point *p1, union Point *p2) {
    double *coord1 = p1->coords;
    double *coord2 = p2->coords;
    double d2 = 0;

    for (int i = 0; i < 3; i++) {
        double d = coord2[i]  - coord1[i];
        d2 += d * d;
    }
    return sqrt(d2);
}

int main(void) {
    // Note: I don't think the inner braces are necessary, but they silence
    //       warnings from gcc 4.8.5:
    union Point p1 = { { .x = .25,  .y = 1,  .z = 3 } };
    union Point p2;

    p2.x = 2.25;
    p2.y = -1;
    p2.z = 0;

    printf("The distance is %lf\n", dist(&p1, &p2));

    return 0;
}

Yes, anonymous struct members are great! — Serge Ballesta, Jan 22 '18 at 15:58

score 1 · Answer 2 · answered Jan 22 '18 at 16:10

This is mainly a complement to JohnBollinger's answer. Anonymous struct members do allow a clean and neat syntax, and C defines a union as a type consisting of a sequence of members whose storage overlap (6.7.2.1 Structure and union specifiers §6). Accessing a member of a union is then specified in 6.5.2.3 Structure and union members:

3 A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,⁹⁵⁾ and is an lvalue if the first expression is an lvalue.

and the (non-normative but informative) footnote 95 clarifies:

95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

That means that, for the current version of the standard, aliasing a struct by an array with the help of an anonymous struct member in a union is explicitly defined behaviour, provided the struct contains no padding.
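As a minimal sketch of what this permits, assuming a C11 compiler and no padding in the anonymous struct, the following writes through the named members and reads the same storage back through the array member. The function name sum_coords is hypothetical, and the size check is one possible way to reject layouts with padding.

```c
#include <assert.h>   /* static_assert macro (C11) */

union Point {
    struct {
        double x;
        double y;
        double z;
    };                  /* anonymous struct member (C11) */
    double coords[3];
};

/* If the union is exactly three doubles wide, the struct cannot contain padding. */
static_assert(sizeof(union Point) == sizeof(double[3]),
              "unexpected padding in union Point");

/* Stores via x, y, z, then reads the same bytes via coords (union type punning). */
double sum_coords(double x, double y, double z) {
    union Point p;
    double sum = 0;

    p.x = x;
    p.y = y;
    p.z = z;
    for (int i = 0; i < 3; i++) {
        sum += p.coords[i];
    }
    return sum;
}
```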

In addition to the type-punning being well-defined, you can protect against unexpected alignment hiccups with `static_assert(sizeof(double[3]) == sizeof(union Point), "Weird computers not supported");` — Lundin, Jun 25 '19 at 08:05
