Section 6.2.5, paragraph 20 of the C11 standard defines an array type as:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.

while a structure type is defined as:

A structure type describes a sequentially allocated nonempty set of member objects (and, in certain circumstances, an incomplete array), each of which has an optionally specified name and possibly distinct type.

Section 6.7.2.1 says that padding may be inserted between fields:

Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

But does this all mean that the following objects could have different memory layouts?

struct A {
    char x0;
    short x1;
};

struct B {
    struct A x0;
    struct A x1;
    struct A x2;
};

assert(sizeof(struct B) == sizeof(struct A[3]));

I created this test script to check the memory layout for GCC:

import itertools
import subprocess

src = """
#include <assert.h>

struct A {
{fields}
};

struct B {
    struct A x0;
    struct A x1;
    struct A x2;
};

int main(int argc, char** argv) {
    assert(sizeof(struct B) == sizeof(struct A[3]));
    return 0;
}
"""

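def main():
    all_types = ["char", "short", "int", "long long"]

    # Try every combination of three field types for struct A.
    for types in itertools.product(all_types, repeat=3):
        fields = "".join(
            "    {} x{};\n".format(t, i) for i, t in enumerate(types))
        with open("main.c", "w") as f:
            f.write(src.replace("{fields}", fields))
        # Stop if compilation fails, so a stale a.out is never run.
        subprocess.check_call(["gcc", "main.c"])
        # A failed assert aborts a.out, which gives a non-zero return code.
        if subprocess.call(["./a.out"]) != 0:
            print("layout differs for:", types)

if __name__ == "__main__":
    main()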

But GCC always produces the same memory layout for the array and the structure.

  • Are there any real-world examples where the layout is different?
  • Is it safe to cast such a structure instance to an array?
  • Would it be safer with a union?
ivaigult

1 Answer

The difference is that in an array, the elements must be contiguous, with no padding between them, while in a struct the members are sequential, but padding may be present between them in an implementation-defined way.
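
Whether a given implementation actually pads struct B differently from struct A[3] can be checked directly. The following is a minimal sketch, assuming a C11 compiler; the struct definitions are copied from the question, and `layout_matches_array` is a hypothetical helper name introduced here only for illustration.

```c
#include <stddef.h>  /* offsetof */

struct A {
    char x0;
    short x1;
};

struct B {
    struct A x0;
    struct A x1;
    struct A x2;
};

/* Returns 1 if struct B has exactly the layout of struct A[3] on this
 * implementation, 0 if padding was inserted somewhere.
 * (Hypothetical helper, not part of any library.) */
int layout_matches_array(void)
{
    return offsetof(struct B, x0) == 0
        && offsetof(struct B, x1) == sizeof(struct A)
        && offsetof(struct B, x2) == 2 * sizeof(struct A)
        && sizeof(struct B) == sizeof(struct A[3]);
}

/* The same size check as a C11 compile-time assertion: the build fails
 * instead of silently producing a mismatching layout. */
_Static_assert(sizeof(struct B) == sizeof(struct A[3]),
               "struct B is padded differently from struct A[3]");
```

Passing this check only says that the bytes line up on this implementation. It does not by itself make pointer arithmetic across the members defined behaviour.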

Now for your questions:

Are there any real world examples when the layout is different?

As far as I know, not with common compilers. In addition, most compilers offer options that let the programmer request that no padding be added to a struct.

Is it safe to cast such a structure instance to an array?

No, because a struct does not declare an equivalent array, and a single variable can only be treated as an array of size 1. So if a is a single variable, *(&a + 1) is formally undefined behaviour.
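
If array-style access is needed, one well-defined alternative is to copy the members into a real array instead of casting. This is only a sketch under that assumption; `b_to_array` and `sum_x1` are hypothetical helper names, not something from the question.

```c
struct A {
    char x0;
    short x1;
};

struct B {
    struct A x0;
    struct A x1;
    struct A x2;
};

/* Copy the three members into a genuine array. Member-wise assignment
 * is well defined whether or not struct B contains padding. */
void b_to_array(const struct B *b, struct A out[3])
{
    out[0] = b->x0;
    out[1] = b->x1;
    out[2] = b->x2;
}

/* Example use: iterate over the copy rather than doing pointer
 * arithmetic on &b->x0, which would step outside a single object. */
int sum_x1(const struct B *b)
{
    struct A tmp[3];
    int total = 0;

    b_to_array(b, tmp);
    for (int i = 0; i < 3; ++i)
        total += tmp[i].x1;
    return total;
}
```

The copy costs a few assignments, but it does not depend on any layout assumption.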

Would it be safer with a union?

Yes, according to that other SO post, it can be done through a union. This is legal C:

union B {
    struct {
        struct A x0;
        struct A x1;
        struct A x2;
    };
    struct A x[3];
};

Even if the standard does not guarantee it, common compilers never add padding between elements of the same type, whether the type is simple or derived (struct). This is the same reason as for the first question.
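
Since the no-padding assumption is not guaranteed, it can be guarded at compile time. The sketch below assumes a C11 compiler (anonymous structs and `_Static_assert`); `union_views_agree` is a hypothetical function added only to show both views of the union in use.

```c
#include <stddef.h>  /* offsetof */

struct A {
    char x0;
    short x1;
};

union B {
    struct {            /* anonymous struct (C11) */
        struct A x0;
        struct A x1;
        struct A x2;
    };
    struct A x[3];
};

/* Refuse to compile if the anonymous struct is padded differently
 * from the array, so the two views always overlap exactly. */
_Static_assert(offsetof(union B, x1) == sizeof(struct A),
               "padding before x1");
_Static_assert(offsetof(union B, x2) == 2 * sizeof(struct A),
               "padding before x2");
_Static_assert(sizeof(union B) == sizeof(struct A[3]),
               "trailing padding in union B");

/* Write through the named members, read back through the array.
 * Returns 1 when both views see the same values. */
int union_views_agree(void)
{
    union B b;

    b.x0.x1 = 10;
    b.x1.x1 = 20;
    b.x2.x1 = 30;
    return b.x[0].x1 == 10 && b.x[1].x1 == 20 && b.x[2].x1 == 30;
}
```

With the assertions in place, a compiler that did pad the anonymous struct would reject the code rather than silently misread the members.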

Serge Ballesta
  • But the union approach still assumes that the anonymous struct is not padded, right? What if the field types are compound, e.g. other structures? – ivaigult Feb 27 '19 at 16:27
  • @ivaigult: yes, I assume here that no padding is involved, which is not guaranteed by the standard, but most if not all compilers respect that. In fact, they just have no reason to add padding between elements of the same type. – Serge Ballesta Feb 27 '19 at 17:02