Is multiple-level "struct inheritance" guaranteed to work everywhere?

Question

I know that in C, the first member of a struct is guaranteed to have no padding before it. Thus &mystruct == &mystruct.firstmember is always true.

This allows the "struct inheritance" technique, as described in this question:

typedef struct
{
    // base members

} Base;

typedef struct
{
    Base base;

    // derived members

} Derived;

// ... later
Base* object = (Base*) malloc(sizeof(Derived)); // This is legal

However, I'd like to make sure that this actually works safely with unlimited layers of "inheritance". E.g.:

typedef struct
{
    // members

} A;

typedef struct
{
    A base;

    // members

} B;

typedef struct 
{
    B base;

    // members
} C;

Are all of the following uses guaranteed to work?

A* a = (A*) malloc(sizeof(B));
A* a = (A*) malloc(sizeof(C));
B* b = (B*) malloc(sizeof(C));
C* c = malloc(sizeof(C));

// ... use and access members through the pointers

EDIT:

Let me clarify what I'm asking. Is the following use of "multi-level inheritance" guaranteed to work by the C standard?

C* c = malloc(sizeof(C));
// ... initialize fields in c

A* a = (A*) c;
// ... use A fields in a

B* b = (B*) a;
// ... use B fields in b

B* b = (B*) c;
// ... use B fields in b

c = (C*) a;
// ... go back to using C fields in c

It's unclear what your problem is. As long as the memory obtained by `malloc()` is large enough to fit an object of type `T`, (write-) accessing it through an lvalue expression of type `T` is well-defined, regardless of what the exact expression that computed the argument of `malloc()` was. – EOF, Apr 06 '20 at 12:37

John Bollinger · Accepted Answer · 2020-04-11T17:09:52.533

4

That the kind of "multi-level inheritance" you describe must work follows from the same principles -- explained in the other Q&A you referenced -- that make this kind of inheritance work at all. Specifically, the standard explicitly provides that casting the addresses of structures and of their initial members between the applicable types has the desired effect:

A pointer to a structure object, suitably converted, points to its initial member [...] and vice versa.

(paragraph 6.7.2.1/15)

So consider this declaration, relative to the structure definitions provided:

C c;

The quoted provision specifies that &c == (C *) &c.base and (B *) &c == &c.base are both true.

But c.base is a B, so the provision also specifies that (A *) &c.base == &c.base.base and &c.base == (B *) &c.base.base are both true.

Since (B *) &c == &c.base and &c.base == (B *) &c.base.base are both true, it follows that (B *) &c == (B *) &c.base.base is also true.

Casting both sides to either A * or C * then also yields the equalities (A *) &c == &c.base.base and &c == (C *) &c.base.base.

This reasoning can be extended to an arbitrary nesting depth.
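The chain of equalities above can be checked mechanically. The following is only a sketch: the members `a_field`, `b_field`, and `c_field` and the function name `check_nested_addresses` are hypothetical, added so the structures are non-empty and the check has something to read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical members, chosen only so each structure has some content. */
typedef struct { int a_field; } A;
typedef struct { A base; int b_field; } B;
typedef struct { B base; int c_field; } C;

/* Returns 1 if every equality derived in the text holds for a local C. */
int check_nested_addresses(void)
{
    C c = { { { 1 }, 2 }, 3 };
    int ok = 1;

    /* One level: structure <-> its initial member. */
    ok &= ((B *) &c == &c.base);
    ok &= ((A *) &c.base == &c.base.base);

    /* Two levels, obtained by chaining the one-level results. */
    ok &= ((A *) &c == &c.base.base);
    ok &= (&c == (C *) &c.base.base);

    /* The initial member sits at offset zero at each level. */
    ok &= (offsetof(C, base) == 0);
    ok &= (offsetof(B, base) == 0);

    /* Members are readable through the converted pointers. */
    ok &= (((A *) &c)->a_field == 1);
    ok &= (((B *) &c)->b_field == 2);

    return ok;
}
```

Calling `check_nested_addresses()` from `main` and asserting that it returns 1 is enough to exercise it.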

One can quibble a bit about dynamically allocated structures vis a vis the strict aliasing rule, but there's no reason to think that it is supposed to work any differently in that case, and as long as one first accesses the dynamically-allocated space via an lvalue of the most specific type (C in this example), I see no scenario that supports a different interpretation of the standard for the dynamic-allocation case than applies to other cases. In practice, I do not expect initial access via the most specific type actually to be required by any implementation.
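As a rough illustration of the dynamic-allocation case, the sketch below writes the allocated space through the most specific type first and only then converts the pointer up and down the chain. The member names and the function `sum_through_each_level` are assumptions made for the example, not part of the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical members, chosen only so each level has a field to access. */
typedef struct { int a_field; } A;
typedef struct { A base; int b_field; } B;
typedef struct { B base; int c_field; } C;

/* Returns 1 + 2 + 3 on success, or -1 if the allocation fails. */
int sum_through_each_level(void)
{
    C *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;

    /* First access goes through the most specific type, C. */
    c->base.base.a_field = 1;
    c->base.b_field = 2;
    c->c_field = 3;

    /* Convert to the initial member's type, then back again. */
    A *a = (A *) c;
    B *b = (B *) a;
    C *back = (C *) a;

    /* All of the converted pointers refer to the same address. */
    assert((void *) a == (void *) c);
    assert((void *) b == (void *) c);
    assert(back == c);

    int sum = a->a_field + b->b_field + back->c_field;
    free(c);
    return sum;
}
```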

edited Apr 11 '20 at 17:09

answered Apr 11 '20 at 16:54

John Bollinger


Thanks for the detailed answer. You're actually touching a bit of a concerning point there: the strict aliasing rule. After you mentioned it I went reading about it. As I understand it (please correct me if I'm wrong), the rule basically says that you can never dereference a pointer of type X to an object of type Y, if type X is not compatible with type Y. Isn't that exactly what we're violating when we're doing e.g. `A* a = (A*) c;`? – Aviv Cohn Apr 14 '20 at 10:08
In this answer (https://stackoverflow.com/a/7005988/3284878) it says the standard also allows a pointer to point to an object of a different type, if the pointer is to *"... an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union) ..."*. Is that the exception that makes the "inheritance" technique legal? And if so, why do you mention the idea of the first accesses happening through the most specific type? Why, in theory, should it matter? I'm probably missing something. Thank you – Aviv Cohn Apr 14 '20 at 10:13
@AvivCohn, in the first place, no, there is no union involved here. That special exception to the strict aliasing rule does not apply here and *it is not needed* here. You must first appreciate that the strict aliasing rule (that is, paragraph 6.5/7 of the specification) does not preclude modifying components of a composite object via pointers to the components' types. It cannot be interpreted to do so, because that would conflict with the semantics of arrays, which are defined almost exclusively in terms of pointers to the elements' type.... – John Bollinger Apr 14 '20 at 22:33
[cont] The case of nested structures such as yours is not much different. You can use a pointer to a structure member to access that member, regardless of the member's type, and paragraph 6.7.2.1/15, which I quoted in this answer, explains how you can get a pointer to the first member of a structure from a pointer to that structure, and *vice versa*. Such a pointer, to whichever type in the nest of structures, does point to an object of the pointer's referenced type, per explicit provision of the standard. That that object overlaps others of different types is not an issue. – John Bollinger Apr 14 '20 at 22:39
I see. So I guess that leaves the last question. I understand now that this is legal: `B* b = malloc(sizeof(B)); A* a = (A*) b; b = (B*) a;` A pointer to the structure is converted to its first member, and vice versa. *But* - what if we do the following? `A* a = malloc(sizeof(B)); B* b = (B*) a;`. In this case, the first object allocated, while having `sizeof(B)`, is first put in an lvalue `A*`. Only then it is somehow cast to the totally different type, `B*`. Do you think the standard guarantees this working? [........] – Aviv Cohn Apr 15 '20 at 08:50
[cont] And how does the standard decide the "type of an object", anyway? By looking at the lvalue of the first assignment? – Aviv Cohn Apr 15 '20 at 08:51
@AvivCohn, how the "effective type" of an object is determined is described in paragraph 6.5/6, immediately preceding the strict aliasing rule. The SAR is written in terms of this. For objects that have a declared type, that is their effective type. For others -- that is, allocated objects -- their effective type is based on the effective type of the data most recently written into them and / or the type of the lvalue by which those data were written. But you have to understand that changing the effective type of an allocated object is a largely abstract operation. [...] – John Bollinger Apr 15 '20 at 12:58
[cont] C objects do not contain type information. Changing the effective type of an allocated object is all about how that object is *interpreted*, not about how it is *represented* in memory. This is also the domain with which the SAR is concerned. In a nutshell, the SAR disclaims any meaning for interpreting an object in a way that is inconsistent with its (effective) type. Such an inconsistent interpretation is manifestly not what we're talking about here. – John Bollinger Apr 15 '20 at 13:03

Kaz · Answer 2 · 2020-04-11T18:54:49.420

What the ISO C standard requires to work is the following situation:

union U {
  struct X x;
  struct Y y;
  struct Z z;
  /* ... */
};

If the structures share some common initial sequence of members, then that initial sequence can be accessed through any of the members. For instance:

 struct X {
   /* common members, same as in Y and Z: */
   int type;
   unsigned flags;

   /* different members */
 };

If all the structures have type and flags in the same order and of the same types, then this is required to work:

union U u;
u.x.type = 42;  /* store through x.type */
foo(u.y.type);  /* access through y.type */

Other hacks of this type are not "blessed" by ISO C.
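For completeness, here is a self-contained sketch of the union case described above. Only `type` and `flags` come from the text; the trailing members `x_only` and `y_only` and the function `read_type_via_y` are hypothetical, and only two of the structures are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Both structures begin with the same members in the same order. */
struct X { int type; unsigned flags; int x_only; };
struct Y { int type; unsigned flags; double y_only; };

union U {
    struct X x;
    struct Y y;
};

/* The shared members are laid out identically in both structures. */
static_assert(offsetof(struct X, type) == offsetof(struct Y, type),
              "type must sit at the same offset");
static_assert(offsetof(struct X, flags) == offsetof(struct Y, flags),
              "flags must sit at the same offset");

/* Stores through x and reads back through y. The union type is visible
   at the point of access, which the rule requires. */
int read_type_via_y(int value)
{
    union U u;
    u.x.type = value;
    return u.y.type;
}
```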

The situation you have there is a little different. It's a question of whether, given a leading member of a structure, we can convert a pointer to the structure to a pointer to that member's type and then use it. The simplest case is something like this:

struct S {
  int m;
};

Given an object struct S s, we can take the address of m using &s.m, obtaining an int * pointer. Equivalently, we can obtain the same pointer using (int *) &s.

ISO C does require that a structure has the same address as its first member; a pointer to the structure and a pointer to the first member have a different type, but point to the same address, and we can convert between them.

This isn't restricted by nesting levels. Given an object a of this type:

struct A {
  struct B {
    struct C {
       int m;
    } c;
  } b;
};

the address &a.b.c.m is still the same as the address &a. The pointer &a.b.c.m is the same as (int *) &a.
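A small sketch of that last claim, using the same nested declaration. The function name `first_member_via_cast` and the initial value 7 are assumptions made for the example.

```c
#include <assert.h>

struct A {
    struct B {
        struct C {
            int m;
        } c;
    } b;
};

/* Reads a.b.c.m through a pointer obtained by casting &a directly. */
int first_member_via_cast(void)
{
    struct A a = { { { 7 } } };

    int *direct = &a.b.c.m;    /* address taken member by member */
    int *cast = (int *) &a;    /* address of the outermost structure */

    /* Both routes give the same pointer. */
    assert(direct == cast);

    return *cast;
}
```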

Hacks of exactly the type described by the OP definitely are blessed by standard C. The `union` approach also is ok, but it is not what the OP is asking about. — John Bollinger, Apr 11 '20 at 16:59
