
It is a common strategy in C to cast a pointer to one struct type to a pointer to another, relying on the fact that the layout of a C struct has certain guarantees. Libraries such as GLib rely on this to implement object-oriented-style inheritance. Basically:

struct Base
{
  int x;
  int y;
};

struct Derived
{
  struct Base b;
  int z;
};

This enables a struct Base * pointer to be assigned the address of a struct Derived object (via a cast).

But I'm also aware of the "strict aliasing" rule, under which the compiler is allowed to assume that pointers to different (incompatible) types do not refer to the same object. (This enables the compiler to perform certain optimizations.)

So, how are these two things reconciled? Many C libraries, including GLib, CPython, etc., use the above strategy to cast between types. Are they all simply compiling with flags like -fno-strict-aliasing?

Channel72

1 Answer

There's no violation of strict aliasing in this case. struct Derived contains a struct Base. This sort of behaviour is explicitly allowed by the language standard. From C11 6.7.2.1 Structure and union specifiers, paragraph 15:

A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.

Carl Norum
  • What if struct `Derived` didn't actually contain `struct Base`, but merely contained the same initial data members as `struct Base`? In other words, what if `struct Derived` was defined as `struct Derived { int x; int y; int z; };`? – Channel72 Sep 25 '13 at 17:06
  • Then you might have problems; the compiler might choose to use different padding/packing characteristics, for example. – Carl Norum Sep 25 '13 at 17:07
  • @Channel72: C's type system is (mostly) nominal, not structural, so having the same members is not enough; however, there's an exception: the common initial sequence rule for structures contained in a union. Note that accessing a structure with the 'wrong' type will still technically be undefined behaviour (due to violation of the effective typing rules) if the structures are not actually contained within such a union, even though the compiler cannot assume strict aliasing as soon as such a union is in scope. – Christoph Sep 25 '13 at 18:44
  • @CarlNorum: if you do not manually fiddle with the padding via pragmas, it won't be a problem, as there could be a union containing both structures in a different translation unit; thus the compiler always has to lay out common initial sequences identically. – Christoph Sep 25 '13 at 18:50
  • If there's a union containing both structures, there's no reason the fields in those structures would need to be aligned. There has to be a one-to-one field correspondence (and the tag has to be the same, if there is one) for two structure types to be compatible types. (6.2.7 paragraph 1). – Carl Norum Sep 25 '13 at 18:55
  • @CarlNorum: being contained in the same union of course won't make the types compatible, but they *will* be aligned and may be accessed via both types if the actual object is part of the union - see 6.5.2.3 §6 and §9 (example 3) – Christoph Sep 25 '13 at 19:01
  • Note that 6.5p7 (the infamous "strict aliasing" ruleset) also contains explicit language intended to permit this kind of code: "An object shall have its stored value accessed only by an lvalue expression that has one of the following types: ... an aggregate or union type that includes one of the aforementioned types among its members." (In contrast, 6.5p7 does _not_ include any exception for common initial subsequences, with or without a union, and some compilers do break code that assumes e.g. `sockaddr.sa_family` can alias `sockaddr_in.sin_family`.) – zwol Feb 13 '17 at 18:20
  • What will happen if I call some type conversion macros of C library from a C++ project? – Zz Tux Jan 15 '21 at 15:15