14

I have two structs like

struct X {
  int x;
  double *y;
};

struct Y {
  int a;
  double *b;
  char c;
};

Is casting a pointer to struct Y to a pointer to struct X guaranteed to behave consistently in a reasonable manner (i.e. x->x and x->y correspond to y->a and y->b respectively) by the C89 standard? Less important, but would be really cool if you also happen to know, does this hold true for later standards as well (e.g. C11) and other languages that have significant syntactic and semantic overlap with C (e.g. C++XX, Objective-C)?

math4tots
  • I don't know about C++, but it seems true for all versions of C. Many software packages rely on that. – chqrlie Oct 26 '15 at 20:03
  • C++ is a different language, not a later version of C – M.M Oct 26 '15 at 20:09
  • @M.M Fair point. Though, I think it's natural to be curious about this behavior in other languages that have a lot of overlap with C. Will edit to make this clearer – math4tots Oct 26 '15 at 20:20
  • As in C++ a `struct` is actually a `class`, behaviour might be very different. And you would not use such anyway, but use inheritance. If you write C-style code in C++, you are doing something wrong. Much like driving a hummer in Naples or Rome. As @M.M wrote, C and C++ are different languages. Just the same syntax does not imply the same semantics. And asking about all "similar" languages would make the question too broad. – too honest for this site Oct 26 '15 at 20:24
  • @Olaf I agree that asking about all similar languages does make it seem too broad. If you want to edit that part of the question, I won't get in your way. Though I do want to say that I don't think it's fair to discount the relationship that C has with C++ and Objective-C. The C++ standard even goes out of its way to distinguish aggregate types that are very C-struct like as opposed to more featureful C++ classes. – math4tots Oct 26 '15 at 20:46
  • Which types would that be, if not `struct`, which is the same as class (except for the default visibility)? Anyway, read my comment completely! – too honest for this site Oct 26 '15 at 23:09
  • @Olaf The C++ standard has some funny rules for classes based on what some would argue is an ugly set of criteria: http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special From what I understand, many of the semantics of C++ are twisted and ugly to accommodate a lot of legacy C code. So granted, the semantics are not identical, but they have some purposefully designed deep ties. Also on writing C-style code in C++... – math4tots Oct 26 '15 at 23:39
  • @Olaf I agree with you that writing C-style code when C++ is available is rather silly. However, I think it can be very useful to understand the limitations of the C++ language, especially wrt its source compatibility with C. E.g. if you are working in Visual Studio, Microsoft's C-compiler is really dated; it's so old you can't even mix declarations with expressions. But the C++ is pretty good. So what if you need to port a mostly C codebase in Visual Studio that requires a better C compiler? This sort of knowledge about what sort of C is allowed in C++ can be very useful... – math4tots Oct 26 '15 at 23:45
  • @Olaf I agree with you and M.M that C and C++ are different languages. I also agree with you that we should not write C-style code in new C++ code. But for better or for worse, source compatibility with C is a major feature of C++. And this feature can come in really handy at times -- as such I think inquiries about when certain obscure C features are also available in C++ can often be worthwhile... – math4tots Oct 26 '15 at 23:55
  • @Olaf I think that covers all my reaction to all of your original comment... in retrospect I think this might've been better through pm ^_^; You had a lot of good points and I had a lot to say about them too, but I wasn't sure how much was an appropriate amount of elaboration... Feel free to message me if you feel like I missed something. :) – math4tots Oct 27 '15 at 00:03
  • "So what if you need to port a mostly C codebase in Visual Studio that requires a better C compiler?" Use gcc or clang. However, this is not a discussion forum and I'm not up for a religious (sic!) discussion. There is enough reading material about the differences between the two languages and I will not repeat it. I recommend you delete these OT comments. – too honest for this site Oct 27 '15 at 00:24
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/93428/discussion-between-math4tots-and-olaf). – math4tots Oct 27 '15 at 00:36

2 Answers

7

It's undefined behavior. Python relied on this before and had to fix that (see PEP 3123). If you have struct Y include struct X as its first member, you can use that for a similar effect:

struct Y {
    struct X a;
    char c;
};

struct Y y;
struct X *x = (struct X *) &y; /* legal! */
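
As a minimal sketch of how the embedding approach might be used in practice, the following fills a struct Y and reads it back through a struct X pointer. The helper names `x_get` and `demo_embedded` are hypothetical and not part of the original answer.

```c
#include <stddef.h>

struct X {
    int x;
    double *y;
};

/* struct Y embeds struct X as its first member. */
struct Y {
    struct X a;
    char c;
};

/* Hypothetical helper: accepts any object that starts with a struct X. */
int x_get(const struct X *p)
{
    return p->x;
}

/* Hypothetical demo: writes through struct Y, reads through struct X *. */
int demo_embedded(void)
{
    struct Y y;

    y.a.x = 42;
    y.a.y = NULL;
    y.c = 'c';

    /* A pointer to a struct, suitably converted, points to its first member. */
    return x_get((struct X *)&y);
}
```

Calling `demo_embedded()` from `main` should return the value stored in `y.a.x`. Writing `&y.a` instead of the cast gives the same pointer without any conversion at all, which is arguably the cleaner choice when the member name is known.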

There's also a special case for unions:

struct X {
  int x;
  double *y;
};

struct Y {
  int a;
  double *b;
  char c;
};

union XY {
    struct X x;
    struct Y y;
};

union XY xy;
xy.x.x = 0;
printf("%d\n", xy.y.a); /* legal! */

In later versions of the C standard, the compiler is only required to handle X and Y objects aliasing each other if a union definition like this is actually in scope, but in C89, it mostly has to assume such a definition exists somewhere. That still doesn't make it safe to cast a struct Y * to a struct X *, though; if the compiler knows that a specific struct Y is not part of a union, it may still assume that a struct X * can't possibly alias it.
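
A small sketch of the union case, with the union definition declared before the code that depends on it. The function name `demo_union` is an assumption made for illustration only.

```c
struct X {
    int x;
    double *y;
};

struct Y {
    int a;
    double *b;
    char c;
};

/* The complete union type is visible to all code below this point. */
union XY {
    struct X x;
    struct Y y;
};

/* Hypothetical demo: returns 1 if both shared members read back correctly. */
int demo_union(void)
{
    union XY xy;
    double d = 1.5;

    /* Write through the struct X member of the union. */
    xy.x.x = 7;
    xy.x.y = &d;

    /* int/double * form the common initial sequence of X and Y, so the
       corresponding members may be inspected through the struct Y member. */
    return xy.y.a == 7 && xy.y.b == &d;
}
```

Only the common initial sequence is covered here: `xy.y.c` has no counterpart in struct X, so reading it after writing only `xy.x` would yield an indeterminate value.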

user2357112
  • Cool thank you this is perfect! I was expecting that my ideal answer would reference the C standard, but to be honest I trust the Python guys to really know their thing, and if they say it's undefined behavior in an official PEP, I'm pretty inclined to trust them. – math4tots Oct 26 '15 at 20:25
  • The standard declares two structs "compatible" if they have the same members in the same order (but indeed requests same names, too ... overlooked that at first) ... still doubt *this* interpretation. –  Oct 26 '15 at 20:26
  • @FelixPalmen: For `struct X` and `struct Y` to be compatible, `struct X` would need to have that `char` member at the end. – user2357112 Oct 26 '15 at 20:32
  • @user2357112 of course, but 6.5p7 allows access through a pointer where only members are compatible. But nevertheless, your interpretation is the one leaving no doubts at all, so I updated my answer. –  Oct 26 '15 at 20:35
  • @FelixPalmen, whether the member names are the same doesn't even matter here. The two struct types in question are not compatible because they have different numbers of members. – John Bollinger Oct 26 '15 at 20:39
  • @JohnBollinger that's not my point, I only say `struct X` is compatible with the first members of `struct Y`. But good point in your answer hinting about alignment, so it's really only the first member you can take as "safe" ... –  Oct 26 '15 at 20:41
  • Aliasing works based on the type of the lvalue doing the aliasing and the type of the object being aliased, so `struct X x; ((struct Y *)&x)->a = 5;` is not an aliasing violation because it is OK to alias `int` as `int`. However it would not be OK to write `*(struct Y *)&x = y;` – M.M Oct 26 '15 at 20:44
  • @FelixPalmen, C has no concept of one type being compatible with only a portion of another, except in the trivial sense of one type being compatible with the type of a single member of a `struct` or `union`. In that sense, your suggestion to embed a `struct X` inside `struct Y` is a good one. – John Bollinger Oct 26 '15 at 20:46
  • The problem addressed by the Python article is more of an aliasing issue than a layout issue. Accessing the same object via pointers to 2 different incompatible object types poses a problem as the compiler is allowed to assume that the 2 pointers cannot point to the same object, except for a few exceptional cases. This is not exactly what the OP is asking, but it is a good hint at the kind of hard-to-find bugs he might face. Hard bordering on impossible. – chqrlie Oct 26 '15 at 21:10
  • "In later versions of the C standard" why don't you just say "Since C99"? – Z boson Oct 27 '15 at 12:10
2

C89 permits the conversion, by cast, of a pointer to one object type to a pointer to a different object type.

Before we even get to dereferencing, however, the converted pointer is guaranteed to be valid at all only if the referenced type of the original pointer has required alignment at least as strict as that of the type referenced by the result pointer. Alignment is entirely implementation-dependent. In your particular example, it is likely, but not guaranteed, that the two struct types will have the same alignment requirement.
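
Because alignment is decided by the implementation, one way to find out what a given compiler does is to ask it. The sketch below assumes a C11 compiler, since `_Alignof` does not exist in C89, and the function name `y_aligned_for_x` is hypothetical.

```c
struct X {
    int x;
    double *y;
};

struct Y {
    int a;
    double *b;
    char c;
};

/* Hypothetical check: returns 1 if every correctly aligned struct Y
   object is also correctly aligned for struct X on this implementation.
   Alignments are powers of two, so "at least as strict" is enough. */
int y_aligned_for_x(void)
{
    return _Alignof(struct Y) >= _Alignof(struct X);
}
```

On mainstream implementations this is expected to return 1, but that is a property of those implementations, not something the standard promises. A `_Static_assert` on the same expression would turn the check into a compile-time failure instead.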

Supposing that the conversion does produce a valid pointer, the standard does not define precisely how struct members are laid out inside the representation of a struct object. Members must appear in the same order as in the struct definition, and there must not be any padding before the first one, but no other details are defined. In your example, therefore, it is guaranteed that X->x will correspond to Y->a (supposing, again, that the converted pointer is valid in the first place), but undefined whether X->y will correspond to Y->b.
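
The layout question can likewise be checked for a particular implementation with `offsetof`, which is available in C89. This is only a sketch, and `layouts_match` is a hypothetical name.

```c
#include <stddef.h>

struct X {
    int x;
    double *y;
};

struct Y {
    int a;
    double *b;
    char c;
};

/* Hypothetical check: returns 1 if the first two members of struct X and
   struct Y sit at the same offsets on this implementation. */
int layouts_match(void)
{
    return offsetof(struct X, x) == offsetof(struct Y, a)
        && offsetof(struct X, y) == offsetof(struct Y, b);
}
```

A result of 1 only says that the bytes line up with this compiler and these settings. It does not, by itself, make access through the converted pointer well defined with respect to the aliasing concerns raised in the other answer.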

John Bollinger
  • How can adding a `char` member possibly increase the struct alignment requirement? The committee should give up trying to support the DS9K. – chqrlie Oct 26 '15 at 20:59
  • @chqrlie, it's not about whether adding a `char` member *should* increase the alignment requirement, it's about whether it is *permitted* to do so. It is. Nevertheless, as a not-too-contrived example, consider a system that chooses to align `struct`s with 8 bytes worth of members on 8-byte boundaries, but to align larger `struct`s on 16-byte boundaries. If such a system runs on a machine on which `int` and `double *` are both 32 bits wide, then exactly the OP's example would present a different alignment requirement for his two `struct` types. – John Bollinger Oct 26 '15 at 22:13
  • For what it's worth, though, alignment is implementation-defined, not unspecified or undefined. That means you can rely on conforming implementations to specify, and if you are writing for a specific implementation then your code can rely on what that implementation specifies. – John Bollinger Oct 26 '15 at 22:18
  • The OP asks specifically whether a `struct Y` object can be accessed via a `struct X` pointer, not the other way around. The potentially increased alignment of Y objects should not pose a problem unless the architecture is really exotic, such as the DS9K of course, but older Crays had weird quirks too. – chqrlie Oct 26 '15 at 23:57
  • @chqrlie, either way around, C does not require that the objects have alignment requirements that make the conversion yield a valid pointer. That's anyway only half the question -- since C also doesn't require layouts to match up beyond the first element, there is no guarantee (from C) that what the OP wants to do will work. – John Bollinger Oct 27 '15 at 00:01
  • That's true, and it is a regrettable departure from the principle of least surprise. – chqrlie Oct 27 '15 at 00:16