2

Say I have two structs, object and widget:

struct object {
    int field;
    void *pointer;
};
struct widget {
    int field;
    void *pointer;
};

And a function:

void consume(struct object *obj)
{
    printf("(%i, %p)\n", obj->field, obj->pointer);
}

I'm aware that if I do:

struct widget wgt = {3, NULL};
consume(&wgt);

I would violate the strict aliasing rule, and thus invoke undefined behaviour.

As far as I understand, the undefined behaviour results from the fact that the compiler may lay out the two structs differently: that is, it may insert padding to align fields with address boundaries (but never reorder the fields, since the standard guarantees that declaration order is preserved).

But what if the two structs are packed? Will they have the same memory layout? Or, in other words, does the above call to consume() still have undefined behaviour (despite the persistent compiler warning)?

Note: I used struct __attribute__((__packed__)) object { ... }; for packing (GCC).

Barmar
Zakk
  • `packed` is not part of the C standard at all, so you need to be looking at the GCC documentation if any. – Nate Eldredge Jul 27 '22 at 15:17
  • " .. the undefined behaviour results from the fact that the compiler may align the struct fields differently" --> I would say the UB comes from the compiler assuming that changes to an object of one type do not change objects of other types, not from layout issues. Zakk, what issue is the code trying to avoid? – chux - Reinstate Monica Jul 27 '22 at 15:17
  • Given the way struct declarations and separate compilation work, it just about has to be the case that identical structs are identical, but that's not the same as a language guarantee. (Do you want to add the language-lawyer tag here?) – Steve Summit Jul 27 '22 at 15:19
  • @chux-ReinstateMonica If I get you correctly, do you mean, for example, adding/deleting/modifying some `object`'s fields? – Zakk Jul 27 '22 at 15:21
  • The TL;DR: You're fine even without the packed because the structs are identical. It _may_ be a technical violation of some clause in the standard but it's a "safe" one. But, I infer from the names `object` and `widget` that you are trying to do a "generic"? Here's an answer of mine that may help: [Writing a 'generic' struct-print method in C](https://stackoverflow.com/a/65621483/5382650) – Craig Estey Jul 27 '22 at 15:21
  • @CraigEstey I do not intend to do a generic. The code above is just an example to illustrate my problem. Anyway, thank you for pointing out to your answer. I'll definitely look at it. – Zakk Jul 27 '22 at 15:24
  • @CraigEstey: Nate Eldredge’s [sample code below](https://stackoverflow.com/a/73140815/298225) proves it is not a “safe” violation. – Eric Postpischil Jul 27 '22 at 15:40
  • @Zakk [Yes](https://stackoverflow.com/questions/73140420/are-packed-identical-structs-guaranteed-to-have-the-same-memory-layout?noredirect=1#comment129176999_73140420). Step back. Why do you want to call `consume(&wgt);` with a pointer to the wrong type? – chux - Reinstate Monica Jul 27 '22 at 17:20
  • @chux-ReinstateMonica I don't. I just wanted to understand if packing has any impact on memory layout. I'm not using such a code in any real program/library. – Zakk Jul 27 '22 at 17:54
  • @Zakk Packing has impact on memory layout, yet the issue of passing the wrong type is not a layout issue. – chux - Reinstate Monica Jul 27 '22 at 20:15

2 Answers

7

They will most likely have the same layout; that will be part of the compiler's ABI.

The relevant architecture and/or OS may have a standard ABI that may or may not include a specification for packed. But the compiler will have its own ABI to lay them out in a predictable fashion, although the algorithm may not be written down precisely anywhere except the compiler source code.

However, that does not mean your code is safe. The strict aliasing rule applies to pointers to different types, whether or not they have the same layout.

Here is an example that can be compiled with gcc -O2:

#include <stdio.h>

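// The attribute goes right after the struct keyword; placed before it,
// GCC ignores the attribute and the struct is not actually packed.
struct __attribute__((packed)) object {
    int field;
    void *pointer;
};

struct __attribute__((packed)) widget {
    int field;
    void *pointer;
};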

struct widget *some_widget;

__attribute__((noipa)) // prevent inlining which hides the bug
void consume(struct object *obj) 
{
    some_widget->field = 42;
    int val = obj->field;
    printf("%i\n", val);
}

int main(void) {
    struct widget wgt = {3, NULL};
    some_widget = &wgt;
    consume((struct object *)&wgt);
}


You are probably expecting this code to print 42, because some_widget and obj both point to wgt and thus val = obj->field should read the same int that was written by some_widget->field = 42. But in fact it prints 3. The compiler is allowed to assume that obj and some_widget do not alias, as they have different types; so the write and the read are considered independent and may be reordered.

On the level of the standard, you are accessing the object wgt, whose effective type is struct widget, through the lvalue *obj, whose type is struct object. These types are not compatible because they have different tags (widget vs object), and so the behavior is undefined.

Nate Eldredge
  • What if I had `void consume(void *typeless) { struct object *obj = typeless; printf("(%i, %p)\n", obj->field, obj->pointer); }` and then `consume(&wgt)`? Would the problem persist? I don't know if this qualifies to be asked separately, but for now I post it as a comment. – Zakk Jul 27 '22 at 22:15
  • @Zakk: Yes, the problem persists. You are still accessing `wgt` through a pointer, namely `obj`, to type `struct object`. It doesn't matter how many intermediate casts were involved in getting it there. It is easy to [check](https://godbolt.org/z/h4KEMdeWK) that gcc breaks that code just like the previous example. – Nate Eldredge Jul 28 '22 at 06:04
  • @Zakk FYI: about `__attribute__((noipa))`: here is well known [GCC bug related to aliasing](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502). Have a look (in case of time). – pmor Aug 01 '22 at 23:43
  • @pmor Thanks for pointing this out. The bug is still in "suspended" status til now. Quite an old one. I'm curious why they didn't fix it. – Zakk Aug 02 '22 at 09:35
  • @Zakk From [here](https://stefansf.de/post/pointers-are-more-abstract-than-you-might-expect/): "The finding has been reported in 2014 as bug #61502 and so far the GCC people _argue that this is not a bug_ and therefore won’t fix it." – pmor Aug 04 '22 at 01:51
5

“As far as I understand, the undefined behaviour results from the fact that the compiler may align the struct fields differently…”

No, it does not (solely). Even if two structures have identical member definitions, they are different types. Consider two types:

struct ComplexNumber  { double real, imag; };
struct GeometricPoint { double x, y;       };

which might be passed to some routine:

double foo(struct ComplexNumber *c, struct GeometricPoint *p)
…

Inside the function, code might assign some value to *p and use the value of *c, or vice-versa. Because these are different and incompatible types, the compiler is allowed to assume that they are not aliases for the same memory. That means, when optimizing, it can assume that assigning a value to *p will not change the value of *c, which the compiler might already be holding in registers from a previous use. Therefore, it does not need to reload the registers in case assigning to *p changed *c.

Thus the aliasing rule grants compilers license for this and similar behaviors and means that, if you violate the rule, the behavior is not defined, even if the structures have identical layouts.

“Note: I used struct __attribute__((__packed__)) object { ... }; for packing (GCC).”

Packing structures is a GCC extension. Given how GCC specifies the extension, you can expect identically defined packed structures to have identical memory layouts. However, the aliasing rules of the C standard still apply. GCC has a switch to turn off the requirements of the aliasing rule, -fno-strict-aliasing.

If you know two objects have identical layout and want to use one as the other without violating the aliasing rule, you can do this by:

  • Copying the bytes of one into the other, as with memcpy(p, c, sizeof *p);
  • Defining a union containing both types, initializing it with one type, and accessing the member of the other type. (This is defined by the C standard but not by the C++ standard.)
Eric Postpischil
  • @Zakk FYI: To my knowledge, Linux kernel is built using `-fno-strict-aliasing` (along with `-fno-delete-null-pointer-checks` and with `-fno-strict-overflow`). Perhaps the Linux code can be changed/revised/fixed, so these `no-` can be removed leading to perf. increase. – pmor Aug 01 '22 at 23:54