What is the correct way to access packed struct's members?

struct __attribute__ ((packed)) MyData {
    char ch;
    int i;
};
void g(int *x); // do something with x

void foo(MyData m) {
    g(&m.i);
}

void bar(MyData m) {
    int x = m.i;
    g(&x);
}

My IDE warns that foo might be taking a misaligned int pointer, which is indeed the case here. My questions are:

  • Between foo and bar, is one approach better than the other?
  • Is it incorrect to access misaligned pointer data but okay to use it to initialize a properly aligned type? (as in bar).
  • Should we copy packed struct individual members to properly aligned data structure and then use it? That would imply that for almost every packed data structure there is a non-packed data structure and packed structure remains confined to serialization layer.
Adrian Mole
DDG
  • The compiler may use all sorts of fancy tricks to implement the `int x = m.i;` line. Maybe it only accesses the `i` member as an array of `char`, and builds the `int` from that? – Adrian Mole Feb 04 '22 at 00:54
  • yes, compiler can certainly use memcpy here which is the right thing to do. But my question is different. – DDG Feb 04 '22 at 00:59
  • I don't really get your last bullet point. Why would you want to create a whole new (non-packed) structure, just to access one member of the packed version? – Adrian Mole Feb 04 '22 at 01:19
  • Using one member is just an example. In case a struct has 10 members, then the issue of misaligned access can arise for any member for a struct. – DDG Feb 04 '22 at 04:23

2 Answers

As written, g is entitled to assume that x is properly aligned, and foo passes it a pointer that breaks that assumption. This will probably be fine on x86 architectures, but foo might crash on ARM.
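To see why &m.i cannot be assumed aligned, here is a minimal sketch that inspects the layout of the struct from the question. It assumes GCC or Clang (for the packed attribute) and a platform where alignof(int) is greater than 1; the helper name i_is_guaranteed_aligned is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the struct in the question (GCC/Clang extension).
struct __attribute__((packed)) MyData {
    char ch;
    int i;
};

// Hypothetical helper: true only if every MyData object is guaranteed to
// place its member i at an address suitable for an int.
constexpr bool i_is_guaranteed_aligned() {
    return offsetof(MyData, i) % alignof(int) == 0 &&
           alignof(MyData) >= alignof(int);
}

// With packing there is no padding after ch, so i starts at byte offset 1.
static_assert(offsetof(MyData, i) == 1, "i directly follows ch");
static_assert(sizeof(MyData) == 1 + sizeof(int), "no padding in the struct");
```

On such a platform i_is_guaranteed_aligned() returns false, which is the situation the IDE warning is pointing at.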

Calling it as in bar gains little over having g take an int x by value. It is correct, however: the compiler knows that m.i may be misaligned, so it can generate code that copies a misaligned int. The cost is that g can no longer modify the original object through the pointer (unless you copy x back into m.i afterwards).
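If g does need to modify the member, the bar approach can be extended with a write-back. This is only a sketch: the body of g is a stand-in (the question does not show it), bar_writeback is a made-up name, and the packed attribute assumes GCC or Clang.

```cpp
#include <cassert>
#include <cstdint>

struct __attribute__((packed)) MyData {
    char ch;
    int i;
};

// Stand-in for g from the question; assumed here to modify *x.
void g(int *x) { *x += 1; }

// Copy the packed member out, let g work on an aligned int, copy it back.
void bar_writeback(MyData &m) {
    int x = m.i;  // direct member access: the compiler emits a safe unaligned load
    // x is an ordinary local, so its address is suitably aligned for int.
    assert(reinterpret_cast<std::uintptr_t>(&x) % alignof(int) == 0);
    g(&x);
    m.i = x;      // direct member access again: a safe unaligned store
}
```

Both the load and the store go through m.i directly, so no misaligned int pointer is ever formed.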

You can also declare a type for a misaligned integer and have g take a pointer to that:

typedef int __attribute__((aligned(1))) packed_int;
void g(packed_int * x); // do something with x

This can be called directly as g(&m.i). Be warned that accesses through packed_int cannot assume alignment, which leads to slowdowns on some platforms.

Artyer

Is it incorrect to access misaligned pointer data but okay to use it to initialize a properly aligned type? (as in bar).

As far as the C++ language is concerned, there is no such thing as a packed class, nor any such thing as an improperly aligned object. Hence, an improperly aligned pointer is necessarily invalid.

Whether a compiler that provides packed classes as a language extension also extends the language to allow access through misaligned pointers is up to the compiler vendor to document. The warning implies that the latter extension might not be supported.

Between foo and bar, is one approach better than the other?

bar, as per the warning.

Should we copy packed struct individual members to properly aligned data structure and then use it? That would imply that for almost every packed data structure there is a non-packed data structure and packed structure remains confined to serialization layer.

That could be a convenient solution to confine the non-standard packed classes into the serialisation layer.
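A minimal sketch of that split, assuming GCC or Clang for the packed attribute. The names MyDataPacked, MyDataPlain, from_bytes and to_bytes are made up for illustration; only the boundary functions ever touch the packed type.

```cpp
#include <cassert>
#include <cstring>

// Packed layout, used only inside the serialisation layer.
struct __attribute__((packed)) MyDataPacked {
    char ch;
    int i;
};

// Ordinary, naturally aligned struct used by the rest of the program.
struct MyDataPlain {
    char ch;
    int i;
};

// Read a packed record from a byte buffer and convert it member by member.
MyDataPlain from_bytes(const unsigned char *bytes) {
    assert(bytes != nullptr);
    MyDataPacked p;
    std::memcpy(&p, bytes, sizeof p);  // memcpy has no alignment requirement
    return MyDataPlain{p.ch, p.i};     // members are copied by value
}

// Convert to the packed layout and write it into a byte buffer of at
// least sizeof(MyDataPacked) bytes.
void to_bytes(const MyDataPlain &m, unsigned char *out) {
    assert(out != nullptr);
    MyDataPacked p{m.ch, m.i};
    std::memcpy(out, &p, sizeof p);
}
```

Code outside these two functions can then take &m.i on a MyDataPlain freely, since that member is aligned as usual.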

Note that this isn't the only problem with packed structs. Another problem is portability of serialised data between systems due to different byte orders and sizes of types.

A portable way to serialise data is to not use packed structs at all, but rather shift bytes individually using explicit offsets.
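As a rough sketch of that approach, the functions below write the two fields at explicit offsets. The wire layout (ch at offset 0, i as a 32-bit little-endian value at offsets 1 to 4) and all names here are assumptions chosen for illustration, not something specified in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain in-memory struct; a fixed-width integer keeps the wire size stable.
struct MyDataPlain {
    char ch;
    std::int32_t i;
};

// Assumed wire format: 1 byte for ch, then 4 bytes for i, little-endian.
constexpr std::size_t kWireSize = 5;

void serialize(const MyDataPlain &m, unsigned char *out) {
    assert(out != nullptr);
    const std::uint32_t u = static_cast<std::uint32_t>(m.i);
    out[0] = static_cast<unsigned char>(m.ch);
    out[1] = static_cast<unsigned char>(u & 0xFFu);          // lowest byte first
    out[2] = static_cast<unsigned char>((u >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>((u >> 16) & 0xFFu);
    out[4] = static_cast<unsigned char>((u >> 24) & 0xFFu);
}

MyDataPlain deserialize(const unsigned char *in) {
    assert(in != nullptr);
    const std::uint32_t u = static_cast<std::uint32_t>(in[1]) |
                            (static_cast<std::uint32_t>(in[2]) << 8) |
                            (static_cast<std::uint32_t>(in[3]) << 16) |
                            (static_cast<std::uint32_t>(in[4]) << 24);
    MyDataPlain m;
    m.ch = static_cast<char>(in[0]);
    m.i = static_cast<std::int32_t>(u);
    return m;
}
```

Because the byte order and field widths are fixed by the code rather than by the host, the same buffer decodes identically on little-endian and big-endian machines, and no misaligned access occurs.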

eerorika
  • So having two data structure (one packed and one non-packed) to represent same entity is common? I know we can achieve serialization by using one data structure only and serialize and deserialize individual members but that would be inefficient compared to memcpy call that we can make with POD packed struct. – DDG Feb 08 '22 at 13:35
  • @DDG Accessing the unaligned members is inefficient too; you've just transferred the inefficiency elsewhere. That's potentially worse if the members are accessed many times per object. Besides, most serialisation is for purpose of file writing or network access, which have much more overhead compared to a few CPU instructions. – eerorika Feb 08 '22 at 17:07