Here's my test program in Compiler Explorer using clang.
And here is the same program in OnlineGDB using gcc.
The sizeof(_16Bytes) is 16 as expected; however, offsetof(Test, test) is 12 because the compiler decided to pack it right after _16Bytes::_4bytes.
This actually makes perfect sense to me, based on standard packing rules, assuming that struct Test : _16Bytes turns into the equivalent of this:
struct Test
{
    uint64_t _8bytes; // inherited from `struct _8Bytes`
    uint32_t _4bytes; // inherited from `struct _16Bytes`
    uint8_t test;     // directly part of `struct Test`
};
That's because this struct is naturally packed, since its members are arranged in order from largest to smallest data type. The uint64_t requires 8-byte alignment and already has it; the uint32_t requires 4-byte alignment and already has it; and the uint8_t requires 1-byte alignment and already has it. Therefore, the only padding required gets added to the very end, like this:
struct Test
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t test;
    // 3 bytes of padding to force 8-byte alignment of the whole struct
}; // struct is 16 bytes total
So, I expect sizeof(_16Bytes) to be 16, and offsetof(Test, test) to be 12.
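The natural-packing reasoning above can be checked at compile time. Here is a minimal sketch, assuming a typical x86-64 ABI where uint64_t is 8-byte aligned. Test2 is the flattened struct with no inheritance, and tail_padding_of_test2 is a hypothetical helper name I made up for illustration:

```cpp
#include <cstddef>  // offsetof, size_t
#include <cstdint>  // uint8_t, uint32_t, uint64_t

// Flattened equivalent of the struct discussed above (no inheritance involved).
struct Test2
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t test;
    // the compiler adds 3 bytes of tail padding here
};

// Assumption: typical x86-64 ABI. Other targets may lay this out differently.
static_assert(offsetof(Test2, _8bytes) == 0, "_8bytes starts the struct");
static_assert(offsetof(Test2, _4bytes) == 8, "_4bytes follows the 8-byte member");
static_assert(offsetof(Test2, test) == 12, "test follows the 4-byte member");
static_assert(sizeof(Test2) == 16, "size is rounded up past the last member");

// Hypothetical helper: number of padding bytes after the last member.
constexpr std::size_t tail_padding_of_test2()
{
    return sizeof(Test2) - (offsetof(Test2, test) + sizeof(uint8_t));
}
```

Since Test2 is a standard-layout type, offsetof is well-defined for it, so these checks compile without the invalid-offsetof warning.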
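However, it is wrong to assume that struct Test : _16Bytes turns into the struct above. Apparently, using offsetof here is undefined behavior, because a struct that declares data members both in itself and in a base class is not a standard-layout type. (Since C++17 the standard calls offsetof on such types "conditionally-supported" rather than undefined, but the result is still not portable.) clang shows this invalid-offsetof warning on Compiler Explorer: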
Output of x86-64 clang 13.0.0 (Compiler #1):
<source>:61:44: warning: offset of on non-standard-layout type 'struct Test' [-Winvalid-offsetof]
printf("offsetof(Test, test) = %lu\n", offsetof(struct Test, test));
...and gcc shows this warning for the same line on OnlineGDB:
main.cpp:61:53: warning: offsetof within non-standard-layout type ‘Test’ is undefined [-Winvalid-offsetof]
printf("offsetof(Test, test) = %lu\n", offsetof(struct Test, test));
The gcc output makes it clearer: "offsetof within non-standard-layout type ‘Test’ is undefined".
Note: clang, by design, tries to be gcc-compatible. See here: https://clang.llvm.org/ --> End User Features --> "GCC compatibility."
You said:
then all the sudden offsetof(Test, test) is 16. This makes absolutely zero sense to me - can somebody explain what is going on?
Therefore, when you make the change to struct _16Bytes and see that offsetof(Test, test) becomes the anomalous value of 16, that is also undefined behavior. It has no guaranteed or predictable result we can analyze, other than by looking at the specifics of the clang compiler, which is of limited use, since the standard does not define the behavior and it could change at any moment anyway. So, I think you must avoid the undefined behavior and not use inheritance if you need to read an offset.
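To see which of these types the warning applies to, you can ask the compiler directly. This is a minimal sketch using std::is_standard_layout from <type_traits>. The helper name offsetof_is_portable is hypothetical, just something I made up for illustration:

```cpp
#include <cstdint>      // uint8_t, uint32_t, uint64_t
#include <type_traits>  // std::is_standard_layout

struct _8Bytes { uint64_t _8bytes; };
struct _16Bytes : _8Bytes { uint32_t _4bytes; };
struct Test : _16Bytes { uint8_t test; };
struct Test2 { uint64_t _8bytes; uint32_t _4bytes; uint8_t test; };

// Hypothetical helper: true when T is standard-layout, which is the
// condition under which the standard guarantees offsetof(T, member).
template <typename T>
constexpr bool offsetof_is_portable()
{
    return std::is_standard_layout<T>::value;
}

// Data members in a single class only: standard-layout.
static_assert(offsetof_is_portable<_8Bytes>(), "only one class declares members");
static_assert(offsetof_is_portable<Test2>(), "flat struct, no inheritance");

// Data members declared in both base and derived: NOT standard-layout.
static_assert(!offsetof_is_portable<_16Bytes>(), "members in base and derived");
static_assert(!offsetof_is_portable<Test>(), "members in base and derived");
```

A static_assert like this next to each offsetof call turns the warning into a hard compile error, which may be preferable if you rely on the offsets.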
This is extremely annoying and I would like to disable this behavior in the first place, however so be it.
More importantly, is there a way to disable this annoying packing behavior?
You have to lay out your structs manually, however you'd like them to be. I'm not sure what you'd like. Do you want the uint8_t to have an offset of 15 instead of 12? If so, do this:
struct Test4
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t padding[3]; // explicitly place 3 bytes of padding
    uint8_t test;
}; // struct is 16 bytes total
Now, sizeof(Test4) is 16, and offsetof(Test4, test) is 15 instead of 12.
Note that depending on what you are trying to accomplish, you may need to forcefully remove all automatic padding by adding __attribute__ ((__packed__)) just after the word struct and just before the struct name.
Example: this very non-standard padding, in conjunction with the packed attribute, allows struct __attribute__ ((__packed__)) Test5 to have a size of 16 and offsetof(Test5, test) to still be 15:
struct __attribute__ ((__packed__)) Test5
{
    uint8_t padding[3]; // explicitly place 3 bytes of padding
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t test;
};
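The claims about Test4 and Test5 can be verified at compile time too. This is a minimal sketch, assuming gcc or clang (the packed attribute is a compiler extension, not standard C++):

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // uint8_t, uint32_t, uint64_t

struct Test4
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t padding[3]; // explicit padding replaces the implicit tail padding
    uint8_t test;
};

struct __attribute__ ((__packed__)) Test5
{
    uint8_t padding[3]; // explicit padding at the front
    uint64_t _8bytes;   // lands at offset 3, so it is NOT naturally aligned
    uint32_t _4bytes;
    uint8_t test;
};

// Explicit padding: `test` is pushed to the last byte of the struct.
static_assert(offsetof(Test4, padding) == 12, "padding follows _4bytes");
static_assert(offsetof(Test4, test) == 15, "test is the last byte");
static_assert(sizeof(Test4) == 16, "no implicit padding is left to add");

// Packed: members sit back to back with no implicit padding at all.
static_assert(offsetof(Test5, _8bytes) == 3, "misaligned 8-byte member");
static_assert(offsetof(Test5, _4bytes) == 11, "follows _8bytes directly");
static_assert(offsetof(Test5, test) == 15, "test is the last byte");
static_assert(sizeof(Test5) == 16, "3 + 8 + 4 + 1 bytes");
static_assert(alignof(Test5) == 1, "packed structs get 1-byte alignment");
```

Note the cost in Test5: _8bytes and _4bytes are no longer naturally aligned, so the compiler has to generate unaligned accesses for them.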
Also take a look at the alignas() specifier in C++ to see if it can be used to create your desired effect. And look into #pragma pack.
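Here is a rough sketch of both tools, assuming gcc or clang on a typical x86-64 ABI. The struct names AlignedTail and PackedPair are hypothetical, made up for this illustration. alignas is standard C++11, while #pragma pack is a widely supported compiler extension:

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // uint8_t, uint32_t, uint64_t

// alignas on a member raises that member's alignment requirement,
// so the compiler inserts padding in front of it.
struct AlignedTail
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    alignas(8) uint8_t test; // pushed from offset 12 to the next 8-byte boundary
};

static_assert(offsetof(AlignedTail, test) == 16, "test starts a new 8-byte slot");
static_assert(sizeof(AlignedTail) == 24, "17 bytes rounded up to a multiple of 8");

// #pragma pack(push, 1) ... #pragma pack(pop) removes implicit padding for
// the structs defined in between, then restores the previous setting.
#pragma pack(push, 1)
struct PackedPair
{
    uint8_t tag;
    uint32_t value; // placed at offset 1 instead of 4
};
#pragma pack(pop)

static_assert(offsetof(PackedPair, value) == 1, "no padding after tag");
static_assert(sizeof(PackedPair) == 5, "1 + 4 bytes, no tail padding");
static_assert(alignof(PackedPair) == 1, "pack(1) also lowers the alignment to 1");
```

Using push and pop rather than a bare #pragma pack(1) keeps the setting from leaking into headers included afterwards.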
Keep in mind you can also achieve certain desired results by including a struct as a member of another struct directly, rather than by using inheritance as you have done. Including one struct in another keeps the outer struct standard-layout and so avoids the undefined behavior, while using inheritance and expecting certain offset results invokes it.
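A minimal sketch of the composition approach, assuming a typical x86-64 ABI (TestComposed is a hypothetical name I made up). Because the inner struct is a member rather than a base class, the outer type stays standard-layout and offsetof is well-defined:

```cpp
#include <cstddef>      // offsetof
#include <cstdint>      // uint8_t, uint32_t, uint64_t
#include <type_traits>  // std::is_standard_layout

struct _16Bytes2
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    // 4 bytes of tail padding on x86-64
};

// Composition: the 16-byte struct is a member, not a base class.
struct TestComposed
{
    _16Bytes2 base;
    uint8_t test;
};

static_assert(std::is_standard_layout<TestComposed>::value,
              "offsetof is well-defined for this type");

// A member's tail padding is never reused, so `test` starts right after
// the full sizeof(_16Bytes2), not inside its padding.
static_assert(offsetof(TestComposed, test) == sizeof(_16Bytes2),
              "test follows the whole inner struct");
static_assert(offsetof(TestComposed, test) == 16, "x86-64 assumption");
static_assert(sizeof(TestComposed) == 24, "17 bytes rounded up to a multiple of 8");
```

The trade-off is size: the composed struct is 24 bytes here rather than 16, since the padding inside the member cannot hold the trailing uint8_t.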
Lastly, you can consider writing Python scripts to autogenerate C++ code which produces any necessary struct definitions and handles padding, packing, alignment, and serialization concerns for you. You can define packets in YAML (preferred, in my opinion) or JSON files. Using Python to autogenerate C or C++ is pretty common practice, I think. I show how to import yaml files in Python here. That said, avoid autogenerating C or C++ with Python unless you really need it, as it may end up creating more code complexity, complicated abstraction, and confusion for fellow developers in an attempt to create less. But that's for you to decide based on your total situation, use-case, and architecture.
Here is my final and full test code:
https://onlinegdb.com/nfp19v8m3
/******************************************************************************
GS
15 Jan. 2022
See: https://stackoverflow.com/questions/70727668/bizarre-struct-member-packing-in-32-bit-clang
*******************************************************************************/
#include <cstddef> // offsetof, size_t
#include <cstdint> // uint8_t, uint32_t, uint64_t
#include <cstdio>  // printf

struct _8Bytes
{
    uint64_t _8bytes;
};
struct _16Bytes : _8Bytes
{
    uint32_t _4bytes;
};
struct _16Bytes2
{
    uint64_t _8bytes;
    uint32_t _4bytes;
};
struct Test : _16Bytes
{
    uint8_t test;
};
struct Test2
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t test;
};
struct Test3 : _16Bytes2
{
    uint8_t test;
};
struct Test4
{
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t padding[3]; // explicitly place 3 bytes of padding
    uint8_t test;
};
struct __attribute__ ((__packed__)) Test5
{
    uint8_t padding[3]; // explicitly place 3 bytes of padding
    uint64_t _8bytes;
    uint32_t _4bytes;
    uint8_t test;
};

int main()
{
    // sizeof and offsetof yield size_t, so print them with %zu
    printf("sizeof(_8Bytes) = %zu\n", sizeof(_8Bytes));
    printf("sizeof(_16Bytes) = %zu\n", sizeof(_16Bytes));
    printf("sizeof(_16Bytes2) = %zu\n", sizeof(_16Bytes2));
    printf("sizeof(Test) = %zu\n", sizeof(Test));
    printf("sizeof(Test2) = %zu\n", sizeof(Test2));
    printf("sizeof(Test3) = %zu\n", sizeof(Test3));
    printf("sizeof(Test4) = %zu\n", sizeof(Test4));
    printf("sizeof(Test5) = %zu\n", sizeof(Test5));
    printf("\n");
    // Test and Test3 are not standard-layout, so these two lines trigger
    // the -Winvalid-offsetof warning
    printf("offsetof(Test, test) = %zu\n", offsetof(Test, test));
    printf("offsetof(Test2, test) = %zu\n", offsetof(Test2, test));
    printf("offsetof(Test3, test) = %zu\n", offsetof(Test3, test));
    printf("offsetof(Test4, test) = %zu\n", offsetof(Test4, test));
    printf("offsetof(Test5, test) = %zu\n", offsetof(Test5, test));
    return 0;
}
References:
- https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html - official gcc documentation for __attribute__ ((__packed__)), which clang also supports. Search the page for "packed".
- Default inheritance access specifier - I have never seen inheritance done without specifying the access specifier (public, protected, or private). I had to find that here.
See also:
- https://en.cppreference.com/w/cpp/language/alignof
- https://en.cppreference.com/w/cpp/language/alignas
- *****What is the difference between "#pragma pack" and "__attribute__((aligned))" - in short,
#pragma pack(1) // set packing AND alignment to 1
// place struct definition here
#pragma pack() // unset packing AND alignment
type syntax is more restrictive than gcc attribute syntax, and is essentially equivalent to __attribute__((packed,aligned(1))), which is NOT necessarily what you want, since you probably want the struct packed to 1-byte but NOT aligned to 1-byte!
- See also: Anybody who writes #pragma pack(1) may as well just wear a sign on their forehead that says “I hate RISC” <-- DON'T BE THAT PERSON! So, just use __attribute__ ((__packed__)) instead of #pragma pack(1)!
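To illustrate the point about keeping packing and alignment separate, here is a small sketch assuming gcc or clang. The struct names are hypothetical. With attribute syntax you can pack the members tightly and still choose the alignment of the struct as a whole:

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // uint8_t, uint32_t

// Packed members AND 1-byte struct alignment, spelled out explicitly.
struct __attribute__((packed, aligned(1))) PackedAligned1
{
    uint8_t tag;
    uint32_t value;
};

// Packed members, but the struct as a whole keeps 4-byte alignment.
struct __attribute__((packed, aligned(4))) PackedAligned4
{
    uint8_t tag;
    uint32_t value;
};

// Member layout is identical: no padding between tag and value.
static_assert(offsetof(PackedAligned1, value) == 1, "members are packed");
static_assert(offsetof(PackedAligned4, value) == 1, "members are packed");

// Only the alignment and the resulting tail padding differ.
static_assert(alignof(PackedAligned1) == 1, "struct may start at any address");
static_assert(sizeof(PackedAligned1) == 5, "1 + 4 bytes, no tail padding");
static_assert(alignof(PackedAligned4) == 4, "struct starts on a 4-byte boundary");
static_assert(sizeof(PackedAligned4) == 8, "5 bytes rounded up to a multiple of 4");
```

With the higher struct alignment, the compiler knows where each instance starts, which can let it generate better code for member accesses on strict-alignment targets.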