
Compiler: I'm personally using gcc, but the question is conceptual. I'm interested in options for any compiler.

Is there a way to tell the C compiler to make struct B have the same size as struct AB without sacrificing alignment?

It should also respect alignment when put into an array.

I've tried using __attribute__ ((__packed__, aligned(4))) but this seems to be the same as not using any attributes (the size is still rounded up to the alignment).

I don't understand how this isn't an obvious improvement: in certain cases it could save quite a bit of space for structs without sacrificing performance (or ergonomics) on field lookups. All it would require for the compiler is to store a (size, alignment) for each struct.

#include <stdio.h>
#include <stdint.h>

struct A { // total size: 6 bytes (actually 8)
  uint32_t a0; // 4 bytes
  uint16_t a1; // 2 bytes
};

struct B { // total size: 8 bytes (actually 12)
  struct A b0; // 6 bytes
  uint16_t b1; // 2 bytes
};

struct AB { // total size: 8 bytes (actually 8)
  uint32_t a0; // 4 bytes
  uint16_t a1; // 2 bytes
  uint16_t b1; // 2 bytes
};

// Kind of works, but sacrifices alignment
struct __attribute__ ((__packed__)) Ap {
  uint32_t a0; // 4 bytes
  uint8_t a1;  // 1 byte
};
struct __attribute__ ((__packed__)) Bp {
  struct Ap b0;
  uint16_t  b1;
};

int main(void) {
  // sizeof yields a size_t, so the matching conversion is %zu, not %u
  printf("sizeof(A)  = %zu\n", sizeof(struct A));  // 8  (not 6)
  printf("sizeof(B)  = %zu\n", sizeof(struct B));  // 12 (not 8)
  printf("sizeof(AB) = %zu\n", sizeof(struct AB)); // 8  (same as desired)
  printf("sizeof(Ap) = %zu\n", sizeof(struct Ap)); // 5  (as desired)
  printf("sizeof(Bp) = %zu\n", sizeof(struct Bp)); // 7  (not 8)
  return 0;
}

The way I've been actually doing this:

#define STRUCT_A  \
  uint32_t a0; \
  uint8_t a1

struct AB {
  STRUCT_A;    // 5 bytes of data, then 1 byte of padding
  uint16_t b1; // 2 bytes
};
Jose Manuel de Frutos
vitiral
  • This would be compiler-specific, so please add the tag for the compiler you are using. – Nate Eldredge Jun 11 '23 at 15:47
  • In this example, I guess you can make `Ap` packed and `Bp` not packed? Otherwise, the normal approach is to insert the padding yourself, as dummy members that you don't otherwise use. – Nate Eldredge Jun 11 '23 at 16:03
  • Keep in mind that the outer struct cannot place any of its other members within the padding of the inner struct, as a function modifying the inner struct through the pointer is allowed to overwrite the padding. So if `sizeof(struct A) == 8`, even though only 6 bytes are used, those 2 bytes are not available to be used for another member of `struct B`. – Nate Eldredge Jun 11 '23 at 16:05
  • @NateEldredge I added compiler question. – vitiral Jun 11 '23 at 16:11
  • For #2: I suppose you could... but then anywhere else you used `Ap` (like an array) you would need to be careful. Not ideal IMO. – vitiral Jun 11 '23 at 16:11
  • For #3: ouch, I actually hadn't realized that and it makes the question even more critical -- since I sometimes do convert `AB*` into `A*` in my code, so `*myA = {0};` could actually clear `b1`. – vitiral Jun 11 '23 at 16:11
  • You can do `struct B { union { struct A b0; struct { unsigned char _[6]; uint16_t b1; }; }; };` when you statically know that `actual_sizeof(struct A) == 6`. – KamilCuk Jun 11 '23 at 16:22
  • So then, I don't think you can have your cake and eat it too. If the inner struct has size less than 8, then making an array of them will require unaligned access. If it has size 8 or more, the size of the outer struct has to be at least 10, or 12 if unaligned accesses are to be avoided there. – Nate Eldredge Jun 11 '23 at 16:23
  • @KamilCuk: But then we have the same problem that modifying `b0` may unintentionally modify `b1`. And in your example, it's more obvious that trying to use both `b0` and `b1` violates the active member rule. – Nate Eldredge Jun 11 '23 at 16:25
  • You could do packed with explicit alignment to each member. `struct __attribute__ ((__packed__)) Bp { struct Ap b0; alignas(uint16_t) uint16_t b1; };` – KamilCuk Jun 11 '23 at 16:29
  • This seems like an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). What **problem** do you think packing structures will solve? – Andrew Henle Jun 11 '23 at 20:18
  • When exact memory layouts matter, like when modelling hardware registers or data protocols, structs simply aren't ideal. You'll either end up with non-portable code, or with structs that have to be manually serialized/deserialized. – Lundin Jun 12 '23 at 09:41
  • @NateEldredge "If the inner struct has size less than 8, then making an array of them will require unaligned access" Imagine a struct of 3 uint16_t -- it has size 6 (rounded to 8) but its alignment requirement is only 2. Putting them next to each other in an array would be fine. Putting a `uint16_t` after them in a struct (like I did) would also be fine. – vitiral Jun 12 '23 at 17:42
  • @AndrewHenle I'm creating a low-level language and choosing the memory layout for my "C like" structs and don't see why C is throwing away memory for no benefit! I'd like to be able to tell users how to make C behave better. – vitiral Jun 12 '23 at 17:42
  • @KamilCuk that's exactly what I was looking for! Thanks – vitiral Jun 12 '23 at 17:58
  • @Lundin I'd like the language I'm building to not be throwing away bytes for no reason but still be able to tell folks (or auto-generate) the struct definitions in C. – vitiral Jun 12 '23 at 18:02
  • @vitiral *... and don't see why C is throwing away memory for no benefit* Why do you think it's "for no benefit"? – Andrew Henle Jun 12 '23 at 18:52
  • Well, it uses more memory and doesn't help aligned access of any sub-fields. You can specify an identical "unrolled" struct which saves bytes and is equally performant and safe. What is the benefit? – vitiral Jun 12 '23 at 19:10
  • @vitiral *and is equally performant and safe*? Oh? [You sure about that](https://stackoverflow.com/questions/46790550/c-undefined-behavior-strict-aliasing-rule-or-incorrect-alignment/46790815)? Make sure you read [this comment](https://stackoverflow.com/questions/46790550/c-undefined-behavior-strict-aliasing-rule-or-incorrect-alignment/46790815#comment118827473_57326681) So no, it's **not** "equally performant" nor is it "equally ... safe". – Andrew Henle Jun 12 '23 at 19:49
  • @vitiral "The language" isn't throwing bytes away and it isn't for no reason. The specific compiler port is throwing bytes away and the good reason why is alignment, required by the CPU. As someone already explained, a sub struct inside another sub struct must be accessible stand-alone or you can't really use it as a C programmer would expect. One option is to work with byte streams and memcpy. Or arrays - instead of having one array of structs, have one array of `uint32_t`, and some other arrays with `uint16_t` and let the index be what ties them together. Common practice in embedded systems. – Lundin Jun 12 '23 at 20:43
  • @vitiral: If you have three `uint16_t` then its `sizeof` will actually be 6 and your problem wouldn't exist. I was talking about the actual struct in your example which has a `uint32_t` member. The rounding up of `struct` size isn't indiscriminate - it's rounded up to a multiple of the required alignment, which would normally be the alignment of the most restrictive member. – Nate Eldredge Jun 12 '23 at 20:55
  • @vitiral: The language can't express the idea of "size 6 and alignment 4" because it is a guarantee that size is a multiple of alignment. It's not just a matter of the compiler being stupid, it affects code too. If `sizeof(T)` is `n`, then it is a promise of the language that successive elements of an array of `T` are at intervals of exactly `n` bytes. You need this to correctly implement something like `qsort`. And so if that is to hold, then an object that requires `k` byte alignment must necessarily have a `sizeof` that is a multiple of `k`. – Nate Eldredge Jun 12 '23 at 21:01
  • @vitiral Also, read the response of the GCC developers to a bug requesting GCC support the unaligned accesses you characterized as "equally performant and safe": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93031 Especially this one: "we can't really make all data types unaligned since that would break the ABI". And note they're discussing x86 systems here. – Andrew Henle Jun 12 '23 at 21:14
  • @AndrewHenle, Lundin my wish/suggestion does not cause un-aligned access, that's the whole request -- ask the compiler to lay out memory to NOT throw away alignment while saving memory. For demonstration AB and B have the same "alignment safety" but B takes up 4 extra bytes. It DOES violate (or require change to) the rules that NateEldredge suggested regarding how arrays are laid out (see my comment to https://stackoverflow.com/a/76460498/1036670) – vitiral Jun 13 '23 at 17:41
  • @AndrewHenle I don't see how [this comment](https://stackoverflow.com/questions/46790550/c-undefined-behavior-strict-aliasing-rule-or-incorrect-alignment/46790815#comment118827473_57326681) is relevant, since the question is for the C compiler to layout structs tighter _without harming alignment_. – vitiral Jun 13 '23 at 17:47
  • @vitiral You've posted things like " I'd like to be able to tell users how to make C behave better." and "You can specify an identical "unrolled" struct which saves bytes and is equally performant and safe" Both of those statements are **wrong**. That comment demonstrates that dropping alignment requirements is not "equally performant", and the responses of the GCC maintainers to request to drop alignment requirements - "break the ABI" demonstrate that it's also not safe. Your entire premise that you can safely drop alignment requirements without performance impacts is just wrong, full stop – Andrew Henle Jun 13 '23 at 18:14
  • (cont) What happens when your implementation assumes an `int32_t` access is atomic, but that `int32_t` object spans a page boundary in virtual memory? The bytes of that simple `int32_t` object are going to be in wildly disparate, non-contiguous locations of physical memory with no way for any type of atomic access. And that's for something simple that supposedly has zero alignment requirements. – Andrew Henle Jun 13 '23 at 18:18
  • @AndrewHenle again, I'm not recommending dropping alignment requirements. I AM recommending changing how array index size is calculated. I'm simply proposing that C be able to lay out `B` in the same way it lays out `AB`. `AB` _doesn't drop alignment requirements_, contains the same data as `B` but is 4 bytes smaller. I find this wasteful. If `AB` is safe (which it is... it's standard C) then my proposal should be safe as long as it's accompanied by a new way to calculate index size. – vitiral Jun 13 '23 at 18:35
  • @vitiral *I AM recommending changing how array index size is calculated* Which can't be done without ignoring alignment requirements. *AB doesn't drop alignment requirements* Your incorrect claim does not make it true. *If AB is safe (which it is... it's standard C)* That's flat out wrong. It's not "standard C". Packed structures are an **extension** that is not strictly-conforming C. Read what the GCC developers said - it breaks the X86 ABI. Your limited experience with x86's lenient alignment enforcement seems to have caused you to conclude alignment is optional. It's not. – Andrew Henle Jun 13 '23 at 19:13
  • @vitiral [Read the x86-64 ABI](https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf): "Structures and unions assume the alignment of their most strictly aligned component. Each member is assigned to the lowest available offset with the appropriate alignment. **The size of any object is always a multiple of the object‘s alignment**" – Andrew Henle Jun 13 '23 at 19:49
  • @AndrewHenle Regarding "AB ... It's not 'standard C'": please see the definition of `struct AB` above. It's a regular C struct, not a packed one. I understand that this feature would violate the current ABI (it would be an extension). It would still have exactly the same performance impact as converting all uses of `B` to `AB` and handling any `*A = foo` using `sizeof(A)==6`, something rather annoying to do w/out compiler support. – vitiral Jun 13 '23 at 20:47

2 Answers


If I correctly understand what you're wishing for, it's impossible. It's not merely a compiler or ABI restriction; it would actually be inconsistent with the following fundamental principles of the C language.

1. In an array of type T, successive elements are at intervals of sizeof(T) bytes.

This guarantee is what allows you to correctly implement "generic" array processing functions like qsort. If for instance we want a function that copies element 3 of an array to element 4, then the language promises that the following must work:

void copy_3_to_4(void *arr, size_t elem_size) {
    unsigned char *c_arr = arr; // convenience to minimize casting
    for (size_t i = 0; i < elem_size; i++) {
        c_arr[4*elem_size + i] = c_arr[3*elem_size+i];
    }
}

struct foo { ... };
struct foo my_array[100];
copy_3_to_4(my_array, sizeof(struct foo)); // equivalent to my_array[4] = my_array[3]

From this it follows that if an object T has a required alignment of k bytes, then sizeof(T) must necessarily be a multiple of k. Otherwise, the elements of a large enough array could not all be correctly aligned. So your proposed notion of an object of size 6 and alignment 4 cannot be consistent with this principle.

So for the struct A in your example, with a uint32_t and a uint16_t member: if we suppose that, as on most common platforms, uint32_t requires 4-byte alignment, then struct A requires the same, and so sizeof(struct A) can't be 6; it has to be 8. (Or, in principle, 12, 16, etc, but that would be weird.) The 2 bytes of padding is unavoidable.

2. Distinct objects cannot overlap.

And here "overlap" is defined in terms of sizeof. The sizeof(T) bytes starting at address &foo cannot coincide with any of the corresponding bytes of any other object bar. This includes any padding bytes that either object may contain. And distinct members of a struct (other than bitfields) are distinct objects for this purpose.

For a struct, this means that code which modifies a struct is allowed to freely modify its padding bytes, if the compiler finds it convenient to do so. With your struct A and struct B examples, we could imagine:

void copy(struct A *dst, const struct A *src) {
    *dst = *src;
}

The compiler is allowed to compile this into a single 64-bit load/store pair, which copies not only the 6 bytes of actual data but also the 2 bytes of padding. If it couldn't do that, it would have to compile it as a 32-bit copy plus a 16-bit copy, which would be less efficient.

Perhaps an even better example is that you are also allowed to copy a struct A by doing memcpy(&y, &x, sizeof(struct A)), which will more obviously copy 8 bytes, or a byte-by-byte copy of sizeof(struct A) bytes as in copy_3_to_4 above.

And it is legal to do:

struct A foo = { 42 };
struct B bar;
bar.b1 = 17;
copy(&bar.b0, &foo);
assert(bar.b1 == 17); // should be unchanged

If you wanted to have sizeof(struct B) == 8, then the b1 member would have to exist within the padding of the b0 member. So if copy(&bar.b0, &foo) does a 64-bit copy then it would overwrite it. We can't require that copy handle this case specially, because it could be compiled in an entirely separate file, and has no way of knowing whether its argument exists within some larger object. And we also can't tell the programmer they can't do copy(&bar.b0, &foo); the object bar.b0 is a bona fide object of type struct A and is entitled to all the rights and privileges of any object of that type.

So the only way out of this dilemma is for sizeof(struct B) to be larger than 8. And since its required alignment is still 4 (as inherited from struct A, as inherited from uint32_t), then necessarily sizeof(struct B) must be 12 or more.

Nate Eldredge
  • Thanks for the thorough answer. There would indeed be some minor changes needed. You would need to account for both `sizeof(A)` and `alignof(A)` when doing arrays and allocations. Arrays (and things like qsort) would need to switch to using something like `sizeofindex(A)` (which would just be `align(sizeof(A), alignof(A))`). I think your point #2 is a bit wrong: `sizeof(A)` would return `6`, so the objects would NOT overlap. Your point that it would require two operations to zero it _would_ be a performance hit though, which is one good reason why this shouldn't be the default option. – vitiral Jun 13 '23 at 17:33
  • Sure, you can *imagine* a language where size and alignment are independent concepts, but that language would not be C. Features like `packed` change the ABI, but they don't fundamentally change the language - you only have to recompile your code (and any libraries that it uses). Your proposed feature would require you to not only recompile but actually rewrite substantial amounts of code. So I think it answers your original question as to why your C compiler does not have an option for it. – Nate Eldredge Jun 13 '23 at 20:05
  • @vitiral: It's true that if somehow you could get `sizeof(A) == 6` then point #2 would be moot. But in point #1 I proved that you cannot, without breaking other language guarantees. – Nate Eldredge Jun 13 '23 at 20:08
  • Right, that's exactly what I'm trying to say (along with `alignof(A) == 4`). Sorry I'm so bad at saying it! – vitiral Jun 13 '23 at 20:48

This is neither safe nor recommended; it's just what needs to be done, since C has no support for a struct whose size is not a multiple of its alignment.

Thanks to @KamilCuk for pointing out that you can pack the struct and use alignas on each field. However, I still couldn't quite make it work for my use case.

It looks like the only thing that can be done is to insert the alignment padding manually, as explicit byte arrays.

#include <stdio.h>
#include <stdint.h>

struct __attribute__(( __packed__ )) A {
  uint32_t a0; // 4 bytes
  uint16_t a1; // 2 bytes
};

struct __attribute__(( __packed__ )) B {
  struct A b0; // 6 bytes
  uint16_t b1; // 2 bytes
};

struct __attribute__(( __packed__ )) C { // total size: 14 bytes
  struct A c0;  uint8_t __align0[2]; // manual padding so c1 starts at offset 8
  struct A c1;
};

// Use this inside arrays, otherwise arrays will be mis-aligned.
struct __attribute__(( __packed__ )) C_El { struct C el; uint8_t __align0[2]; };

int main(void) {
  // sizeof yields a size_t, so print it with %zu
  printf("sizeof(A)    = %zu\n", sizeof(struct A));    // 6
  printf("sizeof(B)    = %zu\n", sizeof(struct B));    // 8
  printf("sizeof(C)    = %zu\n", sizeof(struct C));    // 14

  // Distance in bytes between consecutive elements (a ptrdiff_t, hence %td).
  struct C arrC[4];
  printf("C index pointers: %d, %td\n", 0, (char *)&arrC[1] - (char *)&arrC[0]);

  struct C_El arrC_El[4];
  printf("C_El index pointers: %d, %td\n", 0, (char *)&arrC_El[1] - (char *)&arrC_El[0]);
  return 0;
}

I wish there was an __attribute__(( __packed_aligned__ )) or similar for the whole struct (and sub-structs) to preserve alignment but not waste bytes. Unfortunately that really does seem to be missing.

vitiral
  • As I commented above, if you are imagining a `packed_aligned` that would let you specify an alignment that is not a divisor of the size, then that is fundamentally at odds with basic principles of the language. It would make `qsort` impossible, for instance. – Nate Eldredge Jun 12 '23 at 21:05
  • qsort is only impossible if you don't take the alignof into consideration when calculating index size. This would break some current assumptions, but is a far cry from "impossible" IMO. See my response to your answer and thanks for filling in my knowledge. – vitiral Jun 13 '23 at 17:43
  • I mean it would be impossible to implement with its current API, which only passes the size and not the alignment. So this would be a serious breaking change - you'd not only have to recompile all existing code, you'd also have to rewrite a lot of it. – Nate Eldredge Jun 13 '23 at 20:01
  • Well, you would only have to worry about it for structs which you added `__attribute__(( efficient ))` or whatever. So you wouldn't break any code by adding the feature, but would have to be careful when using structs with this attribute added (and/or add lints/etc to check for those bugs). – vitiral Jun 13 '23 at 20:42