
A struct with bit-fields, even when "packed", seems to base a bit-field's size (and alignment, too?) on the declared integer type. Could someone point to a C++ rule that defines that behavior? I tried with a dozen compilers and architectures (thank you, Compiler Explorer!) and the result was consistent across all of them.

Here's the code to play with: https://godbolt.org/z/31zMcnboY

#include <cstdint>

#pragma pack(push, 1)
struct S1{ uint8_t  v: 1; }; // sizeof == 1
struct S2{ uint16_t v: 1; }; // sizeof == 2
struct S3{ uint32_t v: 1; }; // sizeof == 4
struct S4{ unsigned v: 1; }; // sizeof == 4
#pragma pack(pop)

auto f(auto s){ return sizeof(s); }

int main(){
    f(S1{});
    f(S2{});
    f(S3{});
    f(S4{});
}

The resulting ASM clearly shows the sizes returned by f() as 1, 2, and 4 for S1, S2, and S3 respectively.

eerorika
YePhIcK
  • As far as I understand it, bit-fields just let adjacent members share bits of a common representation. I don't think a bit-field can practically be smaller than its underlying type. `uint16_t v: 1;` is still a `uint16_t` and should be willing to share 15 of its unused bits with another `uint16_t` bit-field member, but there is no such other member to share them with. But I think how (and even if) bit-fields are implemented is entirely up to the compiler, so there is probably not a hard C++ rule that requires this. – François Andrieux Apr 28 '22 at 17:37
  • My understanding is that you are reserving a quantity of bits depending on the type. For `uint8_t`, you are reserving 8 bits, whether you use 1 or all 8. Likewise with `uint16_t`, you are reserving 16 bits. If you have consecutive bit-fields of the same type, the compiler will do its best at compacting the bits within the space of the declared type. – Thomas Matthews Apr 28 '22 at 17:46
  • @FrançoisAndrieux my underlying question was related to bit sharing between `bool :1`, `uint8_t :2`, `uint16_t :11` which, all combined, should have fit into 16 bits but, alas, the size of that struct was 3 bytes, not 2 – YePhIcK Apr 28 '22 at 17:52
  • @YePhIcK A `uint16_t` likely has an alignment requirement of 2, so has to start on a 16 bit boundary. The 5 bits after the `uint8_t` would then likely be lost. – François Andrieux Apr 28 '22 at 17:56
  • The C++ rule is that it's up to the implementation to decide what to do, and it's up to the implementation whether it documents what it does. – Pete Becker Apr 28 '22 at 17:59
  • @PeteBecker `and it's up to the implementation whether it documents what it does` The standard requires that the implementation document behaviour that is specified as implementation-defined. It's up to the implementation to decide *how* to document it, though. – eerorika Apr 28 '22 at 18:13
  • @eerorika -- having just looked it up (don't rely on memory!), allocation and alignment are implementation-defined. Seems to me they didn't used to be, but I wouldn't swear to it. – Pete Becker Apr 28 '22 at 18:20
  • @PeteBecker I just checked the C++98 and C89 drafts; the wording is different in C, but the implementation-definedness is already there. False memory, perhaps. – eerorika Apr 28 '22 at 18:32

2 Answers


Could someone point to a C++ rule that defines that behavior?

Nothing about #pragma pack(push, 1) is specified by the standard (other than #pragma being specified as a pre-processor directive with implementation-defined meaning). It is a language extension.

This is what the standard specifies regarding bit fields:

[class.bit]

... A bit-field shall have integral or (possibly cv-qualified) enumeration type; the bit-field semantic property is not part of the type of the class member. ... Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.

It's essentially entirely implementation defined or unspecified.

eerorika
  • so the "consistency" of all the existing implementations is, in essence, more or less just a coincidence? – YePhIcK Apr 28 '22 at 17:54
  • @YePhIcK It may be that you tested each compiler on the same underlying architecture like `x86_64`, and whatever strategies these compilers use happen to be the best one for that architecture. – François Andrieux Apr 28 '22 at 17:55
  • @FrançoisAndrieux I tried ARM7/8, x86, x64, RISC, and some others – all showed consistent behavior – YePhIcK Apr 28 '22 at 17:57

The minimum size of a bit-field sequence is that of its underlying type. Multiple adjacent bit-fields with the same underlying type are packed into the minimum number of words of that type, without splitting any bit-field across a word boundary. A zero-width bit-field indicates an explicit break, and the following fields start in the next word. Mixed underlying types result in breaks at the point of divergence. Bits are not the minimum addressable unit on most machines; the size of a data type is measured in octets, the smallest addressable memory unit.

Red.Wave
  • I think the question is why *"Minimum size of a bit-field sequence is that of its underlying type"* has to be true. – François Andrieux Apr 28 '22 at 17:58
  • @Red.Wave all you said is true (from my understanding, experience, and observation). My question is to find a specific rule in the standard that describes that behavior – YePhIcK Apr 28 '22 at 17:59
  • None of this is required by the language definition. This may well describe the behavior of some implementation, or even of many implementations, but it's entirely up to the compiler writers. – Pete Becker Apr 28 '22 at 18:11