Sizes of bit fields and unions in C

Question

I have the following code:

#pragma pack(push, 1)
typedef struct __attribute__((packed)){
    uint64_t msg: 48;
    uint16_t crc: 12;
    int : 0;
} data_s;
#pragma pack(pop)

typedef union {
    uint64_t tot;
    data_s split;
} data_t;

int main() {
    data_t data;
    printf(
        "Sizes are: union:%d,struct:%d,uint64_t:%d\n",
        sizeof(data),
        sizeof(data.split),
        sizeof(data.tot)
    );
    return 0;
}

The output I get is Sizes are: union:16,struct:10,uint64_t:8.

Here I have two issues:

1. Even though I'm using bit fields and trying to pack it, I am getting 10 bytes even though the number of bits is less than 64 (48+12=60) and can be packed into 8 bytes.
2. Even though the maximum size of the two members of the union is 10, why is its size 16?

Also how do I pack the bits into 8 bytes?

Related see: https://stackoverflow.com/questions/15136426/memory-layout-of-struct-having-bitfields — Jonny Schubert, Jan 25 '18 at 14:07
Why are you using int : 0; at the end? This notation causes the next bit field to start on the next allocation boundary. — sameer chaudhari, Jan 25 '18 at 14:16
Yes. I wanted the two members of the union to have equal lengths — Prateek Dhanuka, Jan 25 '18 at 14:20
Does putting int : 0 at the end guarantee that the two members are of equal length? I think it should be between the msg and crc members @PrateekDhanuka. — sameer chaudhari, Jan 25 '18 at 14:40
you're getting UB. [To print `size_t` you must use `%zu`](https://stackoverflow.com/q/940087/995714), not `%d` — phuclv, Feb 20 '18 at 05:12

score 1 · Answer 1 · answered Jan 25 '18 at 14:07

This is implementation defined; how bits are laid out depends on your compiler.

Many compilers split bitfields if they are different types. You could try changing type of crc to uint64_t to see if it makes a difference.

If you want to write portable code and layout is important, then don't use bitfields at all.

score 1 · Accepted Answer · answered Jan 25 '18 at 14:10

You are allocating an integral type and then telling the compiler how many bits of it to use.

Then you allocate another integral type and again tell it how many bits to use.

The compiler places these in their respective integrals. To have them in a single integral field, use commas to separate them, e.g.:

uint64_t msg: 48, crc: 12;

(But note the implementation defined aspect user694733 mentions)
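A minimal sketch for trying this out, assuming a compiler that accepts uint64_t and uint16_t as bit-field types (GCC and Clang do, but the standard does not require it). The names mixed_s, same_s, report_sizes and make_same are made up for illustration. The sizes printed are implementation-defined, so no particular output is promised.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit-fields declared with two different integer types, as in the question. */
typedef struct {
    uint64_t msg : 48;
    uint16_t crc : 12;
} mixed_s;

/* Both bit-fields declared with the same type, as suggested above. */
typedef struct {
    uint64_t msg : 48;
    uint64_t crc : 12;
} same_s;

/* Print both sizes; sizeof yields a size_t, so the format must be %zu. */
static void report_sizes(void)
{
    printf("mixed types: %zu bytes, same type: %zu bytes\n",
           sizeof(mixed_s), sizeof(same_s));
}

/* Fill the same-type struct after checking that the values fit. */
static same_s make_same(uint64_t msg, uint64_t crc)
{
    same_s d;
    assert(msg <= UINT64_C(0xFFFFFFFFFFFF)); /* 48 bits */
    assert(crc <= 0xFFFu);                   /* 12 bits */
    d.msg = msg;
    d.crc = crc;
    return d;
}
```

Calling report_sizes() on the compiler in question shows whether the declared type changes the layout there.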

Paul Ogilvie

@PrateekDhanuka It's not really about the comma, but rather about using the same integral type for both. – HolyBlackCat Jan 25 '18 at 14:35
The standard provides absolutely no reason to expect that a compiler would treat the members differently when they are declared as you suggest instead of as the OP presented. In particular, the standard assigns no significance whatever for structure layout to the type specifier, and it insists that adjacent bitfields be packed into the same addressable storage unit if that unit is large enough. Since the observed behavior is non-conforming, we can only guess about what's going on, but it is far more likely that variance in the bitfields' declared types is significant. – John Bollinger Jan 25 '18 at 14:36
Yes I understood that! Thanks for your help, everybody! – Prateek Dhanuka Jan 25 '18 at 14:45

score 1 · Answer 3 · answered Jan 25 '18 at 16:33

These are bit-fields. They are very poorly covered by standardization. If you use them - you are on your own.

#pragma pack(push, 1) typedef struct __attribute__((packed)){
These are non-standard compiler extensions of the gcc compiler. What happens when you add them is not covered by any standard. The only thing the standard says is that if a compiler doesn't recognize the #pragma, it must ignore that line.
The C standard only guarantees that the types _Bool, unsigned int and signed int are valid for bit-fields. You use uint64_t and uint16_t. What happens when you do is not covered by the C standard - this is implementation-defined behavior. The standard speaks of "units", but it is not specified how large a "unit" is.
msg: 48; The C standard does not specify if this is the least significant 48 bits or the most significant ones. It does not specify order of allocation, it does not specify alignment. Add endianness on top of that, and you can't really know what this code does.
All the C standard guarantees is that msg resides on a lower address than trailing struct members. Unless they are merged into the same bit-field - then the standard guarantees nothing. Completely implementation-defined.
int : 0; is useless to add at the end of a bit-field; the only purpose of this code is to tell the compiler not to merge any trailing bit-field into the previous one.
#pragma pack and similar don't, as far as I know, guarantee that there is no trailing padding at the end of the struct/union.
gcc is known to behave strangely together with bit-fields. It has this in common with every single C compiler ever written.
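To illustrate the point about int : 0; here is a small sketch with three hypothetical structs (merged, separated, trailing) that differ only in where the zero-width field sits. The exact sizes are implementation-defined, so the code only prints them.

```c
#include <stdio.h>

/* Two 4-bit fields that the compiler may place in one storage unit. */
struct merged {
    unsigned int a : 4;
    unsigned int b : 4;
};

/* The unnamed zero-width field makes b start in a new storage unit. */
struct separated {
    unsigned int a : 4;
    unsigned int   : 0;
    unsigned int b : 4;
};

/* At the end there is no following bit-field left to separate. */
struct trailing {
    unsigned int a : 4;
    unsigned int b : 4;
    unsigned int   : 0;
};

/* sizeof yields a size_t, hence %zu. */
static void report_zero_width(void)
{
    printf("merged: %zu, separated: %zu, trailing: %zu\n",
           sizeof(struct merged), sizeof(struct separated),
           sizeof(struct trailing));
}
```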

The answer to your questions can thus be summarized as: because bit-fields.

An alternative approach which will be 100% deterministic, well-defined, portable and safe is something like this:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

typedef uint64_t data_t;

static inline uint64_t msg (data_t* data)
{
  return *data >> 12;
}

static inline uint64_t crc (data_t* data)
{
  return *data & 0xFFFu;
}

int main() {
    data_t data = 0xFFFFFAAAu;

    printf("msg: %"PRIX64" crc:%"PRIX64, msg(&data), crc(&data));
    return 0;
}

This is even portable across CPUs of different endianness.
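The code above only reads the fields. A possible counterpart for writing them, as a sketch: data_pack and data_to_bytes are hypothetical helpers, not part of the answer's code, and they assume the same layout (msg in bits 12..59, crc in bits 0..11). The most-significant-byte-first order in data_to_bytes is likewise an assumption about what the receiver expects.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t data_t;

/* Build the 60-bit word: msg in bits 12..59, crc in bits 0..11. */
static inline data_t data_pack(uint64_t msg, uint16_t crc)
{
    assert(msg <= UINT64_C(0xFFFFFFFFFFFF)); /* msg must fit in 48 bits */
    assert(crc <= 0xFFFu);                   /* crc must fit in 12 bits */
    return (msg << 12) | crc;
}

/* Serialize most significant byte first, whatever the host endianness. */
static inline void data_to_bytes(data_t data, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(data >> (56 - 8 * i));
    }
}
```

Because only shifts and masks on a uint64_t value are involved, the result does not depend on how the compiler lays out bit-fields.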

score 0 · Answer 4 · answered Jan 25 '18 at 14:09

For your question 2: a union is at least as large as its largest member. Here the struct split has size 10, and the compiler then pads the union up to a multiple of its alignment requirement (most likely 8, because of the uint64_t member), which takes it from 10 to 16.

edited Jan 25 '18 at 14:13

answered Jan 25 '18 at 14:09

Antonin GAVREL

score 0 · Answer 5 · answered Jan 25 '18 at 14:32

Even though I'm using bit fields and trying to pack it, I am getting 10 bytes even though the number of bits is less than 64(48+12=60) and can be packed into 8 bytes.

Note in the first place that #pragma pack, like most #pragmas, is an extension with implementation-defined behavior. The C language does not define its behavior.

In the second place, C affords implementations considerable freedom with respect to how they lay out the contents of a structure, especially with respect to bitfields. In fact, supposing that uint64_t is a different type from unsigned int in your implementation, whether you can even have a bitfield of the former type in the first place is implementation-defined.

C does not leave it completely open, however. Here's the key part of the specification for bitfield layout within a structure:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

(C2011, 6.7.2.1/11; emphasis added)

Note well that C does not say that the declared type of a bitfield member has anything to do with the size of the addressable storage unit in which its bits are stored (neither there nor anywhere else), though in fact some compilers do implement such behavior. On the other hand, what it does say certainly leads me to expect that if C accepts a 48-bit bitfield in the first place then an immediately-following 12-bit bitfield should be stored in the same unit. Implementation-defined packing specifications don't even enter the picture. Thus, your implementation seems to be non-conforming in this regard.

Even though the maximum size of the two members of the union is 10, why is its size 16?

Unions can have trailing padding, just like structures can. Padding will have been introduced into the union's layout to support the compiler's idea of optimal alignment for objects of that type and their members. In particular, it is likely that your structure has at least an 8-byte alignment requirement, so the union is padded to a size that is a multiple of that alignment requirement. This is again implementation-defined, and as long as we're there, it's possible that you could avoid the padding by instructing the compiler to pack the union, too.
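The padding arithmetic can be sketched as follows. padded_u, round_up and report_union are hypothetical names, and the 10-byte array merely stands in for the 10-byte struct from the question. The alignment value is whatever the compiler reports, so the code prints it rather than assuming 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A 10-byte member next to a uint64_t, mirroring the sizes in the question. */
typedef union {
    uint64_t      tot;
    unsigned char raw[10];
} padded_u;

/* Round size up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}

/* Show the largest member, the union's alignment and its padded size. */
static void report_union(void)
{
    padded_u u;
    printf("largest member: %zu, alignment: %zu, sizeof(union): %zu\n",
           sizeof u.raw, _Alignof(padded_u), sizeof u);
}
```

With an alignment of 8, round_up(10, 8) gives 16, which matches the size the question reports for the union.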

Also how do I pack the bits into 8 bytes?

It may be that you can't, but you should check your compiler's documentation. Since the observed behavior appears to be non-conforming, it's anyone's guess what you can or must do.

If it were me, though, the first thing I would try is to remove the pragma and attribute, remove the zero-length bitfield, and change the declared type of the crc bitfield to match that of the preceding bitfield (uint64_t). The point of all of these is to clear away details that conceivably might confuse the compiler, so as to try to get it to render the behavior that the standard demands in the first place. If you can get the struct to come out as 8 bytes, then you probably don't need to do anything more to get the union to come out the same size (on account of 8 being a somewhat magic number).
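A sketch of that first attempt, reusing the question's type names. It assumes the compiler accepts uint64_t bit-fields. The static_assert lines are an addition for illustration: they stop the build on any compiler that does not produce the 8-byte layout, rather than promising that every compiler will.

```c
#include <assert.h>
#include <stdint.h>

/* No pragma, no attribute, no zero-width field; both fields share one type. */
typedef struct {
    uint64_t msg : 48;
    uint64_t crc : 12;
} data_s;

typedef union {
    uint64_t tot;
    data_s   split;
} data_t;

/* Fail at compile time if the layout is not the 8 bytes being relied on. */
static_assert(sizeof(data_s) == sizeof(uint64_t), "data_s is not 8 bytes");
static_assert(sizeof(data_t) == sizeof(uint64_t), "data_t is not 8 bytes");
```

Which bits of tot correspond to msg and crc is still implementation-defined, so this only settles the size question.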

"Thus, your implementation seems to be non-conforming in this regard." Not really, because the C standard does not guarantee the behavior of other bit-field types than `int` and `_Bool`. What happens when you use uint64_t or uint16_t is anyone's guess: this is implementation-defined. — Lundin, Jan 25 '18 at 14:34
I disagree, @Lundin. Although it is implementation-defined whether a bitfield of type `uint64_t` is supported, if it is then there's no justification for supposing that such bitfields are excused from the general specifications for bitfields. I have quoted the relevant section of the standard in my answer. A conforming implementation must choose an ASU for the first bitfield large enough to accommodate all its bits. If there's enough space left over for the second, then a conforming implementation must put the bits of the second into the same ASU as the first. — John Bollinger, Jan 25 '18 at 14:48
The C standard speaks of "storage units". What is a "unit"? How large is it? Nobody knows. — Lundin, Jan 25 '18 at 14:51
And so I say the implementation *seems* to be non-conforming, @Lundin. For a conforming implementation to *not* pack the two bitfields into the same ASU, and also to have unused bits in the first ASU, it must choose an ASU larger than 48 bits and smaller than 60. I think we can be confident that the OP's machine is not addressable at sub-8-bit granularity, so the only possible choice would be 56 bits. From the sizes, that does not appear to be what the OP's implementation is actually doing. — John Bollinger, Jan 25 '18 at 15:03
Well, I happen to know that gcc behaves very poorly unless you use the same type for all bit fields. Essentially it always treats different types as different bit-fields, making this code behave just as if one had written `uint64_t msg: 48; int : 0; uint16_t crc: 12;`. It is hard to argue about compliance to the standard here, since the standard is complete trash when it comes to bit-fields. Suppose gcc says that "ok here you go, have 16 bytes as your addressable storage unit. I'll allocate your bit-fields inside it, in an implementation-defined order". Fully standard compliant. — Lundin, Jan 25 '18 at 15:16
@Lundin, the standard does afford implementations quite a bit of freedom in how they lay out bitfields and sequences of bitfields, and that does make it hard to argue about conformance in this area. Also complicating the discussion are other implementation freedoms, such as adding padding, and the extensions in play. But I don't see a plausible and internally consistent way for gcc to do as you describe while conforming to both the standard and its own documentation. Nevertheless, that doesn't lower my regard for gcc very much. — John Bollinger, Jan 25 '18 at 15:30
