
The page shown is from "Programming Embedded Systems" by Michael Barr, Anthony Massa, published by O'Reilly, ISBN: 0-596-00983-6, page 204.

[Image: scanned excerpt of the book page]

I am asking for more details and explanations on this, like:

  1. Does this mean that the bit fields are going to be portable across all compilers?

  2. For (different) architectures, does this work for bit fields larger than one byte (considering the endianness difference, which I don't think this method will overcome)?

  3. For the (same) architecture, does this work for bit fields larger than one byte?

  4. If they are standardised across all compilers as the book says, can we specify how they are going to be aligned?

  5. Regarding Q1 and Q2: if the bit fields are just one byte, the endianness problem won't affect them, right? So will the bit fields be portable across all compilers on architectures of different endianness?

Ali Mak
  • That's a nonsense approach. If you want to access single bits in a hardware register, use bit operations like masking and shifting. A good compiler will use the same code it uses for bit-fields if you use the idiomatic syntax. This is in no way portable. There is not even a guarantee that `uint8_t` bit-fields are allowed. – too honest for this site May 04 '17 at 13:49
  • Almost everything about the layout of bit fields is 'implementation defined'. Regarding Q5: although you don't have to worry about the endianness of single bytes — endianness is a phenomenon of how multiple bytes are laid out in memory — you still aren't guaranteed whether the first bit cited in the bit field structure will be in bit 0 or bit 7 of the byte: both are feasible, and maybe some other options too, as long as the compiler documents it. Take-away: ***bit fields are not portable!*** Any claim to the contrary is incorrect. And unionizing bit fields doesn't make them more portable. – Jonathan Leffler May 04 '17 at 14:26
  • @JonathanLeffler As one example, the alignment of a so called "storage unit" is explicitly unspecified behavior. Meaning that the compiler _does not_ have to document it and we can't know anything about it. This alone makes bit-fields non-portable, by language design. – Lundin May 04 '17 at 14:43
  • The bitfields don't have to appear in the same order within the byte as you declared them, so that puts your whole idea to bed – M.M May 05 '17 at 00:03

4 Answers


does this mean that the bit fields are going to be portable across all compilers

No, the cited text about unions somehow making bit-fields portable is strange. `union` does not add any further portability guarantees whatsoever. There are many aspects of bit-fields that make them completely non-portable, because they are very poorly specified by the standard. Some examples here.

For example, using uint8_t or a char type for a bit-field is not covered by the standard. The book fails to mention this even though it makes such a non-standard example.
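As a hedged illustration (not the book's code; the struct and member names are invented), this is the kind of `uint8_t` bit-field the text relies on. C11 6.7.2.1 only requires support for bit-fields of type `_Bool`, `signed int` and `unsigned int`, so whether this even compiles is implementation-defined:

```c
#include <stdint.h>

/* Illustrative only. Using uint8_t as the declared bit-field type relies on
   an implementation-defined extension that a given compiler may or may not
   accept; only _Bool, signed int and unsigned int are guaranteed. */
struct status_reg {
    uint8_t ready : 1;
    uint8_t error : 1;
    uint8_t mode  : 3;
    uint8_t       : 3;   /* unnamed padding bits */
};
```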


for different architectures, does this work for bit fields larger than one byte

for the same architecture, does this work for bit fields larger than one byte

No, there are no guarantees at all.

if they are standardised across all compilers as the book says, can we specify how they are going to be aligned?

They aren't; the book is misleading. My advice is to stop reading at "bit-fields are not portable", then forget that you ever heard about bit-fields. They are a 100% superfluous feature anyway. Instead, use bitwise operators.
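A minimal sketch of the bitwise-operator approach, assuming a made-up 8-bit register address and field layout (both invented for this example):

```c
#include <stdint.h>

/* Hypothetical 8-bit status register; address and field positions are
   assumptions for the sake of the example. */
#define STATUS_REG  (*(volatile uint8_t *)0x40001000u)

#define READY_MASK  0x01u               /* bit 0     */
#define ERROR_MASK  0x02u               /* bit 1     */
#define MODE_MASK   0x1Cu               /* bits 2..4 */
#define MODE_SHIFT  2u

static inline void set_mode(uint8_t mode)
{
    uint8_t reg = STATUS_REG;
    reg &= (uint8_t)~MODE_MASK;                          /* clear the field   */
    reg |= (uint8_t)((mode << MODE_SHIFT) & MODE_MASK);  /* insert new value  */
    STATUS_REG = reg;
}

static inline int is_ready(void)
{
    return (STATUS_REG & READY_MASK) != 0;               /* test a single bit */
}
```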

Lundin
  • Mostly, the standard says 'implementation defined'. It's not so much that the behaviour of bit fields is undefined as that the standard says that the implementation must define how it deals with them. The ins and outs are intricate. The problem for the standardizing committee was that there was wildly different behaviours displayed by the compilers at the time when the standard was being created, and those compilers worked, and the committee went out of their way to avoid gratuitously making compiler behaviour invalid. – Jonathan Leffler May 04 '17 at 14:20
  • @JonathanLeffler First of all, implementation-defined behavior isn't really helpful, since there are no restrictions and compilers may do any manner of crazy stuff that they want (which they also do). I am aware that originally the committee had this nonsense restriction saying that they were not allowed to favour any existing technique over another. Which is of course complete BS, because the result is that bit-fields cannot even be reliably used in practice. --> – Lundin May 04 '17 at 14:35
  • But there is also undefined behavior as in things the standard simply forgot to specify. Just yesterday there was [this question](http://stackoverflow.com/questions/43735053/unexpected-behavior-of-bit-field) about why two bit-fields of incompatible types wouldn't merge. The standard doesn't mention what will happen at all. Aka implicit undefined behavior. The compiler is pretty much free to run off into the woods and still be standard compliant. Bit-fields are simply useless, by language design. – Lundin May 04 '17 at 14:37

That is very disturbing text; I think I would toss the whole book.

Please go read the spec. You don't have to pay for it: older editions and draft versions are available (which of course are not the final text), but over the last couple of decades they are more alike than different.

If I could add more upvotes to Lundin's answer I would, but it's not worth creating new users with email addresses just to do that...

I have possibly, or will possibly, spark an argument on this, but... the spec does say that if you define some number of (non-zero-sized) bitfields in a row in a struct or union they will get packed, and there is a special zero-sized one that is used to break that, so that you can declare a bunch of bitfields and group them without having to make some other struct.
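For instance, a sketch of the zero-width break just described (member names invented):

```c
/* The unnamed zero-width bit-field ends the packing: a and b share a storage
   unit (typically), while c is forced to start in the next unit. */
struct grouped {
    unsigned int a : 3;
    unsigned int b : 4;
    unsigned int   : 0;   /* zero width: close the current storage unit */
    unsigned int c : 5;
};
```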

Perhaps it says they will be aligned, but I would never assume that. I know for a fact the same compiler will treat endians differently and pack them at opposite ends (top bit down or bottom bit up). But there is no reason to assume that any compiler follows any convention other than packing, and I would assume, although perhaps this is also subject to interpretation, that they are packed in the order defined once you figure out where they start. So I wouldn't assume that 6 bits' worth of declarations are aligned either way; they could land at more than one possible alignment within a byte, assuming a byte is the size of the unit. If the size of the unit is 32 or 64 bits then I am not going to bother counting the combinations; it is more than one, and that is all that matters. I know for a fact from gcc that when the 32- to 64-bit x86 transition happened, it caused problems for code making assumptions about where those bits landed.

I personally wouldn't even assume that the bits are in the declared order when they are packed together... Popular compilers tend to do that, but the spec says no more than that they are packed; what does that mean? If I have a 1-bit, then an 8-bit, then a 1-bit, then an 8-bit, then a 6-bit field, I would hope the compiler aligns the 8-bit ones on a byte boundary and then moves the two 1-bit fields next to the 6-bit one, if I were ever to use a bitfield, which I don't...

The prime contention here is that, to me, the spec is very clear that the initial items in more than one declaration in a union only use the same memory if the order and size are the same and they are compatible types. A one-bit unsigned int is not the same as a 32-bit unsigned int, so they are NOT compatible types IMO. The spec goes further and states that, for bitfields, the members have to be of the same type and the same width, so for a bitfield to share the same memory in a union you need two structures whose initial bitfield members have the same type and width, and only those members are, per the spec, going to share memory; what happens with the rest of the bits is a different story. So, from my reading of the spec, your example gives no reason to expect that the 8-bit char (using a made-up, non-spec declaration) and the 8 declared bits of bitfield line up with each other and share the same memory. Just because a compiler chooses to do that in some version does not mean you can assume it; the union in particular does not make that code portable, or more portable, in any way. In fact it is perhaps worse, because now you not only have a bitfield issue across compilers or versions, you also have union issues across compilers or versions.
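A sketch of what the common-initial-sequence rule does cover, assuming the C11 wording (6.5.2.3p6); the struct and member names are invented:

```c
#include <stdio.h>

/* The shared leading members may be inspected through either struct in the
   union, but only because they match in type AND bit-field width. An 8-bit
   char member and an 8-bit-wide bit-field do not form a common initial
   sequence. */
struct a { unsigned int flags : 4; unsigned int width : 4; int  count; };
struct b { unsigned int flags : 4; unsigned int width : 4; long total; };

union either {
    struct a a;
    struct b b;
};

int main(void)
{
    union either u;
    u.a.flags = 0x5u;
    /* Reading u.b.flags is covered by the rule; reading u.b.total is not. */
    printf("%u\n", (unsigned)u.b.flags);
    return 0;
}
```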

As a general rule, NEVER use a structure across a compile domain, with or without bitfields (this includes unions). So never read a file into a structure (you have crossed a compile domain), never point structures at hardware registers (you have crossed a compile domain), never point structures at memory, and don't point a structure at a char array that contains an Ethernet packet and use the struct and/or bitfields to pick apart the IP header. Yes, these rules are widely used and abused, and they are a ticking time bomb. The primary reason the time bomb only goes off rarely is that the code keeps using the same compiler, and/or a couple of very popular compilers that currently have the same implementation. But struct pointing in general fails very, very often; bitfield failures are just a side effect of that, and, perhaps because of the horrible text in your book, unions are starting to show up a lot, making the time bomb nuclear instead of conventional.
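A sketch of the alternative being recommended here: pull fields out of the raw byte buffer with shifts instead of overlaying a struct or bitfield on it, using two IPv4 header fields as the example (the function names are invented):

```c
#include <stdint.h>

/* Extract IPv4 header fields from a raw byte buffer. Network byte order is
   defined byte by byte, so this works the same on any host endianness and
   with any compiler. */
unsigned ipv4_version(const unsigned char *hdr)
{
    return (hdr[0] >> 4) & 0x0Fu;               /* high nibble of byte 0   */
}

uint16_t ipv4_total_length(const unsigned char *hdr)
{
    return (uint16_t)((hdr[2] << 8) | hdr[3]);  /* bytes 2..3, big endian  */
}
```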

So if you want to use a struct or a union or a bitfield and have the code actually work without maintenance, then stay within the same compile domain (one program compiled at the same time with the same compiler and settings), pass structures defined as structures across functions, and do not point them at memory or other arrays, etc. For unions, never access across individually defined items: if you use a single variable, use only that variable until you are completely finished with it, and assume it is trash once you use a struct or other variable in that union. With bitfields, each variable is a standalone item independent of the other variables next to it; you are just ATTEMPTING to save memory by using them, but you are actually wasting a lot of code overhead, performance and code space by using them in the first place. Keep to that and your code is far more likely to work without maintenance, across compilers. Now if you want to use this as job security and have your code fail to build or function on every minor or major release of a compiler, then do those things above: point structs across a compile domain, point bitfields at hardware registers, etc. Other than your boss noting that you write horrible code that breaks often when some other employees' code doesn't, you will have to keep maintaining that code on a regular basis for the life of the product.

All the compiler does with your bitfield is generate masks and shifts. If you write those masks and shifts yourself, it is MASSIVELY more portable. You may still have endian issues (which can actually, at times, be easily solved in portable, endian-independent code), but you won't be completely pointing at the wrong thing the way a bitfield can; masking and shifting simply works. It does not produce more code, it does not produce slower code, and if you really need to, make macros for everything: using a macro to isolate a field within a "unit" is far more portable than using a bitfield.
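For instance, such field macros might look like this (a sketch; the names are invented):

```c
/* Generic field helpers: the compiler reduces these to the same AND/OR and
   shift instructions it would emit for a bit-field. */
#define FIELD_GET(reg, mask, shift)       (((reg) & (mask)) >> (shift))
#define FIELD_SET(reg, mask, shift, val)  (((reg) & ~(mask)) | (((val) << (shift)) & (mask)))

/* Example field: 3 bits occupying bits 2..4 of an 8-bit unit. */
#define MODE_MASK   0x1Cu
#define MODE_SHIFT  2u

/* Usage:
       unsigned mode = FIELD_GET(reg, MODE_MASK, MODE_SHIFT);
       reg = FIELD_SET(reg, MODE_MASK, MODE_SHIFT, 5u);       */
```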

Forget you ever read about bitfields or heard of them; never ever use them again. The need for them died decades ago. Unions somewhat fall into the same category as well: they do actually save memory, but you have to be careful to share that memory properly.

And I would toss that book as well; if the authors don't understand this simple topic, what else do they not understand? As with most folks, this may be a case of confusing a popular compiler's interpretation with reality.

old_timer

There is a little 'bit' of confusion between the concepts of bit-fields and endianness. Suppose you have a 32-bit MCU; that means the device's internal memory is organised in units of 32 bits. As you might know, each MCU stores memory either LSB-first or MSB-first, which is little endian and big endian respectively (see the endianness figure for illustrations). As can be seen, the same data, 0x12345678 (a 32-bit value), is stored differently internally. When you are reading and writing memory through a 32-bit pointer (the trivial case), the MCU handles it for you and the endianness makes no difference. The problem arises when you manipulate the data byte by byte, or when exporting to (or importing from) another MCU or a memory peripheral that also uses 8-bit / 1-byte accesses.
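A small sketch that makes the storage difference visible on whatever host runs it:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t value = 0x12345678u;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);     /* look at the raw storage */

    /* Little endian prints "78 56 34 12", big endian prints "12 34 56 78". */
    for (size_t i = 0; i < sizeof value; i++)
        printf("%02X ", bytes[i]);
    putchar('\n');
    return 0;
}
```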

Bit fields will be aligned to byte, word, or long-word types (as seen in the figure), so they can be misinterpreted when porting to another target.

Hence, the answers to your questions:

  1. If it is only one byte that you are dividing into bits, it will port nicely. If you define a multi-byte union, it will get you into trouble.

  2. Answered in the introduction of this answer

  3. See answer no. 1
  4. See the figure I have attached for illustrations
  5. Right in general
Itzik Chaimov
  • Please provide a reference to the standard guaranteeing a specific layout for bitfields in the underlying type. Or where it even guarantees types other than `unsigned`, `int` or `_Bool`. – too honest for this site May 04 '17 at 13:52
  • It is not so simple as just dismissing this as an endianess issue. The _bit order_ of a bit field is not specified either. Nor is the bit alignment. Nor is the byte alignment/padding. Nor are the use of any other types than `int` or `_Bool`. Nor signedness (not even for `int`). Nor are the "overlapping of storage units" (as C calls them). And so on and so on. Bit-fields are broken by language design. – Lundin May 04 '17 at 14:20
  • Byte is the basic type of data structure. – Itzik Chaimov May 04 '17 at 15:04
  • Byte is the basic type of data structure. Hence, when specifying 0xF0 we agree that it is 240 in decimal. So when defining a union (or struct) with, say, two nibbles (nibbles are 4-bit data structures), then if assigning the first nibble to 0x0 and the second to 0xF, the value of this byte must be 0xF0 - so bit fields are definitions rather than language design. (By the same logic you would assign the 8 bits as 11110000, for which the zeros are from bit0 to bit3 and the F's from bit4 to bit7.) – Itzik Chaimov May 04 '17 at 15:09
  • My intention was to help Ali Mak with the confusion he has regarding endianness and bit-fields – Itzik Chaimov May 04 '17 at 15:15
  • @ItzikChaimov: We are not a tutoring site, but a Q&A site. An answer should answer the question in the first place. You can add additional information, but that should be related to the question. – too honest for this site May 04 '17 at 18:34
  • Olaf - answering by example is a great help. See my answers to the 5 questions asked by the OP. Do you agree that I did answer his question? – Itzik Chaimov May 04 '17 at 18:49

1, 2: Not quite: it always depends on the platform (endianness) and the types you are using.

3: Yes, they will always land on the same spot in memory.

4: Which alignment do you mean — the memory alignment or the field alignment?

maxbit89
  • I mean how the bits are going to be arranged in the byte in memory, like from left to right or right to left? About Q1 and Q2: if the bit fields are just one byte, the endianness problem won't affect them, right? So will the bit fields be portable across all compilers on architectures of different endianness? – Ali Mak May 04 '17 at 12:31
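One way to see where a particular compiler puts the first-declared bit within the byte is to set it and dump the raw storage, as in this sketch (struct and member names invented); the result is implementation-defined, which is exactly why the layout is not portable:

```c
#include <stdio.h>
#include <string.h>

struct one_bit {
    unsigned int first : 1;
    unsigned int rest  : 7;
};

int main(void)
{
    struct one_bit s;
    unsigned char raw[sizeof s];

    memset(&s, 0, sizeof s);
    s.first = 1;
    memcpy(raw, &s, sizeof s);          /* inspect the raw bytes */

    for (size_t i = 0; i < sizeof s; i++)
        printf("%02X ", raw[i]);        /* e.g. "01 00 00 00" or "80 00 00 00" */
    putchar('\n');
    return 0;
}
```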