
I was reading "The C Programming Language" by Kernighan and Ritchie. At the end of the section on bit-fields, the authors say:

"Fields are assigned left to right on some machines and right to left on others. This means that although fields are useful for maintaining internally-defined data structures, the question of which end comes first has to be carefully considered when picking apart externally-defined data; programs that depend on such things are not portable."

- The C Programming Language [2e] by Kernighan & Ritchie [Section 6.9, p.150]

I do not quite get the meaning of these lines. Can anyone please explain them to me, possibly with a diagram?


PS: I have taken a computer organization and architecture course, so I know how computers deal with bits and bytes. In a computer system, the smallest unit of information is a single bit, which can be either 0 or 1. 8 such bits form a byte. Memories are byte-addressable, which means that each byte in memory has an address associated with it. But usually, processors have word lengths of 2 bytes (very old systems), 4 bytes, 8 bytes... This means that in one memory cycle, the CPU can fetch a word's worth of bytes from main memory and put them in its registers. How these bytes are placed in registers depends on the endianness of the system.

But I do not get what the authors mean by "left to right" or "right to left". The words seem related to endianness, but endianness depends on the CPU, and C compilers have nothing to do with it... The question that comes to my mind is: "left to right" of what? What object are the authors referring to?

Lover of Structure
Abhishek Ghosh
  • Please do not downvote my question without stating any reason or suggested editing. Please do suggest what is wrong with my question – Abhishek Ghosh Jul 26 '21 at 20:08
  • The order of bits in a bitfield is not mandated by the C standard, so the compiler implementation might arrange them as they will. That is pretty much what it is saying. – Eugene Sh. Jul 26 '21 at 20:09
  • @EugeneSh. please could you elaborate your comment with an illustrative answer. It shall be helpful in understanding the concept better – Abhishek Ghosh Jul 26 '21 at 20:10
  • Related: [Order of fields when using a bit field in C](https://stackoverflow.com/questions/19376426/order-of-fields-when-using-a-bit-field-in-c) (possible duplicate). – Weather Vane Jul 26 '21 at 20:12
  • @user3386109 yes i do – Abhishek Ghosh Jul 26 '21 at 20:18
  • @user3386109 in a computer system, the smallest unit of information is a single bit which can be either 0 or 1. 8 such bits form a byte. Memories are byte addressable, Which means that each byte in the memory has an address associated with it. But usually the processors have word length as 2 bytes (very old systems),4 bytes, 8 bytes... Which means in one memory cycle, the CPU can take up word length number of bytes from the main memory and put it inside its registers. Now how these bytes are placed in registers depends on the endian-ness of the system. – Abhishek Ghosh Jul 26 '21 at 20:27
  • @AbhishekGhosh Yes, very good. But you omitted an important piece of information. A byte has a most significant bit, and a least significant bit, and additional bits arranged from most significant to least significant. In English-speaking countries, it's typical (when writing a byte as a list of bits) to put the most significant bit on the left, and the least significant bit on the right. – user3386109 Jul 26 '21 at 20:34
  • Which means that if bit-fields are assigned "left to right", then the first bit field occupies the most significant bits of the underlying storage unit. Whereas, if bit-fields are assigned "right to left", then the first bit field occupies the least significant bits. – user3386109 Jul 26 '21 at 20:38
  • @user3386109 sorry... that I omitted that part... not that it was not known to me, But rather just forgot to include it as there were word limits on a single comment and I did not really understand, how much should I write to convey what I know.:( – Abhishek Ghosh Jul 26 '21 at 20:40
  • @AbhishekGhosh Hey try to put a bit more effort on your question or edit it, not trying to kick you, but it feels very sloppy. – Miguel M Jul 26 '21 at 23:33
  • @MiguelM please have a look. How does it sound now? – Abhishek Ghosh Jul 27 '21 at 07:54
  • @AbhishekGhosh Much better – Miguel M Jul 27 '21 at 12:14

3 Answers

3

When you write:

struct {
    unsigned int version: 4;
    unsigned int length: 4;
    unsigned char dcsn;

you end up with a big headache you weren't expecting because your code is non-portable.

When you set version to 4 and length to 5, some systems may set the first byte of the structure to 0x45 and other systems may set the first byte of the structure to 0x54.

When I went to college this thing was #ifdef'd as follows (incorrect):

struct {
#if BIG_ENDIAN
    unsigned int version: 4;
    unsigned int length: 4;
#else
    unsigned int length: 4;
    unsigned int version: 4;
#endif
    unsigned char dcsn;

but this is still rolling the dice, as there's no rule that the order of bit-fields within a byte corresponds to the order of bytes within a machine word. I would not be surprised if, when you cross-compile, the bit order in the struct came from the host machine's rules while the bit order of integers came from the target machine's rules (as it must). In theory the code could be corrected by having a separate #ifdef for BIG_ENDIAN_BITFIELD, but I've never seen it done.

Joshua
  • You get a smaller big headache if you omit the terminating `};` – wildplasser Jul 26 '21 at 20:22
  • @wildplasser: You really want to see the rest of the structure? – Joshua Jul 26 '21 at 20:23
  • @TomKarzes: In general you are correct but I have happened to have picked an example where I know it won't do that because the next element isn't a bitfield. Edited it in for clarity. – Joshua Jul 26 '21 at 20:27
  • @TomKarzes: The order of bytes within a multibyte object and the order of bit-fields within a storage unit are different things. The fact that a processor may store a `uint32_t` from a register to memory with its high-value byte in a low address does not necessarily have any relationship with how the compiler decides to order bit-fields within a storage unit, as it will typically be using shifts and other bit operations to access those fields. (And this answer presents a poor example by conflating them, even though it states there is no rule there is a correspondence.) – Eric Postpischil Jul 26 '21 at 20:28
  • @TomKarzes: I have. dcsn is always in the second byte, both in big and little endian processors. – Joshua Jul 26 '21 at 20:30
  • @Joshua Hm, it looks like you're right. I think this is different from the traditional K&R way bit fields were packed into a structure. I guess at some point they decided there was no reason not to pack things more tightly. My recollection is that they didn't used to mix bit-fields and non-bit-fields in the same integer type that was used to declare the bit fields. – Tom Karzes Jul 26 '21 at 20:34
  • @TomKarzes You're thinking of *another* way the layout of structures containing bitfields is implementation defined. The standard permits, but does not require, an implementation to pad the bitfields in this structure to the full width of `unsigned int`. And that could mean padding in between the bitfields and `dcsn`, or *before* the bitfields, or even *between the two bitfields*. There's no rules. (The psABI, however, probably does give some rules.) – zwol Jul 26 '21 at 21:45
  • (Rant alert) The code is perfectly portable as long as you don't try to type-pun the bitfields. If you're using the bitfields just to pack several narrow data, there's no problem and imho the code is much more readable than shift-and-mask macros. The binary data don't interoperate, so you can't move a packed struct to another machine, but neither do floating-point types. It's not going to work as a serialisation technique, which requires interop. But for many applications, it's perfect and the constant denigration is unjustified. – rici Jul 27 '21 at 03:26
  • @rici: It's the start of the IP header struct circa 1999. It didn't always work. – Joshua Jul 27 '21 at 03:35
  • @joshua: describing communications protocols is not a use case for bitfields. But that doesn't mean they don't have their use. There's a difference between portability and interoperability, and bitfields are certainly portable. Code written with a bitfield can be compiled and run on a different architecture. – rici Jul 27 '21 at 03:57
3

When a structure contains bit-fields, the C implementation uses some storage unit to hold them (or multiple storage units if needed). The storage unit might be one eight-bit byte or it might be four bytes, or it might be other sizes—this is a determination made by each C implementation. The C standard only requires that it be addressable, which effectively means it has to be a whole number of bytes.

Once we have a storage unit, it is some number of bits. Say it is 32 bits, and number the bits from 31 to 0, where, if we consider the bits to represent a binary numeral, bit 0 represents 2^0, and bit 31 represents 2^31. Note that Kernighan and Ritchie are imprecise to use “left” and “right” here. There is no inherent left or right. We usually write numerals with the most significant digits on the left, so we might consider bit 31 to be the leftmost and bit 0 to be the rightmost.

Now we have a storage unit with some number of bits and some labeling for those bits (31 to 0 or left to right). Say you want to put two bit-fields in them, say fields of width 7 and 5.

Which 7 of the bits from bit 31 to bit 0 are used for the first field? Which 5 of the bits are used for the second field?

We could use bits 31-25 for the first field and bits 24-20 for the second field. Or we could use bits 6-0 for the first field and bits 11-7 for the second field.

In theory, we could also use bits 27-21 for the first field and bits 15-11 for the second field. However, the C standard does say that “If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit” (C 2018 6.7.2.1 11). “Adjacent” is not formally defined, but we can assume it means consecutively numbered bits. So, if the C implementation puts the first field in bits 31-25, it is required to put the second field in bits 24-20. Conversely, if it puts the first field in bits 6-0, it must put the second field in bits 11-7.

Thus, the C standard requires an implementation to arrange successive bit-fields in a storage unit from left-to-right or from right-to-left, but it does not say which.

(I do not see anything in the standard that says the first field must start at one end of the storage unit or the other, rather than somewhere in the middle. That would lead to wasting some bits.)

Eric Postpischil
2

Here is some demonstration code. The only goal is to demonstrate what you are asking about. Clean coding etc. is neglected.

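#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Overlay a 32-bit integer on a struct of bit-fields so the raw bits can be read back. */
union
{
    uint32_t Everything;
    struct
    {
        uint32_t FirstMentionedBit : 1;
        uint32_t FewOtherBits      :30;
        uint32_t LastMentionedBit  : 1;
    } bitfield;
} Demonstration;

int main(void)
{
    /* Set only the last declared bit and print the whole word. */
    Demonstration.Everything               =0;
    Demonstration.bitfield.LastMentionedBit=1;

    printf("%" PRIx32 "\n", Demonstration.Everything);

    /* Set only the first declared bit and print the whole word. */
    Demonstration.Everything                =0;
    Demonstration.bitfield.FirstMentionedBit=1;

    printf("%" PRIx32 "\n", Demonstration.Everything);

    return 0;
}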

If you compile and run this at https://www.tutorialspoint.com/compile_c_online.php, the output is

80000000
1

But in other environments it might easily be

1
80000000

This is because compilers are free to consider the first mentioned bit the MSB or the LSB and correspondingly the last mentioned bit to be the LSB or MSB.
And that is what your quote describes.

Yunnosch