15

I have a struct of the following type

typedef struct
{
unsigned int a : 8;
unsigned int b : 6;
unsigned int c : 2;
}x, *ptr;

What I would like to do is change the value of field c.

I do something like the following

x structure = { 0 };
structure.c = 1;

When I look at the memory map, I expect to find 00 01, but instead I find 00 40. It looks like when arranging the second byte, it puts c field in the lowest bits and b field in the highest bits. I've seen this on both GCC and Windows compilers.

For now, what I do is the following, which is working OK.

unsigned char *ptr2 = (unsigned char *)ptr;
*(ptr2 + 1) &= 0xFC;
*(ptr2 + 1) |= 0x01;

Am I looking at the memory map wrong? Thank you for your help.

fashasha
  • 1
    How do you display the *00 40* value? – ouah Oct 15 '13 at 08:34
  • 1
    Assuming `ptr` holds `&structure` (which is not clear in your question) `*(ptr+1)` is a quick walk into **undefined behavior**. – WhozCraig Oct 15 '13 at 08:40
  • @WhozCraig - Sorry, the code is not complete here, I've performed the casting to unsigned short in order to move 1 by every time. – fashasha Oct 15 '13 at 08:45
  • @ouah - I've checked the memory map in Visual Studio, and printed the values of variable / fields later on. – fashasha Oct 15 '13 at 08:46
  • @fashasha ok. The `*ptr` as a type and `ptr` as a variable tossed my head. I think i see what you're doing. `ptr` is (now *was*) an `unsigned char*` in the dereference code. – WhozCraig Oct 15 '13 at 08:50
  • Does this answer your question? [\_\_LITTLE\_ENDIAN\_BITFIELD and \_\_BIG\_ENDIAN\_BITFIELD?](https://stackoverflow.com/questions/18070977/little-endian-bitfield-and-big-endian-bitfield) – Sam Protsenko Aug 17 '23 at 20:07

6 Answers

20

The C standard allows the compiler to put bit-fields in any order. There is no reliable and portable way to determine the order.

If you need to know the exact bit positions, it is better to use a plain unsigned variable and bit masking.

Here's one possible alternative to using bit-fields:

#include <stdio.h>

#define MASK_A    0x00FF
#define MASK_B    0x3F00
#define MASK_C    0xC000
#define SHIFT_A   0
#define SHIFT_B   8
#define SHIFT_C   14

unsigned GetField(unsigned all, unsigned mask, unsigned shift)
{
    return (all & mask) >> shift;
}

unsigned SetField(unsigned all, unsigned mask, unsigned shift, unsigned value)
{
    return (all & ~mask) | ((value << shift) & mask);
}

unsigned GetA(unsigned all)
{
    return GetField(all, MASK_A, SHIFT_A);
}

unsigned SetA(unsigned all, unsigned value)
{
    return SetField(all, MASK_A, SHIFT_A, value);
}

/* Similar functions for B and C here */

int main(void)
{
    unsigned myABC = 0;
    myABC = SetA(myABC, 3);
    printf("%u", GetA(myABC)); // Prints 3
}
user694733
  • 13
    C99 6.7.2.1-11: An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified. – WhozCraig Oct 15 '13 at 08:47
  • @WhozCraig - Yeah, sorry, as I mentioned in the first reply to you, I forgot to add this to the post. In any case, thank you for pointing the line from the standard. I guess using bit masking will be sufficient way for **setting** fields. I guess I can still use the structure for getting the values of the fields. – fashasha Oct 15 '13 at 08:51
  • Will the order of bits have any relation with endianess? – Ginu Jacob Nov 20 '14 at 09:56
  • 1
    @GinuJacob Some relation. If you do `(unsigned)number & 1`, it will always give you the LSB, and endianness is irrelevant. **Byte-endianness:** If you inspect the bytes of `number`, then which byte holds the LSB is implementation-defined. On a little-endian system, the LSB can be found in the first byte. **Bit-endianness:** It's impossible to know the order of bits in a single byte in C. **In short:** You need to care about byte-endianness if you write code which converts data to bytes or back. You don't really need to care about bit-endianness, since C has no ability to take the address of a single bit. – user694733 Nov 20 '14 at 10:26
  • I realize this is ugly, but is there a way to define the order behaviour using GCC-specific macros? Do compilers provide this facility? – 9a3eedi May 16 '17 at 09:22
  • @9a3eedi Apparently the order is [determined by the ABI](https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html) and changing that [can be problematic](http://stackoverflow.com/a/6728289/694733). – user694733 May 16 '17 at 09:30
  • This method is the most reliable if using across different hardware or different compilers. It is however inefficient in both code size and speed, so use this where portability and compatibility with other devices is important (e.g. sending this data over a communication link to another device and interpreting it with similar code at the remote end) but if the environment (processor and compiler) is consistent and performance is your priority then using the bitfield is more efficient. – user1582568 Feb 18 '21 at 06:30
  • 1
    @user1582568 I left the example fairly simple to make it easier to understand. After function inlining and optimizations are applied with good compiler, I expect performance to be either very close or exactly same. After all, bitfields need to do the same shifting and masking operations under the hood. – user694733 Feb 19 '21 at 13:50
  • @user694733 I agree that in theory a good compiler should be able to optimize your functions well, but having had considerable experience with this issue, in practice I have found that bitfields often produce a better result. This is for 2 reasons. 1 - Compilers don't reliably inline the functions. 2 - The compiler can make better optimizations with the bitfield; for example, for a 1-bit field it will use bit manipulation instructions, whereas with masking functions it will not always "notice" that this can be done. Microcontroller manufacturers tend to use the bitfield method. – user1582568 Mar 16 '21 at 07:56
12

I know this is an old one, but I would like to add my thoughts.

  1. Headers in C are meant to be usable across objects, which means the compiler has to be somewhat consistent.

  2. In my experience I have always seen bitfields in LSB order. This will put the bits in MSB->LSB order of c:b:a

The 16 bits, read in byte order, are "00 40". Interpreted as little-endian, this is the 16-bit value 0x4000.

This is:

c == [15:14] == 0b01
b == [13:8] == 0
a == [7:0] == 0
msc
jrelles
  • 1
    Unfortunately our personal experience doesn't matter. The standard doesn't define how the compiler stores bitfields (C99 6.7.2.1-10 "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.") so it could change at any time and on any platform. Relying on this sort of implementation-defined behaviour leads to subtle bugs which are very difficult to track down. Don't do this. See also the comment by @WhozCraig above about bitfield packing. – suprjami Sep 10 '22 at 05:14
4

You can depend on the ordering of bit-fields being deterministic as long as your program is intentionally not cross-platform.

While the C specification does not dictate the ordering of bit-fields, the platform's ABI does. In the case of System V AMD64, the allocation order of bit-fields is defined as "right to left" (section 3.1.2). For ARM64, it depends on the endianness of the data type (section 8.1.8.2). Because your C compiler adheres to the ABI of the target architecture you're compiling for, you can depend on a fixed allocation order of bit-fields within a given platform.

1

> When I look at the memory map, I expect to find 00 01, but instead I find 00 40. It looks like when arranging the second byte, it puts c field in the lowest bits and b field in the highest bits. I've seen this on both GCC and Windows compilers.

Ignoring endianness for a second, if you just arrange all fields in a purely sequential stream of increasing bits, you get the allocation shown in the "Bitstream order" column of the table, and you can see that the last field in your struct, c, is the last field in the stream. Now, chunking things up into bytes, you get four permutations, including little endian order (which is actually identical to pure bitstream order) and two big endian orders.

Compilers targeting Windows only build for little endian machines (in the past, the kernel had to deal with some big endian ones, but all current compatible architectures are little endian - x86/x64/arm64), and writing 1 to field c (bit c0 below) would yield 1<<6 or 0b01000000 (0x40) in byte 1.

On big endian machines, there is more than one possible ordering:

  1. allocate the bitfields in decreasing order from the most significant bit in the word to the least significant, and swap every byte within that word (with high bytes at low addresses and low bytes at high addresses). You can find this ordering in gcc on Linux with big endian machines.
  2. allocate the bitfields in increasing order from least significant bit to most significant, like LE, and swap whole bytes within the word. You can find this in the C51 8051 microcontroller compiler.
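
| Absolute Bit | Bitstream order | Byte | Bit in byte | LE order | BE order#1 | BE order#2 |
|---|---|---|---|---|---|---|
| 0 | a0 | 0 | 0 | a0 | a0 | b0 |
| 1 | a1 | 0 | 1 | a1 | a1 | b1 |
| 2 | a2 | 0 | 2 | a2 | a2 | b2 |
| 3 | a3 | 0 | 3 | a3 | a3 | b3 |
| 4 | a4 | 0 | 4 | a4 | a4 | b4 |
| 5 | a5 | 0 | 5 | a5 | a5 | b5 |
| 6 | a6 | 0 | 6 | a6 | a6 | c0 |
| 7 | a7 | 0 | 7 | a7 | a7 | c1 |
| 8 | b0 | 1 | 0 | b0 | c0 | a0 |
| 9 | b1 | 1 | 1 | b1 | c1 | a1 |
| 10 | b2 | 1 | 2 | b2 | b0 | a2 |
| 11 | b3 | 1 | 3 | b3 | b1 | a3 |
| 12 | b4 | 1 | 4 | b4 | b2 | a4 |
| 13 | b5 | 1 | 5 | b5 | b3 | a5 |
| 14 | c0 | 1 | 6 | c0 | b4 | a6 |
| 15 | c1 | 1 | 7 | c1 | b5 | a7 |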
Dwayne Robinson
0

The memory layout always depends on the underlying machine architecture (endianness) and on the strategy the compiler uses for packing and arranging the structure.

Peter Miehle
-1

You set a C structure to raw bits at your peril.

You know what the bits are and what they mean, so you can fill out the fields of the structure. Yes, it's more code than memcpy, but it won't break if someone adds a field, and it helps enforce bit-level specificity at the communications level.

Malcolm McLean