
I was running a code quality check on my C project, which involves structures with bit fields. I came across a situation which, per the MISRA C 2004 standard, violates rule 6.4, which reads as follows: "6.4 Bit fields shall only be defined to be of type unsigned int or signed int." Literature available on the Microsoft Developer Network here asserts this. Can anyone explain why the data type of a bit-field member needs to be signed or unsigned int? Why am I not allowed to do the following, even though it compiles with no warnings:

typedef struct
{
    char a: 4;
    char b: 4;
    char flag: 1;
} MY_STRUCT;
sdevikar
  • Sounds like a readability issue. Most people do not expect a bitfield to have type `char`. – merlin2011 Jun 02 '14 at 04:35
  • 2
    You are a member of the standards committee. Explain in full-blown standardese what should happen when someone writes `char x: 9`, and why, in an implementation where CHAR_BIT==8. Convince other committe members that your rule is substantially better than theirs. Dunno about you, but to me it sounds like a wasted weekend. – n. m. could be an AI Jun 02 '14 at 04:47
  • 1
    maybe because `char` can be signed or unsigned depending on system – phuclv Jun 02 '14 at 04:48
  • 1
    http://stackoverflow.com/questions/2280492/bit-fields-of-type-other-than-int?rq=1 – phuclv Jun 02 '14 at 04:48
  • 2
    @n.m. The relevant section of the standard is §6.7.2.1 ¶4: _The expression that specifies the width of a bit-field shall be an integer constant expression with a nonnegative value that does not exceed the width of an object of the type that would be specified were the colon and expression omitted._ Unless `CHAR_BIT` is at least 9, `char x: 9` violates this requirement. – Jonathan Leffler Jun 02 '14 at 15:32
  • @n.m.: The helpful way for the Standard to allow bitfields to be defined would be to specify a field of a "containing" type, and then specify the bitfields as specified portions of that, something like `struct foo { uint8_t flags; wowzo=flags.0:1; bonzo=flags.1:1; scale=flags.4:4;};` (assuming a syntax of name=container.offset:size). That could be 100% unambiguous and portable independent of integer sizes (attempting to use bit fields in excess of the container size would be a constraint violation). – supercat Jun 24 '16 at 21:13

2 Answers


The primary reason is that if you don't explicitly state signed or unsigned, you don't know whether the type will be treated as signed or unsigned except by reading the implementation's definition of what it does. That means that portable code cannot be written without using the types with an explicit signedness keyword. (Note, too, that using char for a bit-field type designator is using an implementation-defined type.)
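For instance, here is a minimal sketch of the trap (the struct and field names are invented for illustration; what it prints depends on the implementation's choice for plain int bit-fields):

#include <stdio.h>

struct example
{
    int f : 3;    /* plain int: signed or unsigned is implementation-defined */
};

int main(void)
{
    struct example s;
    s.f = -1;     /* a signed 3-bit field holds -1; an unsigned one wraps to 7 */
    printf("%d\n", (int)s.f);   /* prints -1 or 7, depending on the compiler */
    return 0;
}

Declaring the field as `signed int f : 3` or `unsigned int f : 3` removes the ambiguity, which is exactly what the MISRA rule enforces.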

ISO/IEC 9899:2011 — §6.7.2.1 Structure and union specifiers

¶4 The expression that specifies the width of a bit-field shall be an integer constant expression with a nonnegative value that does not exceed the width of an object of the type that would be specified were the colon and expression omitted.

¶5 A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.

¶10 A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits.¹²⁵

¹²⁵ As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned.

That refers to:

§6.7.2 Type specifiers

¶5 … except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
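To make ¶4 concrete, here is a sketch (the struct and member names are invented; the commented-out member is the constraint violation the comments above discuss):

struct widths
{
    unsigned int ok : 9;    /* fine: 9 does not exceed the width of unsigned int */
    signed int   s  : 4;    /* portable: the signedness is stated explicitly */
    /* char bad : 9; */     /* constraint violation where CHAR_BIT == 8: 9 > width of char */
};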

Jonathan Leffler
  • Thanks for the elaborate answer. However, I was confused more about the selection of data type (int as against char) than its signedness or unsignedness. I mean, doesn't it make more sense to use a char if I need only 4 bits? Besides, if I am to use library functions like memcpy, use of chars is much safer than ints (to avoid implicit padding by the compiler). Also, to me, using int makes the code less portable, if anything. Correct me if I am wrong; maybe I am missing something or misunderstanding something. – sdevikar Jun 03 '14 at 03:54
  • The standard only specifies that `_Bool`, `int`, `unsigned int` and `signed int` must be supported as the types for the bit-field; anything else (`char`, `long`, etc) is supported at the whim of the implementation. Using `char` is less reliable than `int`, but (for bit-fields only), plain `int` can mean `unsigned int` or `signed int`, again at the whim of the implementation. The behaviour must be documented, but the implementation can choose either. So, using plain `int` is less portable than either `signed int` or `unsigned int`; using `char` instead of `int` is less portable too. – Jonathan Leffler Jun 03 '14 at 03:59
  • Note that `char` is a different type from either `signed char` or `unsigned char`, but it has the same range of values as one of those two types. So plain `char` can also be signed or unsigned as the implementation chooses (but the choice in the context of a bit-field would be the same as everywhere else in the implementation). – Jonathan Leffler Jun 03 '14 at 04:04
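The point about plain `char` is easy to check (a sketch; the printed value depends on whether the implementation makes plain char signed, and the conversion itself is implementation-defined):

#include <stdio.h>

int main(void)
{
    char c = (char)0xFF;      /* plain char: signed or unsigned, the implementation chooses */
    printf("%d\n", (int)c);   /* typically -1 where char is signed, 255 where it is unsigned */
    return 0;
}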

I didn't write the standard, so I can only speculate.

That said, my speculation would be that it is confusing to use integer types of different widths when the width actually doesn't make a difference (since you specify the number of bits anyway). For example:

char a : 4;
short a : 4;
int a : 4;

all declare the same thing, so there is no reason to allow for confusion by having different ways of writing it.

Strigoides
  • Those do not all declare the same thing. If char/short/int/long long are 8/16/32/64 bits, then a type with 21 fields of `char a:3` will pack two into each of ten bytes, and store the last in a byte by itself (eleven bytes total); one with 21 fields of type `short a:3` will pack five into each of four shorts, and put the last in a short by itself (ten bytes total); one with 21 fields of type `int a:3` would pack ten each into two ints, and one into a third int (twelve bytes total); twenty-one fields of `long long a:3` would be placed into a single 64-bit value (eight bytes total). – supercat Nov 02 '14 at 21:36
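A quick way to check supercat's point on a given implementation (a sketch: the struct tags are invented, non-int bit-field types and their layout are implementation-defined, and the sizes in the comments are merely typical of common ABIs rather than guaranteed):

#include <stdio.h>

/* The same 4-bit fields declared with different base types; on many ABIs
   the declared type sets the allocation unit, so the struct sizes differ. */
struct pack_char  { char  a : 4; char  b : 4; char  c : 4; };
struct pack_short { short a : 4; short b : 4; short c : 4; };
struct pack_int   { int   a : 4; int   b : 4; int   c : 4; };

int main(void)
{
    printf("char-based:  %zu\n", sizeof(struct pack_char));  /* often 2 */
    printf("short-based: %zu\n", sizeof(struct pack_short)); /* often 2 */
    printf("int-based:   %zu\n", sizeof(struct pack_int));   /* often 4 */
    return 0;
}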