Bit manipulations good practices

Question

As a beginner C programmer, I am wondering, what would be the best easy-to-read and easy-to-understand solution for setting control bits in a device. Are there any standards? Any example code to mimic? Google didn't give any reliable answer.

For example, I have a control block map:

The first way I see would be to simply set the needed bits. It requires a bunch of explanations in comments and seems to be not all that professional.

DMA_base_ptr[DMA_CONTROL_OFFS] = 0b10001100;

The second way I see is to create a bit field. I'm not sure if this is the one should I stick to, since I never encountered it being used in such way (unlike the first option I mentioned).

struct DMA_control_block_struct
{ 
    unsigned int BYTE:1; 
    unsigned int HW:1; 
    // etc
} DMA_control_block_struct;

Is one of the options better than the other one? Are there any options I just don't see?

Any advice would be highly appreciated

By the way, using `0b` for base-two constants is nonstandard. — Steve Summit, Nov 02 '18 at 14:04
As far as Standard C is concerned, you've got leading `0x` for hexadecimal, or leading `0` for octal, else decimal. It's a pretty frequent wish that there be a way to enter base-two constants, and leading `0b` is the obvious moniker (which is evidently implemented by some compilers), but as I said, it's not Standard. — Steve Summit, Nov 03 '18 at 02:41
Also, there are nine bits shown, so this register must be something larger than an ordinary byte. It might be good to indicate the length of the register (or whatever it is). You can indicate bits by their usual hex mask values (0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, etc). Maybe include the full length, as 0x0001, 0x0002, etc.? — MPW, Nov 03 '18 at 12:06
You didn't mention whether the code you're writing is targeted at a Regular Computer (e.g., as a device driver) or an embedded system. Conventions differ substantially, and between platforms (Linux driver standards are not quite the same as Windows, though they're more similar than embedded AVR). — chrylis -cautiouslyoptimistic-, Nov 03 '18 at 17:01

dbush · Accepted Answer · 2018-11-02T14:40:39.390

The problem with bit fields is that the C standard does not dictate that the order in which they are defined is the same as the order that they are implemented. So you may not be setting the bits you think you are.

Section 6.7.2.1p11 of the C standard states:

An implementation may allocate any addressable storage unit large enough to hold a bit- field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

As an example, look at the definition of struct iphdr, which represents an IP header, from the /usr/include/netinet/ip.h file file on Linux:

struct iphdr
  {
#if __BYTE_ORDER == __LITTLE_ENDIAN
    unsigned int ihl:4;
    unsigned int version:4;
#elif __BYTE_ORDER == __BIG_ENDIAN
    unsigned int version:4;
    unsigned int ihl:4;
#else
# error "Please fix <bits/endian.h>"
#endif
    u_int8_t tos;
    ...

You can see here that the bitfields are placed in a different order depending on the implementation. You also shouldn't use this specific check because this behavior is system dependent. It is acceptable for this file because it is part of the system. Other systems may implement this in different ways.

So don't use a bitfield.

The best way to do this is to set the required bits. However, it would make sense to define named constants for each bit and to perform a bitwise OR of the constants you want to set. For example:

const uint8_t BIT_BYTE =     0x1;
const uint8_t BIT_HW   =     0x2;
const uint8_t BIT_WORD =     0x4;
const uint8_t BIT_GO   =     0x8;
const uint8_t BIT_I_EN =     0x10;
const uint8_t BIT_REEN =     0x20;
const uint8_t BIT_WEEN =     0x40;
const uint8_t BIT_LEEN =     0x80;

DMA_base_ptr[DMA_CONTROL_OFFS] = BIT_LEEN | BIT_GO | BIT_WORD;

*It is acceptable for this file because it is part of the system.* It's also "acceptable" because Linux pretty much *de facto* requires GCC to compile. A different *compiler* is free to change how the bit fields are assigned even if the endianness remains the same. — Andrew Henle, Nov 02 '18 at 13:45
C compilers on Unix-like systems are expect to conform not just to the C standard but also to the platform's ABI so they can interoperate with the platform's libraries. — plugwash, Nov 02 '18 at 20:10
Why not use `enum` instead of defining potentially ODR-problematic constant variables? — Ruslan, Nov 02 '18 at 20:58
@Ruslan Presumably because with enumerations you have no control over what integer type they are implemented as. — Will, Nov 03 '18 at 01:30
You can write various tests for your bitfields and structs, etc. Either as normal runtime tests or as static_assert macros. Then if the bits are not where expected, report the error and stop. — Zan Lynx, Nov 03 '18 at 15:47

score 19 · Answer 2 · answered Nov 02 '18 at 13:31

19

Other answers have already covered most of the stuff, but it might be worthwhile to mention that even if you can't use the non-standard 0b syntax, you can use shifts to move the 1 bit into position by bit number, i.e.:

#define DMA_BYTE  (1U << 0)
#define DMA_HW    (1U << 1)
#define DMA_WORD  (1U << 2)
#define DMA_GO    (1U << 3)
// …

Note how the last number matches the "bit number" column in the documentation.

The usage for setting and clearing bits doesn't change:

#define DMA_CONTROL_REG DMA_base_ptr[DMA_CONTROL_OFFS]

DMA_CONTROL_REG |= DMA_HW | DMA_WORD;    // set HW and WORD
DMA_CONTROL_REG &= ~(DMA_BYTE | DMA_GO); // clear BYTE and GO

answered Nov 02 '18 at 13:31

Arkku

41,011
10
62
84

3

For beginners: the parentheses in the macros such as ```#define DMA_BYTE (1U << 0)``` are extremely important - see [this question](https://stackoverflow.com/questions/7186504/c-macros-and-use-of-arguments-in-parentheses). – mgarey Nov 02 '18 at 17:43
@mgarey I'd say they are important for all C developers not just beginners. Insufficient use of parentheses in a macro I'd say is a bug in the macro regardless of who you intend to be using the macro. – kasperd Nov 03 '18 at 11:36
2

@kasperd I think the point was that non-beginners have already been bitten by this and thus learned to put parentheses in their macros. =) – Arkku Nov 03 '18 at 11:38

Steve Summit · Answer 3 · 2018-11-02T13:54:37.030

18

The old-school C way is to define a bunch of bits:

#define WORD  0x04
#define GO    0x08
#define I_EN  0x10
#define LEEN  0x80

Then your initialization becomes

DMA_base_ptr[DMA_CONTROL_OFFS] = WORD | GO | LEEN;

You can set individual bits using |:

DMA_base_ptr[DMA_CONTROL_OFFS] |= I_EN;

You can clear individual bits using & and ~:

DMA_base_ptr[DMA_CONTROL_OFFS] &= ~GO;

You can test individual bits using &:

if(DMA_base_ptr[DMA_CONTROL_OFFS] & WORD) ...

Definitely don't use bitfields, though. They have their uses, but not when an external specification defines that the bits are in certain places, as I assume is the case here.

See also questions 20.7 and 2.26 in the C FAQ list.

edited Nov 02 '18 at 13:54

answered Nov 02 '18 at 12:54

Steve Summit

45,437
7
70
103

Can't you just determine the bit order by referencing the `stdlib.h` and `stddef.h` header files for a particular system? – Clay Raynor Nov 02 '18 at 13:03
4

I see no fundamental problem with using bit fields to match hardware registers on a particular embedded platform, for example, since the code tends to be inherently non-portable in any case (tied to that specific device and often either a single compiler). The gains in readability and convenience for multi-bit fields in particular can be worth it. (Of course there may be other problems, such as code size or performance, that need to be checked, but my point is that I wouldn't disregard bit fields for this use automatically.) – Arkku Nov 02 '18 at 13:13
Just be careful when using `~` because it contains an implicit integer promotion. Good practice is to cast back to the intended type: `DMA_base_ptr[DMA_CONTROL_OFFS] &= (uint8_t)~WRITEACCESS;` – Lundin Nov 02 '18 at 13:37
1

Thank you for the answer, I find the details on usage extremely helpful and will definitely use a thing or two – KateOleneva Nov 02 '18 at 13:39
1

@Arkku, ClayRaynor: In the end it's something of a matter of style. In my opinion, trying to get an in-memory data structure to conform to an externally-imposed storage layout is simply more trouble than it's worth. It may be a minority opinion, though, because certainly, vast numbers of C programmers spend vast amounts of time trying to arrange such conformances. (And sometimes, of course, they succeed, including when they're matching individual bits with bitfields.) – Steve Summit Nov 02 '18 at 13:49
1

Well, it is true that using bit fields to match hardware makes the code non-portable (in theory perhaps even to different compilers for the same hardware), so I would agree that the default should be to not use them for this. But at the same time I think the premise of matching bits in a hardware register is a sign that this code might be so non-portable anyways that the addition of bit fields to the mix would not be so serious. For 1-bit fields I wouldn't personally do it anyway, but for 2+-bit fields in a one-off non-portable project, I might at least consider it just for the nice syntax. =) – Arkku Nov 02 '18 at 13:56
1

@Arkku, Steve Summuit I would have to agree with both of your sentiments. I am all for trying to maximize portability. But I don't think portability should be the primary concern since you are working with code that is hardware dependent. I also understand and agree with the troubles of matching external storage constrains. – Clay Raynor Nov 02 '18 at 17:29
1

@Arkku: A problem with bitfields is that hardware may have restrictions on how read-modify-write operations are performed, but a compiler will generally have no way of knowing about or honoring them. For example, if a bit field occupies bits 8-15 of a word, a compiler might try to replace a store with a simple byte, even when accessing a hardware register which doesn't support 8-bit accesses (and which would thus require a 16-bit read-modify-write sequence). On the flip side, if a 16-bit I/O register supports byte writes, and if main-line and interrupt code need to write opposite halves... – supercat Nov 02 '18 at 21:37
...using byte accesses may avoid the need to have the mainline disable interrupts while writing its half of the register. Given a sequence like `FOO->CR1 = (FOO->CR1 & ~0xFF) | someValue;` the need to watch out for simultaneous actions in an ISR may be somewhat self-evident, but if it were written as `FOO->someMember = 123;` the fact that such an operation could conflict with a write to `FOO->someOtherMember` would hardly be obvious. – supercat Nov 02 '18 at 21:40
@supercat Valid points, certainly, especially the one about leaky abstraction obfuscating something relevant. Anyway, I'm not advocating using bitfields by default (indeed I advise above that the default should be to _not_ use them), I'm just mainly arguing against the _categorical_ "definitely don't use bitfields" in this answer. – Arkku Nov 02 '18 at 21:52

score 8 · Answer 4 · answered Nov 02 '18 at 13:11

There is no standard for bit fields. Mapping and bit operation are dependent on compiler in this case. Binary values such as 0b0000 are not standardized also. Usual way to do is defining hexadecimal values for each bit. For example:

#define BYTE (0x01)
#define HW   (0x02)
/*etc*/

When you want to set bits, you can use:

DMA_base_ptr[DMA_CONTROL_OFFS] |= HW;

Or you can clear bits with:

DMA_base_ptr[DMA_CONTROL_OFFS] &= ~HW;

Kuba hasn't forgotten Monica · Answer 5 · 2018-11-04T17:23:02.573

Modern C compilers handle trivial inline functions just fine – without overhead. I’d make all of the abstractions functions, so that the user doesn’t need to manipulate any bits or integers, and is unlikely to abuse the implementation details.

You can of course use constants and not functions for implementation details, but the API should be functions. This also allows using macros instead of functions if you’re using an ancient compiler.

For example:

#include <stdbool.h>
#include <stdint.h>

typedef union DmaBase {
  volatile uint8_t u8[32];
} DmaBase;
static inline DmaBase *const dma1__base(void) { return (void*)0x12340000; }

// instead of DMA_CONTROL_OFFS
static inline volatile uint8_t *dma_CONTROL(DmaBase *base) { return &(base->u8[12]); }
// instead of constants etc
static inline uint8_t dma__BYTE(void) { return 0x01; }

inline bool dma_BYTE(DmaBase *base) { return *dma_CONTROL(base) & dma__BYTE(); }
inline void dma_set_BYTE(DmaBase *base, bool val) {
  if (val) *dma_CONTROL(base) |= dma__BYTE();
  else *dma_CONTROL(base) &= ~dma__BYTE();
}
inline bool dma1_BYTE(void) { return dma_BYTE(dma1__base()); }
inline void dma1_set_BYTE(bool val) { dma_set_BYTE(dma1__base(), val); }

Such code should be machine generated: I use gsl (of 0mq fame) to generate those based on a template and some XML input listing the details of the registers.

Maybe I'm weird, but if I'm dealing with low-level stuff like DMA control, I much prefer to see the bits myself than have them wrapped into `bool`s, and indeed ones that I cannot read or set more than one at a time. (And if the idea is to offer a true higher level API, then the (exported) functions should also be higher level than `set_BYTE`. At least in name.) — Arkku, Nov 02 '18 at 14:52
@Arkku Of course there can be a higher level API, and setting multiple bits at a time would be dealt with there. Presumably only some combinations of bits are useful, although that of course varies. Enforcing type safety, I.e. not using dma bit patterns on a uart, is a bit of a problem in C... — Kuba hasn't forgotten Monica, Nov 02 '18 at 20:07

score 1 · Answer 6 · answered Nov 02 '18 at 20:52

You could use bit-fields, despite what all the fear-mongers here have been saying. You would just need to know how the compiler(s) and system ABI(s) you intend your code to work with define the "implementation defined" aspects of bit-fields. Don't be scared off by pedants putting words like "implementation defined" in bold.

However what others so far seem to have missed out on are the various aspects of how memory-mapped hardware devices might behave that can be counter-intuitive when dealing with a higher-level language like C and the optimization features such languages offer. For example every read or write of a hardware register may have side-effects sometimes even if bits are not changed on the write. Meanwhile the optimizer may make it difficult to tell when the generated code is actually reading or writing to the address of the register, and even when the C object describing the register is carefully qualified as volatile, great care is required to control when I/O occurs.

Perhaps you will need to use some specific technique defined by your compiler and system in order to properly manipulate memory-mapped hardware devices. This is the case for many embedded systems. In some cases compiler and system vendors will indeed use bit-fields, just as Linux does in some cases. I would suggest reading your compiler manual first.

The bit description table you quote appears to be for the control register of the the Intel Avalon DMA controller core. The "read/write/clear" column gives a hint as to how a particular bit behaves when it is read or written. The status register for that device has an example of a bit where writing a zero will clear a bit value, but it may not read back the same value as was written -- i.e. writing the register may have a side-effect in the device, depending on the value of the DONE bit. Interestingly they document the SOFTWARERESET bit as "RW", but then describe the procedure as writing a 1 to it twice to trigger the reset, and then they also warn Executing a DMA software reset when a DMA transfer is active may result in permanent bus lockup (until the next system reset). The SOFTWARERESET bit should therefore not be written except as a last resort. Managing a reset in C would take some careful coding no matter how you describe the register.

As for standards, well ISO/IEC have produced a "technical report" known as "ISO/IEC TR 18037", with the subtitle "Extensions to support embedded processors". It discusses a number of the issues related to using C to manage hardware addressing and device I/O, and specifically for the kinds of bit-mapped registers you mention in your question it documents a number of macros and techniques available through an include file they call <iohw.h>. If your compiler provides such a header file, then you might be able to use these macros.

There are draft copies of TR 18037 available, the latest being TR 18037(2007), though it provides for rather dry reading. However it does contain an example implementation of <iohw.h>.

Perhaps a good example of a real-world <iohw.h> implementation is in QNX. The QNX documentation offers a decent overview (and an example, though I would strongly suggest using enums for integer values, never macros): QNX <iohw.h>

re. using `enum` instead of macros, one benefit of macros is that they can include a cast to a specific type (such as matching the width of a hardware register), whereas the actual type of an `enum` is implementation defined. (And, yes, you can make the same argument here as with bit fields that it is not a problem if you know how the implementation is defined, and that is a valid argument. =) — Arkku, Nov 02 '18 at 22:04
Well, an `enum` value is always given as an `int` and the type it is represented as when used must be compatible with `int`, so for these purposes it really is still effectively just an `int`. Also I would argue strongly against including casts in macro definitions. You can add the cast at the time you use the enum, or at the time you use a constant (regardless of whether it is from a mcro or not), if it is necessary, though normally such casts are just superfluous noise for us humans to have to read through and figure out if they actually so anything different than if they were not there. — Greg A. Woods, Nov 02 '18 at 22:27

score -2 · Answer 7 · answered Nov 02 '18 at 12:58

You should make sure to initialize the bits to a known default value when you declare the variable to store their values. In C, when you declare a variable you are just reserving a block of memory at a address and the size of the block is based its type. If you don't initialize the variable you can encounter undefined / unexpected behavior since the value of the variable will be effected by whatever the value / state of the memory in that block was before you declared it. By initializing the variable to a default value, you are clearing this block of memory of its existing state and putting it in a know state.

As far as readability, you should use a bit field to store the values of the bit. A bit field enables you to store the values of the bits in a struct. This makes it easier to organize since you can use dot notation. Also, you should make sure to comment the declaration of the bit field to explain what the different fields are used for as a best practice. I hope this answers your question. Good luck with you C programming!

Bit fields are *extremely* non-portable. Any compiler can do what it wants. Per [**6.7.2.1 Structure and union specifiers**, paragraph 11 of the C standard](https://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p11): "... whether a bit-field that does not fit is put into the next unit or overlaps adjacent units **is implementation-defined**. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) **is implementation-defined**. The alignment of the addressable storage unit **is unspecified**." — Andrew Henle, Nov 02 '18 at 13:51
You should be checking the definitions in your `stddef.h` and `limits.h` header files anyway since the size of your integer primitives is platform specific and your bit shift operations can be effected by the system's Endianness. Plus a compiler's manual should specify the behavior of bit fields. Also, this is hardware specific so portability is already out the window. — Clay Raynor, Nov 02 '18 at 15:18

Bit manipulations good practices

7 Answers7

Linked