58

Is it worthwhile using C's bit-field implementation? If so, when is it ever used?

I was looking through some emulator code and it looks like the registers for the chips are not being implemented using bit fields.

Is this something that is avoided for performance reasons (or some other reason)?

Are there still times when bit-fields are used? (e.g. firmware to put on actual chips, etc.)

sashoalm
Russel
  • when implementing your own softfloat – Chen Li Feb 24 '19 at 08:39
    I found a modern use case when you have an unordered_map that contains a fixed max number of entries. Set a bit field that is the entire size of the number of elements that could appear in the map and then set the bit value to true when 1 or more elements are saved into a specific map entry. By checking the bit before hashing into the map, it is possible to significantly optimize access because the bit operation can be faster that the hash lookup. – MoDJ Jan 21 '20 at 08:23

11 Answers

34

Bit-fields are typically only used when there's a need to map structure fields to specific bit slices, where some hardware will be interpreting the raw bits. An example might be assembling an IP packet header. I can't see a compelling reason for an emulator to model a register using bit-fields, as it's never going to touch real hardware!

Whilst bit-fields can lead to neat syntax, they're pretty platform-dependent, and therefore non-portable. A more portable, but yet more verbose, approach is to use direct bitwise manipulation, using shifts and bit-masks.

If you use bit-fields for anything other than assembling (or disassembling) structures at some physical interface, performance may suffer. This is because every time you read or write from a bit-field, the compiler will have to generate code to do the masking and shifting, which will burn cycles.

Oliver Charlesworth
    regarding the "burning cycles" issue, I have found that it was indeed faster to use the smallest integer type possible rather than using bitfields. Except for boolean flags (for which masking is easy and no shifting is required), I therefore agree with you :) – Matthieu M. Nov 22 '10 at 07:59
    @Matthieu: I would imagine that in *most* circumstances, using `int` would be fastest, because it's the platform's native width. The exception to this would be if doing everything as `int` makes your data structures significantly bigger, causing cache misses, etc. – Oliver Charlesworth Nov 22 '10 at 10:13
@OliCharlesworth, network byte order (big-endian vs little-endian) will break any attempt to pass packet headers via bit-fields. The C++ standard also doesn't define how bit-fields are stored; it's implementation-specific. Given that their performance is poor as well, bit-fields are useless. – ZijingWu Sep 09 '13 at 10:32
    @ZijingWu, "Implementation-specific" (or "platform/compiler-dependent") does not make something useless. It just means that there are limited applications and you have to be careful. – pattivacek Nov 12 '13 at 16:58
@ZijingWu While I would not say "useless", bitfields are certainly of very limited use, given that - as you say - their implementation-defined packing makes them largely non-portable for representing specific bitwise formats. So in Oliver's example here, one would have to determine (A) how their implementation packed the IP header bits and (B) that this matched the target's way of doing things. It's often much more reliable to code your own bitwise handlers to do this in a totally predictable way - one that doesn't _at least_ demand comments saying 'only compile with `x`cc on `y`arch' – underscore_d Jan 05 '16 at 13:01
  • You're talking about "burning cycles" during compile-time here, NOT run-time, right? `This is because every time you read or write from a bit-field, the compiler will have to generate code to do the masking and shifting, which will burn cycles.` – Gabriel Staples Nov 09 '20 at 23:31
25

One use for bitfields which hasn't yet been mentioned is that unsigned bitfields provide arithmetic modulo a power-of-two "for free". For example, given:

struct { unsigned x:10; } foo;

arithmetic on foo.x will be performed modulo 2^10 = 1024.

(The same can be achieved directly by using bitwise & operations, of course - but sometimes it might lead to clearer code to have the compiler do it for you).

caf
13

FWIW, and looking only at the relative performance question - a bodgy benchmark:

#include <time.h>
#include <iostream>

struct A
{
    void a(unsigned n) { a_ = n; }
    void b(unsigned n) { b_ = n; }
    void c(unsigned n) { c_ = n; }
    void d(unsigned n) { d_ = n; }
    unsigned a() { return a_; }
    unsigned b() { return b_; }
    unsigned c() { return c_; }
    unsigned d() { return d_; }
    volatile unsigned a_:1,
                      b_:5,
                      c_:2,
                      d_:8;
};

struct B
{
    void a(unsigned n) { a_ = n; }
    void b(unsigned n) { b_ = n; }
    void c(unsigned n) { c_ = n; }
    void d(unsigned n) { d_ = n; }
    unsigned a() { return a_; }
    unsigned b() { return b_; }
    unsigned c() { return c_; }
    unsigned d() { return d_; }
    volatile unsigned a_, b_, c_, d_;
};

struct C
{
    void a(unsigned n) { x_ &= ~0x01; x_ |= n; }
    void b(unsigned n) { x_ &= ~0x3E; x_ |= n << 1; }
    void c(unsigned n) { x_ &= ~0xC0; x_ |= n << 6; }
    void d(unsigned n) { x_ &= ~0xFF00; x_ |= n << 8; }
    unsigned a() const { return x_ & 0x01; }
    unsigned b() const { return (x_ & 0x3E) >> 1; }
    unsigned c() const { return (x_ & 0xC0) >> 6; }
    unsigned d() const { return (x_ & 0xFF00) >> 8; }
    volatile unsigned x_;
};

struct Timer
{
    Timer() { get(&start_tp); }
    double elapsed() const {
        struct timespec end_tp;
        get(&end_tp);
        return (end_tp.tv_sec - start_tp.tv_sec) +
               (1E-9 * end_tp.tv_nsec - 1E-9 * start_tp.tv_nsec);
    }
  private:
    static void get(struct timespec* p_tp) {
        if (clock_gettime(CLOCK_REALTIME, p_tp) != 0)
        {
            std::cerr << "clock_gettime() error\n";
            exit(EXIT_FAILURE);
        }
    }
    struct timespec start_tp;
};

template <typename T>
unsigned f()
{
    int n = 0;
    Timer timer;
    T t;
    for (int i = 0; i < 10000000; ++i)
    {
        t.a(i & 0x01);
        t.b(i & 0x1F);
        t.c(i & 0x03);
        t.d(i & 0xFF);
        n += t.a() + t.b() + t.c() + t.d();
    }
    std::cout << timer.elapsed() << '\n';
    return n;
}

int main()
{
    std::cout << "bitfields: " << f<A>() << '\n';
    std::cout << "separate ints: " << f<B>() << '\n';
    std::cout << "explicit and/or/shift: " << f<C>() << '\n';
}

Output on my test machine (numbers vary by ~20% run to run):

bitfields: 0.140586
1449991808
separate ints: 0.039374
1449991808
explicit and/or/shift: 0.252723
1449991808

Suggests that with g++ -O3 on a pretty recent Athlon, bitfields are more than a few times slower than separate ints, and this particular and/or/bitshift implementation is at least twice as bad again. ("Worse" because other operations like memory reads/writes are emphasised by the volatile qualifiers above, and there's loop overhead etc., so the differences are understated in the results.)

If you're dealing in hundreds of megabytes of structs that can be mainly bitfields or mainly distinct ints, the caching issues may become dominant - so benchmark in your system.

update from 2021 with an AMD Ryzen 9 3900X and -O2 -march=native:

bitfields: 0.0224893
1449991808
separate ints: 0.0288447
1449991808
explicit and/or/shift: 0.0190325
1449991808

Here we see everything has changed massively; the main implication stands: benchmark on the systems you care about.


UPDATE: user2188211 attempted an edit which was rejected but usefully illustrated how bitfields become faster as the amount of data increases: "when iterating over a vector of a few million elements in [a modified version of] the above code, such that the variables do not reside in cache or registers, the bitfield code may be the fastest."

template <typename T>
unsigned f()
{
    int n = 0;
    Timer timer;
    std::vector<T> ts(1024 * 1024 * 16);
    for (size_t i = 0, idx = 0; i < 10000000; ++i)
    {
        T& t = ts[idx];
        t.a(i & 0x01);
        t.b(i & 0x1F);
        t.c(i & 0x03);
        t.d(i & 0xFF);
        n += t.a() + t.b() + t.c() + t.d();
        idx++;
        if (idx >= ts.size()) {
            idx = 0;
        }
    }
    std::cout << timer.elapsed() << '\n';
    return n;
}

Results from an example run (g++ -O3, Core2Duo):

 0.19016
 bitfields: 1449991808
 0.342756
 separate ints: 1449991808
 0.215243
 explicit and/or/shift: 1449991808

Of course, timing's all relative and which way you implement these fields may not matter at all in the context of your system.

Tony Delroy
  • On many platforms the bit-field structs and the custom mask&shift struct are 4 bytes in size whereas the non-bitfield struct is 16 bytes. So performance/size is about equal for the bitfield and separate implementation (it's a tradeoff for you to decide on) whereas the performance/size ratio of the custom mask&shift struct is lower than the others but at least it's platform independent. – Tom Apr 22 '17 at 09:00
  • Shouldn't you be compiling with `-march=native`? – einpoklum Oct 04 '18 at 21:52
9

I've seen/used bit fields in two situations: computer games and hardware interfaces. The hardware use is pretty straightforward: the hardware expects data in a certain bit format, which you can define either manually or through pre-defined library structures. It depends on the specific library whether it uses bit fields or just bit manipulation.

In the "old days", computer games used bit fields frequently to squeeze as much as possible out of computer/disk memory. For example, for an NPC definition in an RPG you might find (made-up example):

struct charinfo_t
{
     unsigned int Strength : 7;  // 0-100
     unsigned int Agility : 7;  
     unsigned int Endurance: 7;  
     unsigned int Speed : 7;  
     unsigned int Charisma : 7;  
     unsigned int HitPoints : 10;    //0-1000
     unsigned int MaxHitPoints : 10;  
     //etc...
};

You don't see it so much in more modern games/software, as the space savings have become proportionally less significant as computers gain more memory. Saving 1MB of memory is a big deal when your computer only has 16MB, but not so much when you have 4GB.

uesp
    Computers may have more RAM these days, but keeping memory usage lower can help to keep it within the CPU memory cache, which would increase performance. On the other hand, bitfields require more instructions to access them, which decreases performance. Which is more significant? – Craig McQueen Jan 02 '13 at 06:46
  • @Pharap: Yes, comparing the instructions needed to read/write bitfields vs plain `int` struct members. It is worth considering in code aiming for high execution speed more than memory savings. – Craig McQueen Dec 21 '15 at 03:41
  • As mentioned elsewhere, this use for hardware interfaces is highly non-portable and depends upon the implementation and architecture of the machine being used. It's often easier to just do this stuff manually. – underscore_d Jan 05 '16 at 13:03
  • Modern computer games also use bitfields if they want to maximize performance for some types of operations, mostly filters. For example, if A can interact with B, C and E, but not D, F and G -- then all of those checks can be performed with one bit-wise operation. When A needs to be checked for interactions with E, their interaction bitfields are compared using an & operation. If there is a result, an interaction has occurred. This also has the good side-effect that asymmetric interactions can be done. – Oskar Holmkratz Oct 31 '18 at 07:12
7

The primary purpose of bit-fields is to provide a way to save memory in massively instantiated aggregate data structures by achieving tighter packing of data.

The whole idea is to take advantage of situations where you have several fields in some struct type, which don't need the entire width (and range) of some standard data type. This provides you with the opportunity to pack several of such fields into one allocation unit, thus reducing the overall size of the struct type. An extreme example would be boolean fields, which can be represented by individual bits (with, say, 32 of them being packable into a single unsigned int allocation unit).

Obviously, this only makes sense in situations where the pros of the reduced memory consumption outweigh the cons of slower access to values stored in bit-fields. However, such situations arise quite often, which makes bit-fields an absolutely indispensable language feature. This should answer your question about the modern use of bit-fields: not only are they used, they are essentially mandatory in any practically meaningful code oriented toward processing large amounts of homogeneous data (like large graphs, for one example), because their memory-saving benefits greatly outweigh any individual-access performance penalties.

In a way, bit-fields in their purpose are very similar to such things as "small" arithmetic types: signed/unsigned char, short, float. In the actual data-processing code one would not normally use any types smaller than int or double (with few exceptions). Arithmetic types like signed/unsigned char, short, float exist just to serve as "storage" types: as memory-saving compact members of struct types in situations where their range (or precision) is known to be sufficient. Bit-fields are just another step in the same direction, trading a bit more performance for much greater memory savings.

So, that gives us a rather clear set of conditions under which it is worthwhile to employ bit-fields:

  1. Struct type contains multiple fields that can be packed into a smaller number of bits.
  2. The program instantiates a large number of objects of that struct type.

If the conditions are met, you declare all bit-packable fields contiguously (typically at the end of the struct type), assign them their appropriate bit-widths (and, usually, take some steps to ensure that the bit-widths are appropriate). In most cases it makes sense to play around with ordering of these fields to achieve the best packing and/or performance.


There's also a weird secondary use of bit-fields: using them for mapping bit groups in various externally-specified representations, like hardware registers, floating-point formats, file formats etc. This has never been intended as a proper use of bit-fields, even though for some unexplained reason this kind of bit-field abuse continues to pop up in real-life code. Just don't do this.

AnT stands with Russia
  • In my code I use bit-shift macros to pack small fields like that. This avoids the implementation-defined aspects of bit-fields. (Not sure whether OP was asking specifically about using C's bit-field syntax; or about the concept of packing smaller-than-a-byte fields in general) – M.M Oct 25 '16 at 00:22
3

One use for bit fields used to be mirroring hardware registers when writing embedded code. However, since the bit order is platform-dependent, they don't work if the hardware orders its bits differently from the processor. That said, I can't think of a use for bit fields any more. You're better off implementing a bit-manipulation library that can be ported across platforms.

sizzzzlerz
  • How should one best write such a library? One approach in C++ would be to define a property which would take an address and bit range and yield a modifiable integer lvalue, and then do something like "#define UART3_BAUD_RATE_MODE MOD_BITFIELD(UART3_CONTROL,12,4)". I would expect one could arrange things so that reads and writes would be inlined (rather than generating function calls) but I don't know how best to arrange for the Boolean-logical-assignment operators (|= etc.) to work efficiently and/or atomically. – supercat May 31 '11 at 20:36
2

In the 70s I used bit fields to control hardware on a TRS-80. The display/keyboard/cassette/disks were all memory-mapped devices. Individual bits controlled various things.

  1. A bit controlled 32 column vs 64 column display.
  2. Bit 0 in that same memory cell was the cassette serial data in/out.

As I recall, the disk drive control had a number of them. There were 4 bytes in total. I think there was a 2 bit drive select. But it was a long time ago. It was kind of impressive back then in that there were at least two different c compilers for the platform.

The other observation is that bit fields really are platform specific. There is no expectation that a program with bit fields should port to another platform.

EvilTeach
2

Bit fields were used in the olden days to save program memory.

They degrade performance because registers cannot operate on them directly, so they have to be unpacked into integers to do anything with them. They tend to lead to more complex, unportable code that is harder to understand (since you have to mask and unmask things all the time to actually use the values).

Check out the source for http://www.nethack.org/ to see pre-ANSI C in all its bitfield glory!

nate c
    I don't get this. Why would "you have to mask and unmask things all the time to actually use the values"? Isn't the point that the compiler handles all that for you? – underscore_d Sep 25 '18 at 13:11
1

In modern code, there's really only one reason to use bitfields: to control the space requirements of a bool or an enum type, within a struct/class. For instance (C++):

enum token_code { TK_a, TK_b, TK_c, ... /* less than 255 codes */ };
struct token {
    token_code code      : 8;
    bool number_unsigned : 1;
    bool is_keyword      : 1;
    /* etc */
};

IMO there's basically no reason not to use :1 bitfields for bool, as modern compilers will generate very efficient code for it. In C, though, make sure your bool typedef is either the C99 _Bool or failing that an unsigned int, because a signed 1-bit field can hold only the values 0 and -1 (unless you somehow have a non-twos-complement machine).

With enumeration types, always use a size that corresponds to the size of one of the primitive integer types (8/16/32/64 bits, on normal CPUs) to avoid inefficient code generation (repeated read-modify-write cycles, usually).

Using bitfields to line up a structure with some externally-defined data format (packet headers, memory-mapped I/O registers) is commonly suggested, but I actually consider it a bad practice, because C doesn't give you enough control over endianness, padding, and (for I/O regs) exactly what assembly sequences get emitted. Have a look at Ada's representation clauses sometime if you want to see how much C is missing in this area.

zwol
    Note: bitfields on `bool` and MSVC don't mix, for compatibility with MSVC you need to use some `unsigned`. – Matthieu M. Nov 22 '10 at 07:56
    Yet another reason to avoid MSVC then :-( – zwol Nov 22 '10 at 17:37
    @Matthieu M.: Hmm.... In my experience MSVC never had any problems with using any integer types with bit-fields. – AnT stands with Russia Oct 25 '16 at 00:37
    That `enum` could hold `<=` 256 codes, not `less than 255`. For C++, nowadays we would just use the ability to specify the underlying type and a sized type from `<cstdint>`. Also, not only `bool`s and `enum`s: small integral values of known bounds could also benefit from being able to save space. However, the reason not to use bitfields is if the objects simply are not plentiful enough, or the other members are so aligned as to nullify any purpose of bitfields, that the extra verbosity added is not worthwhile; using bitfields usually isn't recommended as a default, & there's good reason for it – underscore_d Sep 25 '18 at 13:20
  • @underscore_d I wish C would adopt that underlying-type-of-a-bitfield feature. It is valuable in several other contexts, such as controlling your library ABI, for which plain C is still commonly used. – zwol Sep 25 '18 at 13:35
0

Boost.Thread uses bitfields in its shared_mutex, on Windows at least:

    struct state_data
    {
        unsigned shared_count:11,
        shared_waiting:11,
        exclusive:1,
        upgrade:1,
        exclusive_waiting:7,
        exclusive_waiting_blocked:1;
    };
Steve Townsend
    Looks like this was done to pack the structure into exactly one 32-bit machine word, probably for atomicity. Boost folken know *exactly* what they're doing and don't tend to explain themselves in enough detail for people who don't, which unfortunately means copying Boost logic can very easily end in tears -- for instance, there's a reason `exclusive` and `upgrade` aren't `bool`, but do you know what it is? – zwol Nov 22 '10 at 00:30
  • The logic uses compare-and-swap to avoid kernel locks unless essential. I am sure that's why it's done this way. – Steve Townsend Nov 22 '10 at 01:59
    I think @zwol's point was that the 1-bit wide parts of this are not typed as `bool`s because that could (unless `bool` is a `typedef` to `unsigned`, which is unlikely in any barely modern or space-saving code) result in beginning a new allocation unit due to the type boundary, thus defeating the point of trying to pack those bits together with other values. – underscore_d Sep 25 '18 at 13:22
    @underscore_d Yes, that's what I had in mind for the "reason exclusive and upgrade aren't bool". (Gosh, I'd forgotten I wrote that comment.) – zwol Sep 25 '18 at 13:31
-2

An alternative to consider is to specify bit field structures with a dummy structure (never instantiated) where each byte represents a bit:

struct Bf_format
{
  char field1[5];
  char field2[9];
  char field3[18];
};

With this approach, sizeof gives the width of the bit field, and offsetof gives the offset of the bit field. At least in the case of GNU gcc, compiler optimization of bit-wise operations (with constant shifts and masks) seems to have reached rough parity with (base-language) bit fields.

I have written a C++ header file (using this approach) which allows structures of bit fields to be defined and used in a performant, much more portable, much more flexible way: https://github.com/wkaras/C-plus-plus-library-bit-fields . So, unless you are stuck using C, I think there would rarely be a good reason to use the base language facility for bit fields.

WaltK
  • `where each byte represents a bit`--that would create a huge waste in RAM, no?--using 8 bits to represent each 1 bit you actually need. – Gabriel Staples Nov 09 '20 at 23:32
    No, you don't use instances of the structure to store the data, just to define the layout and size of the bit fields. The example Bf_format structure above defines 3 bit fields that fit in 4 8-bit bytes. – WaltK Nov 11 '20 at 00:21