
Let's say I have an enum with bitflag options larger than the number of bits in a standard data type:

enum flag_t {
FLAG_1 = 0x1,
FLAG_2 = 0x2,
...
FLAG_130 = 0x400000000000000000000000000000000,
};

This is impossible for several reasons: enums have a maximum size of 128 bits (in C with gcc on my system, from experimentation), single variables are also capped at 128 bits, and so on.

In C you can't perform bitwise operations on arrays, though in C++ I suppose you could overload bitwise operators to do the job with a loop.

Is there any way in C other than manually remembering which flags go where to have this work for large numbers?

J V
    Define `large numbers`. How would you say, check such a `large number` for containing a flag on the 192nd bit? – Shark May 07 '13 at 14:23
    How would you use such flags, that are larger than any supported integer type? – unwind May 07 '13 at 14:23
  • C structs and unions come to mind.. – Mike Dinescu May 07 '13 at 14:24
    This _might_ be an indication that using an enum is a suboptimal choice. I'm imagining some mighty big `switch()` statements. – Dan Pichelman May 07 '13 at 14:26
  • @Shark: Bitwise operations same as most bitflags `if(FLAG_2&var)` @unwind: I suppose the question boils down to "Is it possible to create a larger integer type in C?" – J V May 07 '13 at 14:32
    @JV: yep, you can `union` a couple of ints into a `large number` and i suppose you can use that. However, I also suspect you'll have to write your own `testForFlag()` method to determine which of those internal ints to test for the flag. – Shark May 07 '13 at 14:33
  • @Shark: wouldn't that mean you couldn't store multiple flags from different elements of the union or do you mean a struct? – J V May 07 '13 at 14:34
  • @JV. I suppose you can create even 1337 bits numbers, but in case of C you need to provide functions to do all possible operation you want, like plus(+), minus(-), mul(*), div(/) as well as bitwise operators. So if you are not scared by code like bw_or(add(x,y), 0x0BADF00D) ... you can just do it – maverik May 07 '13 at 14:36
    I would question the design. – K Scott Piel May 07 '13 at 14:44
  • @KScottPiel the most reasonable comment in this bunch. – Shark May 07 '13 at 15:17

3 Answers


This is exactly what bit-fields are for.

In C, it's possible to define the following data layout:

struct flag_t 
{
     unsigned int flag1 : 1;
     unsigned int flag2 : 1;
     unsigned int flag3 : 1;
(...)
     unsigned int flag130 : 1;
(...)
     unsigned int flag1204 : 1;   // for fun
};

In this example, each flag occupies just one bit. An obvious advantage is the unlimited number of flags. Another great advantage is that you are not restricted to single-bit flags: you could have some multi-value fields merged in the middle.

But most importantly, testing and assignment become a bit different, and probably simpler, as far as unit operations are concerned: you no longer need to do any masking, you just access each flag directly by name. And by the way, use the opportunity to give these flags more descriptive names :)
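As a quick illustration (the field names below are hypothetical, not from the struct above), a sketch of setting and testing such bit-fields without any masking:

```c
/* A hypothetical subset of the bit-field struct above, for illustration only */
struct flag_t {
    unsigned int flag1   : 1;
    unsigned int flag2   : 1;
    unsigned int mode    : 3;   /* a multi-value field merged in the middle */
    unsigned int flag130 : 1;
};

/* Each flag is accessed directly by name; the compiler does the masking */
static int demo(void)
{
    struct flag_t f = {0};

    f.flag1 = 1;       /* single-bit flag          */
    f.mode  = 5;       /* 3-bit field, values 0..7 */

    return f.flag1 && f.mode == 5;
}
```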

Cyan
  • This is exactly what I was thinking of! (Wouldn't it be more efficient to use `uchar` instead of `uint`?) – J V May 07 '13 at 18:18
  • Well ignoring packing (which would eliminate this benefit in practice anyway) `sizeof` shows 4 flags with `unsigned char x:1;` takes 1 byte while 4 flags with `unsigned int x:1` takes 4 (As an int is 4 bytes long) **Edit:** and this seems very situational, a short placed after an int bitflag will be "Collapsed" into the int it seems but changing it to an int will produce the 8 bytes expected. – J V May 07 '13 at 18:33
  • Sure, the `char` type offers you the size of your `struct` precisely defined in bytes, but since under most circumstances the compiler will try to align your data in memory, it's unlikely it will be able to use the remaining bytes between 1 and 4. Except, if you explicitly pack this structure into another one. – Cyan May 07 '13 at 20:11

Instead of trying to assign absurdly large numbers to an enum so you can have a hundreds-of-bits-wide bitfield, let the compiler assign a normal zero-based sequence of numbers to your flag names, and simulate a wide bitfield using an array of unsigned char. You can have a 1024-bit bitfield using unsigned char bits[128], and write get_flag() and set_flag() accessor functions to mask the minor amount of extra work involved.
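A minimal sketch of that approach, using the get_flag()/set_flag() names suggested above (everything else here is an assumption):

```c
#define NUM_FLAGS 1024

/* A 1024-bit "bitfield" simulated with an unsigned char array (128 bytes) */
static unsigned char bits[NUM_FLAGS / 8];

/* flag is a plain zero-based index, e.g. assigned by an ordinary enum */
static int get_flag(unsigned flag)
{
    return (bits[flag / 8] >> (flag % 8)) & 1;
}

static void set_flag(unsigned flag, int value)
{
    if (value)
        bits[flag / 8] |= (unsigned char)(1u << (flag % 8));
    else
        bits[flag / 8] &= (unsigned char)~(1u << (flag % 8));
}
```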

However, a far better piece of advice would be to look at your design again, and ask yourself "Why do I need over a hundred different flags?". It seems to me that what you really need is a redesign.

This isn't my real name

In an answer to a related question, Bit Manipulation and Flags, I provided an example of using an unsigned char array as an approach for very large sets of bitflags; I am moving that example to this posting.

This source example provides the following:

  • a set of Preprocessor defines for the bitflag values
  • a set of Preprocessor macros to manipulate bits
  • a couple of functions to implement bitwise operations on the arrays

The general approach for this is as follows:

  • create a set of defines for the flags which specify an array offset and a bit pattern
  • create a typedef for an unsigned char array of the proper size
  • create a set of functions that implement the bitwise logical operations

The Specifics from the Answer with a Few Improvements and More Exposition

Use a set of C Preprocessor defines to create a set of bitflags to be used with the array. These bitflag defines specify an offset within the unsigned char array along with the bit to manipulate.

The defines in this example are 16-bit values in which the upper byte contains the array offset and the lower byte contains the bit flag(s) for the byte of the unsigned char array at that offset. Using this technique you can have arrays of up to 256 elements, or 256 * 8 = 2,048 bitflags, and by going from a 16-bit define to a 32-bit long you could have many more. (In the comments below, bit 0 means the least significant bit of a byte and bit 7 means the most significant bit of a byte.)

#define ITEM_FLG_01  0x0001     // array offset 0, bit 0
#define ITEM_FLG_02  0x0002     // array offset 0, bit 1
#define ITEM_FLG_03  0x0101     // array offset 1, bit 0
#define ITEM_FLG_04  0x0102     // array offset 1, bit 1
#define ITEM_FLG_05  0x0201     // array offset 2, bit 0
#define ITEM_FLG_06  0x0202     // array offset 2, bit 1
#define ITEM_FLG_07  0x0301     // array offset 3, bit 0
#define ITEM_FLG_08  0x0302     // array offset 3, bit 1
#define ITEM_FLG_10  0x0908     // array offset 9, bit 3

Next you have a set of macros to set and unset the bits, along with a typedef to make them a bit easier to use. Unfortunately, a typedef in C does not give you any additional type checking from the compiler, but it does make the code easier to use. These macros do no checking of their arguments, so you might feel safer using regular functions instead.

#define SET_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) |= (b) & 0xff)
#define TOG_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) ^= (b) & 0xff)
#define CLR_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) &= ~((b) & 0xff))
#define TST_BIT(p,b) (*((p) + (((b) >> 8) & 0xff)) & ((b) & 0xff))

typedef unsigned char BitSet[10];

An example of using this basic framework is as follows.

BitSet uchR = { 0 };
int  bValue;

SET_BIT(uchR, ITEM_FLG_01);
bValue = TST_BIT(uchR, ITEM_FLG_01);
SET_BIT(uchR, ITEM_FLG_03);
TOG_BIT(uchR, ITEM_FLG_03);
TOG_BIT(uchR, ITEM_FLG_04);
CLR_BIT(uchR, ITEM_FLG_05);
CLR_BIT(uchR, ITEM_FLG_01);

Next you can introduce a set of utility functions for the bitwise operations we want to support. These operations are analogous to the built-in C operators such as bitwise OR (|) and bitwise AND (&). The functions use the built-in operators to apply the designated operation to every array element.

These particular examples of the utility functions modify one of the two sets of bitflags provided. However, if that is a problem, you can modify the functions to accept three arguments: one for the result of the operation and the other two for the two sets of bitflags to use in the operation.

void AndBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);

    for (; nLen > 0; nLen--) {
        *s1++ &= *s2++;
    }
}

void OrBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);

    for (; nLen > 0; nLen--) {
        *s1++ |= *s2++;
    }
}

void XorBits(BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);

    for (; nLen > 0; nLen--) {
        *s1++ ^= *s2++;
    }
}
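As a sketch, the three-argument variant suggested above might look like this (the OrBits3 name and the details are my assumptions, not part of the original answer):

```c
#include <stddef.h>

typedef unsigned char BitSet[10];

/* result = s1 | s2, leaving both input sets untouched */
void OrBits3(BitSet result, const BitSet s1, const BitSet s2)
{
    size_t nLen = sizeof(BitSet);

    for (; nLen > 0; nLen--) {
        *result++ = *s1++ | *s2++;
    }
}
```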

If you need more than one size of bitflags type with this approach, then the most flexible option is to eliminate the typedef and just use plain unsigned char arrays of various sizes. This change would entail modifying the interface of the utility functions, replacing BitSet with an unsigned char pointer and defining bitflag variables as unsigned char arrays. Along with the unsigned char pointers, you would also need to pass a length for the arrays.
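For illustration, a length-parameterized variant might look like the following (the OrBitsN name is hypothetical):

```c
#include <stddef.h>

/* Same operation as OrBits above, but for unsigned char arrays of any length */
void OrBitsN(unsigned char *s1, const unsigned char *s2, size_t nLen)
{
    for (; nLen > 0; nLen--) {
        *s1++ |= *s2++;
    }
}
```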

You may also consider an approach similar to what is being done for text strings in Is concatenating arbitrary number of strings with nested function calls in C undefined behavior?.

Richard Chambers