21

What are pros/cons of usage bitsets over enum flags?

namespace Flag {
    enum State {
        Read   = 1 << 0,
        Write  = 1 << 1,
        Binary = 1 << 2,
    };
}

namespace Plain {
    enum State {
        Read,
        Write,
        Binary,
        Count
    };
}

int main()
{
    {
        unsigned int state = Flag::Read | Flag::Binary;
        std::cout << state << std::endl;

        state |= Flag::Write;
        state &= ~(Flag::Read | Flag::Binary);
        std::cout << state << std::endl;
    } {
        std::bitset<Plain::Count> state;
        state.set(Plain::Read);
        state.set(Plain::Binary);
        std::cout << state.to_ulong() << std::endl;

        state.flip();
        std::cout << state.to_ulong() << std::endl;
    }

    return 0;
}

As I can see so far, bitsets have more convinient set/clear/flip functions to deal with, but enum-flags usage is a more wide-spreaded approach.

What are possible downsides of bitsets and what and when should I use in my daily code?

Nikolai Shalakin
  • 1,349
  • 2
  • 13
  • 25
  • Since the flags are pre-calculated, they have an obvious advantages in your test. – StoryTeller - Unslander Monica Aug 07 '17 at 07:56
  • 2
    I would say that it all depends. It depends on use-cases, on personal preferences, on project requirements, on code style guides used, and more. If it's for your own projects, then go with whatever you feel best. My recommendation though is that you consider things like readability and maintainability and correctness first before performance. "Good enough" often *is* good enough. – Some programmer dude Aug 07 '17 at 07:56
  • 1
    would bitset work with constexpr? you may get same timing there. but in general bitset is slower because of its platform-agnostic nature. – Swift - Friday Pie Aug 07 '17 at 07:58
  • 2
    `bitsets are significally slower (~24 times on my machine) than bare bit operations` I have another results where [bitsets](https://github.com/Gluttton/PslRK/blob/master/Solutions/07/main.cpp) are nearly fast as [asm](https://github.com/Gluttton/PslRK/blob/master/Solutions/03/main.c) code. – Gluttton Aug 07 '17 at 08:21
  • First: the two examples are *not* equivalent! You would have to set read and binary flags after flip explicitly to truely get equivalence. So actually, the bitset variant produces longer code (by four lines)... Sure, not always shorter code is better to read. For me, as I am quite used to the bare bit operations, it is as easy to read as the bitset variant, and with that, I'd prefer the former one, but that is a very *personal* matter... – Aconcagua Aug 07 '17 at 08:43
  • This matter can change much, though, if you have to follow coding guide lines such as MISRA - many of them impose additional casts on you during these bit operations reducing readability again - sometimes, they can be avoided by using unsigned int as base type for the enum (`enum class E : unsigned int`). – Aconcagua Aug 07 '17 at 08:44
  • I very much like the bitset but as other commenters say there is a performance penalty to pay if you have to do conversions from position to bit pattern and vice versa. – gast128 Aug 07 '17 at 08:57
  • Post your benchmark code and its build script. – n. m. could be an AI Aug 07 '17 at 09:54
  • Some compilers will require you to define explicit operators for enums. – Michaël Roy Aug 07 '17 at 10:23
  • @TobySpeight this question was not specifically about performance. It was more about caveats of bitsets/enum-flags usage. Performance was mentioned just for mention. – Nikolai Shalakin Aug 07 '17 at 12:33
  • @Nikolai - it appeared to be about performance; your edit has made your intent clearer, and I retracted my close vote. – Toby Speight Aug 07 '17 at 13:35

4 Answers4

13

Both std::bitset and c-style enum have important downsides for managing flags. First, let's consider the following example code :

namespace Flag {
    enum State {
        Read   = 1 << 0,
        Write  = 1 << 1,
        Binary = 1 << 2,
    };
}

namespace Plain {
    enum State {
        Read,
        Write,
        Binary,
        Count
    };
}

void f(int);
void g(int);
void g(Flag::State);
void h(std::bitset<sizeof(Flag::State)>);

namespace system1 {
    Flag::State getFlags();
}
namespace system2 {
    Plain::State getFlags();
}

int main()
{
    f(Flag::Read);  // Flag::Read is implicitly converted to `int`, losing type safety
    f(Plain::Read); // Plain::Read is also implicitly converted to `int`

    auto state = Flag::Read | Flag::Write; // type is not `Flag::State` as one could expect, it is `int` instead
    g(state); // This function calls the `int` overload rather than the `Flag::State` overload

    auto system1State = system1::getFlags();
    auto system2State = system2::getFlags();
    if (system1State == system2State) {} // Compiles properly, but semantics are broken, `Flag::State`

    std::bitset<sizeof(Flag::State)> flagSet; // Notice that the type of bitset only indicates the amount of bits, there's no type safety here either
    std::bitset<sizeof(Plain::State)> plainSet;
    // f(flagSet); bitset doesn't implicitly convert to `int`, so this wouldn't compile which is slightly better than c-style `enum`

    flagSet.set(Flag::Read);    // No type safety, which means that bitset
    flagSet.reset(Plain::Read); // is willing to accept values from any enumeration

    h(flagSet);  // Both kinds of sets can be
    h(plainSet); // passed to the same function
}

Even though you may think those problems are easy to spot on simple examples, they end up creeping up in every code base that builds flags on top of c-style enum and std::bitset.

So what can you do for better type safety? First, C++11's scoped enumeration is an improvement for type safety. But it hinders convenience a lot. Part of the solution is to use template-generated bitwise operators for scoped enums. Here is a great blog post which explains how it works and also provides working code : https://www.justsoftwaresolutions.co.uk/cplusplus/using-enum-classes-as-bitfields.html

Now let's see what this would look like :

enum class FlagState {
    Read   = 1 << 0,
    Write  = 1 << 1,
    Binary = 1 << 2,
};
template<>
struct enable_bitmask_operators<FlagState>{
    static const bool enable=true;
};

enum class PlainState {
    Read,
    Write,
    Binary,
    Count
};

void f(int);
void g(int);
void g(FlagState);
FlagState h();

namespace system1 {
    FlagState getFlags();
}
namespace system2 {
    PlainState getFlags();
}

int main()
{
    f(FlagState::Read);  // Compile error, FlagState is not an `int`
    f(PlainState::Read); // Compile error, PlainState is not an `int`

    auto state = FlagState::Read | FlagState::Write; // type is `FlagState` as one could expect
    g(state); // This function calls the `FlagState` overload

    auto system1State = system1::getFlags();
    auto system2State = system2::getFlags();
    if (system1State == system2State) {} // Compile error, there is no `operator==(FlagState, PlainState)`

    auto someFlag = h();
    if (someFlag == FlagState::Read) {} // This compiles fine, but this is another type of recurring bug
}

The last line of this example shows one problem that still cannot be caught at compile time. In some cases, comparing for equality may be what's really desired. But most of the time, what is really meant is if ((someFlag & FlagState::Read) == FlagState::Read).

In order to solve this problem, we must differentiate the type of an enumerator from the type of a bitmask. Here's an article which details an improvement on the partial solution I referred to earlier : https://dalzhim.github.io/2017/08/11/Improving-the-enum-class-bitmask/ Disclaimer : I'm the author of this later article.

When using the template-generated bitwise operators from the last article, you will get all of the benefits we demonstrated in the last piece of code, while also catching the mask == enumerator bug.

Ad N
  • 7,930
  • 6
  • 36
  • 80
Dalzhim
  • 1,970
  • 1
  • 17
  • 34
3

Some observations:

  • std::bitset< N > supports an arbitrary number of bits (e.g., more than 64 bits), whereas underlying integral types of enums are restricted to 64 bits;
  • std::bitset< N > can implicitly (depending on the std implementation) use the underlying integral type with the minimal size fitting the requested number of bits, whereas underlying integral types for enums need to be explicitly declared (otherwise, int will be used as the default underlying integral type);
  • std::bitset< N > represents a generic sequence of N bits, whereas scoped enums provide type safety that can be exploited for method overloading;
  • If std::bitset< N > is used as a bit mask, a typical implementation depends on an additional enum type for indexing (!= masking) purposes;

Note that the latter two observations can be combined to define a strong std::bitset type for convenience:

typename< Enum E, std::size_t N >
class BitSet : public std::bitset< N >
{
    ...

    [[nodiscard]]
    constexpr bool operator[](E pos) const;

    ...
};

and if the code supports some reflection to obtain the number of explicit enum values, then the number of bits can be deduced directly from the enum type.

  • scoped enum types do not have bitwise operator overloads (which can easily be defined once using SFINAE or concepts for all scoped and unscoped enum types, but need to be included before use) and unsoped enum types will decay to the underlying integral type;
  • bitwise operator overloads for enum types, require less boilerplate than std::bitset< N > (e.g., auto flags = Depth | Stencil;);
  • enum types support both signed and unsigned underlying integral types, whereas std::bitset< N > internally uses unsigned integral types (shift operators).

FWIIW, in my own code I mostly use std::bitset (and eastl::bitvector) as private bit/bool containers for setting/getting single bits/bools. For masking operations, I prefer scoped enum types with explicitly defined underlying types and bitwise operator overloads.

Matthias
  • 4,481
  • 12
  • 45
  • 84
2

Do you compile with optimization on? It is very unlikely that there is a 24x speed factor.

To me, bitset is superior, because it manages space for you:

  • can be extended as much as wanted. If you have a lot of flags, you may run out of space in the int/long long version.
  • may take less space, if you only use just several flags (it can fit in an unsigned char/unsigned short - I'm not sure that implementations apply this optimization, though)
geza
  • 28,403
  • 6
  • 61
  • 135
1

(Ad mode on) You can get both: a convenient interface and max performance. And type-safety as well. https://github.com/oliora/bitmask

oliora
  • 839
  • 5
  • 12
  • A different option here: https://codereview.stackexchange.com/questions/183246/type-safe-flag-sets-bit-fields-that-make-sense – Cris Luengo Jan 05 '18 at 22:13