
Here's a little puzzle I couldn't find a good answer for:

Given a struct with bitfields, such as

struct A {
    unsigned foo:13;
    unsigned bar:19;
};

Is there a (portable) way in C++ to get the correct mask for one of the bitfields, preferably as a compile-time constant function or template?

Something like this:

constinit unsigned mask = getmask<A::bar>();  // mask should be 0xFFFFE000

In theory, at runtime, I could crudely do:

unsigned getmask_bar() {
    union AA {
        unsigned mask;
        A fields;
    } aa{};
    aa.fields.bar -= 1;
    return aa.mask;
}

That could even be wrapped in a macro (yuck!) to make it "generic".

But I guess you can readily see the various deficiencies of this method.
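
One of those deficiencies, reading an inactive union member, could be sidestepped by copying the object's bytes out with std::memcpy. What follows is only a sketch under stated assumptions: `probe_mask`, `mask_foo` and `mask_bar` are hypothetical names, `A` is assumed to be exactly as large as one `unsigned`, and the bit positions in the result still depend on how the compiler allocates bit-fields. It also remains a runtime probe, not a compile-time constant.

```cpp
#include <cstring>  // std::memcpy

struct A {
    unsigned foo : 13;
    unsigned bar : 19;
};

// Assumption: A occupies exactly one unsigned, so there are no padding bytes.
static_assert(sizeof(A) == sizeof(unsigned), "A must be exactly one unsigned wide");

// Hypothetical helper: zero-initialize an A, let the caller turn one field
// into all ones, then copy the object representation into an unsigned.
template <typename SetAllOnes>
unsigned probe_mask(SetAllOnes set_all_ones) {
    A a{};                                // every field starts at zero
    set_all_ones(a);                      // caller sets exactly one field to all ones
    unsigned mask = 0;
    std::memcpy(&mask, &a, sizeof mask);  // no inactive union member is read
    return mask;
}

// Storing a value that does not fit an unsigned bit-field is formally
// implementation-defined; common compilers wrap, so 0 - 1 yields all ones.
inline unsigned mask_foo() { return probe_mask([](A& a) { a.foo -= 1; }); }
inline unsigned mask_bar() { return probe_mask([](A& a) { a.bar -= 1; }); }
```

Under those assumptions the two masks should be disjoint and together cover the whole word, but which one ends up in the low bits is still up to the implementation.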

Is there a nicer, generic C++ way of doing it? Or even a not-so-nice way? Is there something useful coming up for the next C++ standard(s)? Reflection?

Edit: Let me add that I am trying to find a way of making bitfield manipulation more flexible, so that it is up to the programmer to modify multiple fields at the same time using masking. I am after terse notation, so that things can be expressed concisely without lots of boilerplate. Think working with hardware registers in I/O drivers as a use case.

sh-
  • C++ doesn't allow type-punning through unions; only the last union member you've written to can be read from. – Some programmer dude May 06 '22 at 16:57
  • Also, IIRC, if two or more members of a bit-field can be put into a single base integer type (like `unsigned int` for your case) the compiler can put them in any order in that integer. In this case there's no guarantee that `foo` will be "first" in the bit-field. You also have to consider endianness issues of your target system. – Some programmer dude May 06 '22 at 17:00
  • @Some programmer dude: I know about all that. Those are amongst the various deficiencies :-). – sh- May 06 '22 at 17:18
  • Related, but off-topic for this question, you might like this bitfield visualization approach: https://stackoverflow.com/a/69227655/4641116 – Eljay May 06 '22 at 17:34

1 Answer


Unfortunately, there is no better way - in fact, there is no way to extract individual adjacent bit fields from a struct by inspecting its memory directly in C++.

From Cppreference:

The following properties of bit-fields are implementation-defined:

  • The value that results from assigning or initializing a signed bit-field with a value out of range, or from incrementing a signed bit-field past its range.

  • Everything about the actual allocation details of bit-fields within the class object

    • For example, on some platforms, bit-fields don't straddle bytes, on others they do
    • Also, on some platforms, bit-fields are packed left-to-right, on others right-to-left

Your compiler might give you stronger guarantees; however, if you do rely on the behavior of a specific compiler, you can't expect your code to work with a different compiler/architecture pair. GCC doesn't even document their bit field packing, as far as I can tell, and it differs from one architecture to the next. So your code might work on a specific version of GCC on x86-64 but break on literally everything else, including other versions of the same compiler.
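
If compile-time masks matter more than the bit-field syntax itself, one way around the implementation-defined allocation might be to spell out the layout by hand on a plain integer. This is a sketch, not something the question or the standard prescribes: `Field`, `Foo`, `Bar` and `update_both` are hypothetical names, and the bit positions (foo in bits 0..12, bar in bits 13..31) are an assumption chosen to match the mask given in the question.

```cpp
#include <cstdint>

// Hypothetical helper: describes Width bits starting at bit Lsb of a 32-bit
// word. The programmer picks the layout, so the mask is a portable constant.
template <unsigned Lsb, unsigned Width>
struct Field {
    static_assert(Width >= 1 && Lsb + Width <= 32, "field must fit in 32 bits");

    // Width ones, shifted up to the field position.
    static constexpr std::uint32_t mask =
        (~std::uint32_t{0} >> (32u - Width)) << Lsb;

    // Read the field out of a register word.
    static constexpr std::uint32_t get(std::uint32_t word) {
        return (word & mask) >> Lsb;
    }

    // Return a copy of word with the field replaced by value (truncated to fit).
    static constexpr std::uint32_t set(std::uint32_t word, std::uint32_t value) {
        return (word & ~mask) | ((value << Lsb) & mask);
    }
};

using Foo = Field<0, 13>;   // assumed position of foo
using Bar = Field<13, 19>;  // assumed position of bar

static_assert(Foo::mask == 0x00001FFFu, "foo covers the low 13 bits");
static_assert(Bar::mask == 0xFFFFE000u, "bar covers the high 19 bits");

// Update both fields in one expression, without touching memory in between.
constexpr std::uint32_t update_both(std::uint32_t word,
                                    std::uint32_t foo, std::uint32_t bar) {
    return Bar::set(Foo::set(word, foo), bar);
}
```

The trade-off is that the struct-with-bit-fields notation is given up entirely; in exchange, masks such as `Foo::mask | Bar::mask` can be combined freely at compile time.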

If you really want to be able to extract bitfields from a random structure in a generic way, your best bet is to pass a function pointer around (instead of a mask); that way, the function can access the field in a safe way and return the value to its caller (or set a value instead).

Something like this:

template<typename T>
auto extractThatBitField(const void *ptr) {
  return static_cast<const T *>(ptr)->m_thatBitField;
}

auto *extractor1 = &extractThatBitField<Type1>;
auto *extractor2 = &extractThatBitField<Type2>;
/* ... */

Now, if you have a pair of {pointer, extractor}, you can get the value of the bitfield safely. (Of course, the extractor function has to match the type of the object behind that pointer.) It's not much overhead compared to having a {pointer, mask} pair instead; the function pointer is maybe 4 bytes larger than the mask on a 64-bit machine (if at all). The extractor function itself will just be a memory load, some bit twiddling, and a return instruction. It'll still be super fast.

This is portable and supported by the C++ standard, unlike inspecting the bits of a bitfield directly.
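
To make the {pointer, extractor} pair concrete, here is a minimal sketch. `Type1` and `Type2` are filled in with made-up members purely for illustration, and `FieldRef` and `makeFieldRef` are hypothetical names that do not appear above.

```cpp
// Two unrelated types that both happen to have a bit-field m_thatBitField
// (member lists are invented for this example).
struct Type1 {
    unsigned m_other : 5;
    unsigned m_thatBitField : 11;
};

struct Type2 {
    unsigned m_thatBitField : 11;
    double m_payload;
};

// Same idea as the extractor above, with a fixed return type so that every
// instantiation has the same function pointer type.
template <typename T>
unsigned extractThatBitField(const void* ptr) {
    return static_cast<const T*>(ptr)->m_thatBitField;
}

// Hypothetical pair of object pointer and matching extractor.
struct FieldRef {
    const void* object;
    unsigned (*extract)(const void*);

    unsigned get() const { return extract(object); }
};

// Building the pair from a typed reference keeps pointer and extractor in sync.
template <typename T>
FieldRef makeFieldRef(const T& obj) {
    return FieldRef{&obj, &extractThatBitField<T>};
}
```

Constructing the pair through `makeFieldRef` is one way to make sure the extractor always matches the type of the object behind the pointer; the object must outlive the `FieldRef` that refers to it.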

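Alternatively, standard-layout structs that share a common initial sequence (the same member types and bit-field widths, in the same order) lay out those initial members identically, so the shared prefix can be copied into a struct that contains only the common members. (Though keep in mind that this falls apart as soon as inheritance or private/protected members get involved! The first solution, above, works for all those cases as well.)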

#include <cstring> //for std::memcpy

struct Common {
  int m_a : 13;
  int m_b : 19;
  int : 0; //Needed to ensure the bit fields end on a byte boundary
};

struct Type1 {
  int m_a : 13;
  int m_b : 19;
  int : 0;
  
  Whatever m_whatever;
};

struct Type2 {
  int m_a : 13;
  int m_b : 19;
  int : 0;
  
  Something m_something;
};

int getFieldA(const void *ptr) {
  //We still can't do type punning directly due
  //to weirdness in various compilers' aliasing resolution.
  //std::memcpy is the official way to do type punning.
  //This won't compile to an actual memcpy call.
  Common tmp;
  std::memcpy(&tmp, ptr, sizeof(Common));
  return tmp.m_a;
}

See also: Can memcpy be used for type punning?

Jonathan S.
  • Thanks for this suggestion! I don't intend to use signed bitfields for the reason you quoted, and the left-to-right vs. right-to-left issue doesn't really matter all that much when the resulting mask matches the allocation strategy of the compiler. – sh- May 06 '22 at 17:37
  • I've added another strategy that might fit your needs and isn't undefined behavior. – Jonathan S. May 06 '22 at 18:33
  • Interesting! How do I know that an actual memcpy call won't result? Is there any ruling in the standard, or is this just the optimizer doing its usual job? – sh- May 06 '22 at 19:46
  • That's just the optimizer doing its job. Both GCC and Clang recognize this pattern and don't emit a call to memcpy even without optimizations turned on. MSVC omits the call to memcpy at /O2 and higher. – Jonathan S. May 06 '22 at 19:57
  • So when we are prepared to rely on typical behavior of compilers, wouldn't it be similarly reasonable to expect type punning through a union to work dependably, as long as the union encompasses just a single word of the underlying architecture? IOW avoiding the trouble that comes from crossing a word boundary could bring us into the realm of straightforward behavior? – sh- May 06 '22 at 20:41
  • No, unfortunately not. The compiler is free to assume that pointers to incompatible types never alias. This also means that the compiler assumes that pointers/references to different, overlapping members of a union *don't alias*, which in practice means you can't access inactive members of a union (unless they share common initial sequences). The compiler will therefore see that the store to that one member of the union is dead, since the member is never read from, and optimize away the store entirely - unless it specifically recognizes broken code and fixes it for you. – Jonathan S. May 06 '22 at 20:52