How does the compiler implement bit field arithmetics?

Question

When asking a question on how to do wrapped N bit signed subtraction I got the following answer:

template<int bits>
int
sub_wrap( int v, int s )
{
    struct Bits { signed int r: bits; } tmp;
    tmp.r = v - s;
    return tmp.r;
}

That's neat and all, but how will a compiler implement this? From this question I gather that accessing bit fields is more or less the same as doing it by hand, but what about when combined with arithmetic as in this example? Would it be as fast as a good manual bit-twiddling approach?

An answer for "gcc" in the role of "a compiler" would be great if anyone wants to get specific. I've tried reading the generated assembly, but it is currently beyond me.

For what it's worth, this is so much more unreadable than the hand-coded bit twiddling that I'd avoid using it even if it gave a performance boost (unless of course, this single piece bit-twiddling code was so critical that it's the sole bottleneck to performance). — André Caron, Nov 29 '11 at 15:47
I think this actually invokes undefined behaviour if the value of `v-s` cannot be stored in `tmp.r` (which means that the bit-wrapping effect of this is not guaranteed anyway). — celtschk, Nov 29 '11 at 15:52
The only way to know is to read Assembly code generated by compiler. — Alex F, Nov 29 '11 at 16:07
@AndréCaron I don't quite understand the "unreadable" part. This variant states clearly and precisely that we want a result over `bits` bits. The hand-coded twiddling counts on the recognition of certain programming patterns; patterns that most programmers probably aren't that familiar with. As such, it requires significant comment if used in production code (where as this version can go in without comment). — James Kanze, Nov 30 '11 at 14:34
@JamesKanze: This version requires being familiar with this rarely encountered syntax, which is also a programming pattern most programmers aren't that familiar which, and I couldn't tell which is lesser known. However, `(v-s) & ((1< — André Caron, Nov 30 '11 at 16:25
@AndréCaron Both bit fields and bitwise operators are "rare" in most domains. The difference is that the language specification for bit fields says exactly what they do: create a field with the specified number of bits. Where as it takes some experience with bitwise operators to understand what is actually being achieved. "Explicit is better": the bit field version is explicit; the bitwise operators require comments. — James Kanze, Nov 30 '11 at 19:07
@JamesKanze: Even if on actually writing the bit field the value is truncated, I'd fear that the function would be inlined, the intermediate bitfield would then be optimized out, and at the same time also the truncation. And possibly that will happen three versions of the compiler in the future when everyone has forgotten about that bit of code. After all, when gcc implemented strict aliasing, also a lot of code making "obvious" assumptions suddenly broke. — celtschk, Nov 30 '11 at 21:55
@celtschk Inlining and optimization are not allowed to change the semantics of the code, except in a few very limited ways (suppression of side effects in the copy constructor, change in precision of temporary floating point, etc.) If this happens, there is a bug in the compiler. (The strict aliasing issues are different---the standard is far from clear, and at times contradictory with regards to what is or is not guaranteed.) — James Kanze, Dec 01 '11 at 09:55
@JamesKanze: I thought we were talking about undefined behaviour which happens to work on 2's complement machines. Of course the compiler may change undefined bahviour at any time for any reason. — celtschk, Dec 01 '11 at 20:03
@celtschk The undefined behavior occurs when assigning a value which doesn't fit to a signed integral type. In practice, of course, every compiler just masks and shifts, as you'd do by hand, so you get the expected value on a 2's complement machine. After that, the read is perfectly valid. And the other solutions for signed values are extremely complex, and also involve formally undefined behavior at some point. — James Kanze, Dec 02 '11 at 11:09
@JamesKanze: The effect of undefined behaviour is not confined to the operation which causes it. If the compiler figures out a certain value of a variable would cause undefined behaviour, it is allowed to assume that the variable does not have that value, and therefore may well optimize away anything which is only needed for values causing undefined behaviour. — celtschk, Dec 02 '11 at 20:08
@celtschk True, but in practice, compilers do consider signed overflow defined behavior. There's just too much code which depends on it to do otherwise. (Personally, it's a situation I deplore; I'd rather see some sort of runtime error. But it's not going to happen.) — James Kanze, Dec 05 '11 at 09:03

Sebastian Mach · Answer 1 · 2011-11-29T16:12:40.503

The compiler has knowledge about the size and exact position of r in your example. Suppose it is like

[xxxxrrrr]

Then

tmp.r = X;

could e.g. be expanded to (the b-suffix indicating binary literals, & is bitwise and, | is bitwise or)

tmp = (tmp & 11110000b)   // <-- get the remainder which is not tmp.r
    | (X   & 00001111b);  // <-- put X into tmp.r and filter away unwanted bits

Imagine your layout is

[xxrrrrxx]   // 4 bits, 2 left-shifts

the expansion could be

tmp = (tmp    & 11000011b)   // <-- get the remainder which is not tmp.r
    | ((X<<2) & 00111100b);  // <-- filter 4 relevant bits, then shift left 2

How X actually looks like, whether a complex formulation or just a literal, is actually irrelevant.

If your architecture does not support such bitwise operations, there are still multiplications and divisions by power of two to simulate shifting, and probably these can also be used to filter out unwanted bits.

score 2 · Accepted Answer · answered Nov 30 '11 at 10:03

As written in the other question, unsigned wrapping math can be done as:

int tmp = (a - b) & 0xFFF;  /* 12 bit mask.  */

Writing to a (12bit) bitfield will do exactly that, signed or unsigned. The only difference is that you might get a warning message from the compiler.

For reading though, you need to do something a bit different.

For unsigned maths, it's enough to do this:

int result = tmp;  /* whatever bit count, we know tmp contains nothing else.  */

or

int result = tmp & 0xFFF;  /* 12bit, again, if we have other junk in tmp.  */

For signed maths, the extra magic is the sign-extend:

int result = (tmp << (32-12)) >> (32-12); /* asssuming 32bit int, and 12bit value. */

All that does is replicate the top bit of the bitfield (bit 11) across the wider int.

This is exactly what the compiler does for bitfields. Whether you code them by hand or as bitfields is up to you, but just make sure you get the magic numbers right.

(I have not read the standard, but I suspect that relying on bitfields to do the right thing on overflow might not be safe?)

How does the compiler implement bit field arithmetics?

2 Answers2

Linked