14

I am somewhat curious about creating a macro to generate a bit mask for a device register, up to 64 bits, such that BIT_MASK(31) produces 0xffffffff.

However, several C examples do not work as expected: I get 0x7fffffff instead. It is as if the compiler assumes I want signed output rather than unsigned. So I tried 32, and noticed that the value wraps back around to 0. This is because the C standard states that if the shift amount is greater than or equal to the width of the operand being shifted, the result is undefined. That makes sense.

But, given the following program, bits2.c:

#include <stdio.h>
#include <stdlib.h>     /* for atoi() */

#define BIT_MASK(foo) ((unsigned int)(1 << foo) - 1)

int main()
{
    unsigned int foo;
    char *s = "32";

    foo = atoi(s);
    printf("%u %.8x\n", foo, BIT_MASK(foo));

    foo = 32;
    printf("%u %.8x\n", foo, BIT_MASK(foo));

    return (0);
}


If I compile with gcc -O2 bits2.c -o bits2, and run it on a Linux/x86_64 machine, I get the following:

32 00000000
32 ffffffff


If I take the same code and compile it on a Linux/MIPS (big-endian) machine, I get this:

32 00000000
32 00000000


On the x86_64 machine, if I use gcc -O0 bits2.c -o bits2, then I get:

32 00000000
32 00000000


If I tweak BIT_MASK to ((unsigned int)(1UL << foo) - 1), then the output is 32 00000000 for both forms, regardless of gcc's optimization level.

So it appears that either gcc on x86_64 is optimizing something incorrectly, OR the undefined result of left-shifting a 32-bit value by 32 bits is simply resolved differently by the hardware of each platform.




Given all of the above, is it possible to programmatically create a C macro that creates a bit mask from either a single bit or a range of bits?

I.e.:

BIT_MASK(6) = 0x40
BIT_FIELD_MASK(8, 12) = 0x1f00

Assume BIT_MASK and BIT_FIELD_MASK operate from a 0-index (0-31). BIT_FIELD_MASK is to create a mask from a bit range, i.e., 8:12.

Kumba
  • Why are you bit-masking with `<<` and not `&` and `|` – zellio Jan 08 '12 at 01:16
  • Shouldn't your unsigned int be outside? `((unsigned int)((1 << foo) - 1))`? – fge Jan 08 '12 at 01:17
  • @Kerrek: That works if 32-bits, but how do you handle `BIT_MASK(64)`? – Kumba Jan 08 '12 at 01:18
  • @Mimisbrunnr: This is toy code I worked up to test bit masking out. Ultimately, I'm fiddling with a network driver that uses registers aligned to 64-bit boundaries. Given a 64-bit (or 32-bit) register, if you want to access a specific bit in the register, you do `reg & mask`, but I was trying to find a macro that would generate the mask given the bits, instead of writing the mask value out by hand. – Kumba Jan 08 '12 at 01:21
  • @fge: Beats me, I doubt the placement really affects things. I'm more interested in seeing if it's possible to create such a macro AND if I have stumbled across a gcc bug or not. – Kumba Jan 08 '12 at 01:22
  • @Kumba the word "aligned" doesn't apply to registers, only to memory – Seth Carnegie Jan 08 '12 at 01:23
  • You can't reliably do BIT_MASK(64) using your algorithm if the size of `unsigned long long` is 64 bits. – Jonathan Leffler Jan 08 '12 at 01:23
  • @Seth: I am quoting the source documentation. It's 10+ years old, too, so who knows what they were thinking back then when they wrote it. – Kumba Jan 08 '12 at 01:28
  • Maybe `~` will be useful. It flips all the bits. `long long unsigned mask_64 = ~ 0ULL;` – Aaron McDaid Jan 08 '12 at 01:31
  • I can barely keep up with the comments/answers here... – Kumba Jan 08 '12 at 01:35
  • I'm curious: your expression produces 0 for BIT_MASK(0); is that what you need? – Jonathan Leffler Jan 08 '12 at 19:16

8 Answers

13

Here is a version of the macro which will work for arbitrary positive inputs. (Negative inputs still invoke undefined behavior...)

#include <limits.h>
/* A mask with x least-significant bits set, possibly 0 or >=32 */
#define BIT_MASK(x) \
    (((x) >= sizeof(unsigned) * CHAR_BIT) ? \
        (unsigned) -1 : (1U << (x)) - 1)

Of course, this is a somewhat dangerous macro as it evaluates its argument twice. This is a good opportunity to use a static inline if you use GCC or target C99 in general.

static inline unsigned bit_mask(int x)
{
    return (x >= sizeof(unsigned) * CHAR_BIT) ?
        (unsigned) -1 : (1U << x) - 1;
}

As Mysticial noted, shifting a 32-bit integer by 32 or more bits gives undefined behavior; what happens in practice varies by hardware. Here are three different hardware implementations of out-of-range shifts:

  • On x86, only examine the low 5 bits of the shift amount, so x << 32 == x.
  • On PowerPC, only examine the low 6 bits of the shift amount, so x << 32 == 0 but x << 64 == x.
  • On Cell SPUs, examine all bits, so x << y == 0 for all y >= 32.

However, compilers are free to do whatever they want if you shift a 32-bit operand 32 bits or more, and they are even free to behave inconsistently (or make demons fly out your nose).

Implementing BIT_FIELD_MASK:

This will set bit a through bit b (inclusive), as long as 0 <= a <= 31 and 0 <= b <= 31.

#define BIT_MASK(a, b) (((unsigned) -1 >> (31 - (b))) & ~((1U << (a)) - 1))
Dietrich Epp
  • What's the definition of CHAR_BIT? – Kumba Jan 08 '12 at 01:32
  • +1 for info on the other architectures. I didn't know about that behavior for PowerPC. – Mysticial Jan 08 '12 at 01:34
  • The difference between standard PowerPC and Cell SPU is interesting, because aren't the SPUs essentially derived from PPC? – Kumba Jan 08 '12 at 01:38
  • @Kumba: `CHAR_BIT` is in `<limits.h>`. – Dietrich Epp Jan 08 '12 at 01:38
  • @Kumba: The SPUs are wildly different beasts. They do not have scalar registers -- only vector registers and special registers. – Dietrich Epp Jan 08 '12 at 01:39
  • @Dietrich: `BIT_FIELD_MASK` is easily extendible to 64-bits. Though, now I am wondering if it can be made type-insensitive. The driver I am working on is for the Linux kernel. Any issues if I use some form of your code in the driver? Assuming the upstream reviewers don't shred it. I don't think the kernel has a macro doing what I am trying to do, at least not in the expected place, `include/linux/bitops.h`. – Kumba Jan 08 '12 at 01:56
  • @Kumba: I don't consider the code in my answer to by copyrightable, it's too short. There are only so many ways to write such a macro. Go ahead and use it. – Dietrich Epp Jan 08 '12 at 03:13
  • @Dietrich: In this day and age, it is always hard to tell. It's the right thing to do in any event :) Thanks! – Kumba Jan 08 '12 at 04:27
  • Note that for the `BIT_MASK` macro, `b` must be greater or equal to `a` to work correctly. – gbmhunter Feb 12 '16 at 02:18
  • @gbmhunter: Hm, if you specify an invalid range, I can't imagine that you'd expect it to work "correctly" under any circumstances. – Dietrich Epp Feb 12 '16 at 02:46
  • @DietrichEpp, it's just that if you read the "Implementing BIT_FIELD_MASK:" section, nowhere does it say that `a` is the lower bound and `b` is the upper bound, which caught me out the first time I tried using it. – gbmhunter Feb 12 '16 at 04:27
  • @gbmhunter: That's what "a through b" means. It means that a <= b. Like "A to Z", "9 to 5", or "bits 5-8". You wouldn't say "Z to A", and if you wrote a loop you would write it `for (i = 0; i < N; i++)`, not `for (i = N; i < 0; i++)`. It's also literally one line of code with no tricks in it. – Dietrich Epp Feb 12 '16 at 07:06
8

Shifting by an amount greater than or equal to the width of the integer type is undefined behavior.
So no, it's not a GCC bug.

In this case, the literal 1 is of type int, which is 32 bits on both systems that you used. So shifting by 32 invokes this undefined behavior.


In the first case, the compiler cannot resolve the shift amount to 32, so it likely just issues the normal shift instruction (which on x86 uses only the bottom 5 bits of the count). So you get:

(unsigned int)(1 << 0) - 1

which is zero.

In the second case, GCC is able to resolve the shift amount to 32. Since that is undefined behavior, it (apparently) just replaces the result of the shift with 0:

(unsigned int)(0) - 1

so you get ffffffff.


So this is a case of where GCC is using undefined behavior as an opportunity to optimize.
(Though personally, I'd prefer that it emits a warning instead.)

Related: Why does integer overflow on x86 with GCC cause an infinite loop?

Mysticial
  • Should be `more than or equal`, not `more than`. – Alexey Frunze Jan 08 '12 at 01:18
  • @Mystical: What would explain why gcc gets part of it wrong on one architecture, but not another? Same gcc versions, but each machine is running a different libc. Using `atoi` or `strtoul` doesn't affect either. – Kumba Jan 08 '12 at 01:24
  • @Kumba I've edited my answer with more details for the x86/64 case. I'm not familiar with MIPS though. – Mysticial Jan 08 '12 at 01:26
  • @Mystical: Dietrich's answer includes some additional bits about arch-specific behavior, which is interesting. – Kumba Jan 08 '12 at 01:31
  • @Mystical: I really wish that I could split points somehow. Both answers are really good, given that I essentially asked two questions. I gave you an upvote, at least, but I am going to take Dietrich's answer as the winner. Thanks! – Kumba Jan 08 '12 at 04:26
2

Assuming you have a working mask for n bits, e.g.

// set the first n bits to 1, rest to 0
#define BITMASK1(n) ((1ULL << (n)) - 1ULL)

you can make a range bitmask by shifting again:

// set bits [k+1, n] to 1, rest to 0
#define BITMASK(n, k) ((BITMASK1(n) >> (k)) << (k))

The type of the result is unsigned long long int in any case.

As discussed, BITMASK1 invokes undefined behavior unless n is small enough. The general version requires a conditional and evaluates the argument twice:

#include <limits.h>   /* for CHAR_BIT */
#define BITMASK1(n) (((n) < sizeof(1ULL) * CHAR_BIT ? (1ULL << (n)) : 0) - 1ULL)
Kerrek SB
1

What about:

#define BIT_MASK(n) (~(((~0ULL) >> (n)) << (n)))

This works on systems of any endianness; doing -1 to invert all bits doesn't work on a big-endian system.

jamesdlin
jyavenard
  • Endianess shouldn't matter at all. Using `- 1` to invert all bits would not work on a system that doesn't use two's-complement, however. – jamesdlin Apr 27 '19 at 18:27
1

A "traditional" formula `(1ul << n) - 1` has different behavior on different compilers/processors for `n = 8*sizeof(1ul)`. Most commonly it overflows for n = 32. Any added conditionals will evaluate `n` multiple times. Going 64-bit with `(1ull << n) - 1` is an option, but the problem just migrates to n = 64.

My go-to formula is:

#define BIT_MASK(n) (~( ((~0ull) << ((n)-1)) << 1 ))

It does not overflow for n=64 and evaluates n only once.

As a downside, it compiles to 2 shift instructions if `n` is a variable. Also, `n` cannot be 0 (the result would be compiler/processor-specific), but that is a rare case in all uses I have(*), and it can be handled by adding a guarding "if" statement only where necessary (or, even better, an "assert" checking both upper and lower boundaries).


(*) - usually data comes from a file or pipe, and size is in bytes. If size is zero, then there's no data, so code should do nothing anyway.

iva2k
1
#define BIT_MASK(foo) ((~0ULL) >> (64 - (foo)))

I'm a bit paranoid about this: I think it assumes that unsigned long long is exactly 64 bits. But it's a start, and it works up to 64 bits.

Maybe this is correct:

#define BIT_MASK(foo) ((~0ULL) >> (sizeof(0ULL)*8 - (foo)))
Aaron McDaid
0

Since you need to avoid shifting by as many bits as there are in the type (whether that's unsigned long or unsigned long long), you have to be more devious in the masking when dealing with the full width of the type. One way is to sneak up on it:

#define BIT_MASK(n) (((n) == CHAR_BIT * sizeof(unsigned long long)) ? \
                         ((((1ULL << ((n)-1)) - 1) << 1) | 1) : \
                           ((1ULL << (n)) - 1))

For a constant n such as 64, the compiler evaluates the expression and generates only the case that is used. For a runtime variable n, this fails just as badly as before if n is greater than the number of bits in unsigned long long (or is negative), but works OK without overflow for values of n in the range 0..(CHAR_BIT * sizeof(unsigned long long)).

Note that CHAR_BIT is defined in <limits.h>.

Jonathan Leffler
-2

@iva2k's answer avoids branching and is correct when the length is 64 bits. Building on that, you can also do this:

#define BIT_MASK(length) (~(((unsigned long long) -2) << ((length) - 1)))

gcc would generate exactly the same code anyway, though.