1

I have a small positive integer n, and I use an unsigned integral type to store a mask containing n bits. Often, there is a need to construct a mask with all n bits set. For example, if n is 5, then the mask would be 0b11111u.

The typical way to construct such a mask is to do the following (this example assumes that the mask uses unsigned, but it is possible to write something for any unsigned integral type):

unsigned all_set_mask = (1u << n) - 1u;

However, 1u << n is undefined behaviour if n is exactly equal to the bit width of the unsigned integral type, as seen from [expr.shift#1]:

The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the width of the promoted left operand.

A reasonable interpretation of "construct a mask with all n bits set" arguably should permit the case when we have exactly as many bits as the bit width of the underlying integral type, and so the typical implementation does not support all reasonable inputs.

Furthermore, on modern processors, the assembly left shift instruction is a no-op when shifting by the bit width, and so all_set_mask might end up being 0, which isn't the expected answer in any case.

Is there an standard-compliant way to rewrite it without resorting to an if-statement or complex bit twiddling? I looked into <bit> but did not see anything of use there.

Bernard
  • 5,209
  • 1
  • 34
  • 64
  • Did you try the trivial solution of using `~0`, a.k.a. `-1`, a.k.a. all bits set, then a simple `>>` (well defined for unsigned types)? – Sam Varshavchik Jun 18 '23 at 15:07
  • Are you going to be doing this a lot? If you are, I would consider using a lookup table since it's only 32 or 64 elements and that gives you `unsigned all_set_mask = mask_table[n];` – NathanOliver Jun 18 '23 at 15:10
  • @NathanOliver-IsonStrike is a lookup table faster than just `shl`+`sub` instructions? – Bernard Jun 18 '23 at 15:18
  • @SamVarshavchik That seems to work, I didn't think of that. – Bernard Jun 18 '23 at 15:24
  • It doesn't have UB, which IMHO makes it better – NathanOliver Jun 18 '23 at 15:24
  • @SamVarshavchik That only works for n = 32 or 64, depending on the type. You'd want one expression that works for any n where the result is reasonable, for example 1 <= n <= 64 if you use 64 bit numbers. – gnasher729 Jun 18 '23 at 15:25
  • I tried @SamVarshavchik's solution on Godbolt (https://godbolt.org/z/Trrz1E77z), seems to be one extra instruction but I think essentially the same in terms of latency. – Bernard Jun 18 '23 at 15:26
  • 2
    Just saying: It's not just "undefined behaviour" in the sense of "the C standard says it's UB, but nobody cares". It's "undefined behaviour" in the sense of "it will go seriously wrong on important implementations". I've even seen compilers produce different results for a << b if a and b were known to the compiler or not. So 1ull << 64 and n = 64, 1ull << n can give different results. – gnasher729 Jun 18 '23 at 15:28
  • @Bernard You don't check if `n == 0` in your godbolt example. – Ted Lyngmo Jun 18 '23 at 15:31
  • FWIW here is a comparison including using a lookup table (I used placeholder values for simplicity): https://godbolt.org/z/od57jPrrj – NathanOliver Jun 18 '23 at 15:33
  • @TedLyngmo You mean for Sam's solution? Yeah, I agree that will be UB when `n == 0`. Whether I can use that directly will depend on whether `n` is allowed to be 0 here. – Bernard Jun 18 '23 at 15:36
  • Edited the question, even if the compiler emits the SHL instruction despite UB, the answer will be incorrect. – Bernard Jun 18 '23 at 15:42
  • 1
    duplicate: [Set last `n` bits in unsigned int](https://stackoverflow.com/q/8128674/995714). All the solutions in [Creating a mask with N least significant bits set](https://stackoverflow.com/q/52573447/995714) are also valid in C++ – phuclv Jun 18 '23 at 16:49
  • @NathanOliver-IsonStrike a lookup table can certainly eliminate UB, but it's often a poor choice because it's slower. A modern processor can execute many instructions in the time it takes to do a single memory access. – Mark Ransom Jun 18 '23 at 18:29

4 Answers4

2

A simple way is the following:

n == 0 ? 0 : 0xffffffff >> (32 - n)

We can rewrite this slightly to save one instruction on typical architectures where shift counts are modulo register width:

n == 0 ? 0 : 0xffffffff >> (-n & 31)

and then to get rid of the conditional:

(unsigned)(-(signed)n >> 31) >> (-n & 31)

This should compile to just three instructions on typical CPUs (one of the rare cases where assembly is more readable than C++).

Note that this assumes that signed right-shifts are arithmetic shifts, which is pedantically correct only from C++20 on.

Falk Hüffner
  • 4,942
  • 19
  • 25
1

You can do this unconditionally with any unsigned type:

(((1u << (n + 1) / 2) - 1u) << n / 2) | ((1u << n / 2) - 1u);

The example with 16-bit unsigned:

#include <iostream>

int main() {
  for (int n = 0; n <= 16; ++n) {
    const uint16_t r = (((1u << (n + 1) / 2) - 1u) << n / 2) | ((1u << n / 2) - 1u);
    std::cout << std::hex << r << " ";
  }
}

// Output: 0 1 3 7 f 1f 3f 7f ff 1ff 3ff 7ff fff 1fff 3fff 7fff ffff
273K
  • 29,503
  • 10
  • 41
  • 64
1

I suggest setting all bits and right shifting away the the unwanted bits.

You still need to test for the edge case when n is 0 though, otherwise it'll shift away all bits, with undefined behavior as a result.

template <class T>
    requires std::is_unsigned_v<T>
constexpr T all_set_mask(unsigned n) {
    if(n == 0) return T{};
    
    constexpr unsigned  bits = sizeof(T) * CHAR_BIT;
    // extra test for too many bits if needed:
    //if(n > bits) return static_cast<T>(-1);
    
    return static_cast<T>(-1) >> (bits - n);
}

Demo

Ted Lyngmo
  • 93,841
  • 5
  • 60
  • 108
0

On x86-64, _bextr_u64(UINT64_C(-1), 0, n).

jbapple
  • 3,297
  • 1
  • 24
  • 38
  • 2
    This works, but (on Intel processors) it's more efficient to use `_bzhi_u64(UINT64_C(-1), n)` – harold Jun 18 '23 at 19:38
  • Hmm if I understand the instruction correctly, `_bzhi_u32((unsigned)(-1), n)` would suffice if 0 <= n <= 32? – Bernard Jun 19 '23 at 13:12