16

How does this function work?

A mask with the least significant n bits set to 1.

Example:

n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F

Also, what is a mask?

chqrlie
sebi

6 Answers

32

The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
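As a minimal sketch of the shift-then-subtract idea described above (the helper name `low_bits_mask` and the 32-bit width are assumptions for illustration, not part of the original answer):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with the n least significant bits set.
   Only valid for n < 32, so the shift stays inside the 32-bit type. */
uint32_t low_bits_mask(unsigned n)
{
    assert(n < 32);
    uint32_t one_hot = (uint32_t)1 << n;  /* n = 5 gives 00100000 */
    return one_hot - 1;                   /* n = 5 gives 00011111 */
}
```

With this sketch, `low_bits_mask(6)` is 0x3F and `low_bits_mask(17)` is 0x1FFFF, which matches the corrected values for the question's two examples.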

A mask is normally used with bitwise operations, especially AND. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware, where a single register often contains bits representing several entirely separate, unrelated quantities and/or flags.
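To make the register case concrete, here is a rough sketch of pulling one field out of a packed value. The function name `extract_field` and the idea of describing a field by a start bit and a width are assumptions chosen for illustration; real hardware defines its own layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the `width`-bit field that starts at
   bit `shift` of a 32-bit register value. */
uint32_t extract_field(uint32_t reg, unsigned shift, unsigned width)
{
    assert(width >= 1 && width < 32 && shift < 32);
    uint32_t mask = ((uint32_t)1 << width) - 1;  /* `width` low bits set */
    return (reg >> shift) & mask;                /* move field down, isolate it */
}
```

For example, `extract_field(0xABCD, 0, 5)` keeps only the 5 least significant bits and yields 0x0D, while `extract_field(0xABCD, 8, 4)` yields 0xB.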

Jerry Coffin
11

A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.

For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.

Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.

Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
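The three uses above can be sketched as small functions (the function names are made up for this illustration; the masks 0xFF and 0x101 are the ones from the text):

```c
#include <stdint.h>

/* Keep only the 8 least significant bits. */
uint32_t low_byte(uint32_t v)            { return v & 0xFFu; }

/* Force bits 0 and 8 to 1, leave every other bit alone. */
uint32_t set_bits_0_and_8(uint32_t v)    { return v | 0x101u; }

/* Invert bits 0 and 8, leave every other bit alone. */
uint32_t toggle_bits_0_and_8(uint32_t v) { return v ^ 0x101u; }
```

Note that toggling twice with the same mask restores the original value, which is a handy sanity check for XOR masks.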

To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.

So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.

Positive powers of 2 are easily generated with the left shift << operator in C.

Hence, 1 << n yields 2^n. In binary it's 10...0 with n 0s.

(1 << n) - 1 will produce a mask with n lowest bits set to 1.

Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
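One possible way to stay clear of the undefined shift is to treat a full-width request as a special case. The sketch below assumes a 32-bit mask type; the name `mask_low_n` and the choice to clamp any n of 32 or more to all ones are assumptions made here, not something the answer prescribes.

```c
#include <stdint.h>

/* Hypothetical helper: mask with the n lowest bits set, defined for any n.
   Requests for 32 or more bits return all ones instead of shifting by the
   full width, which would be undefined behavior. */
uint32_t mask_low_n(unsigned n)
{
    if (n >= 32)
        return UINT32_MAX;
    return ((uint32_t)1 << n) - 1;  /* unsigned arithmetic, shift count < 32 */
}
```

Whether clamping is the right policy depends on the caller; an alternative would be to reject out-of-range values instead.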

Alexey Frunze
10

For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.

Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.

This method is correct, whereas the current top answers produce undefined behavior in edge cases.

Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions: -mbmi -mbmi2

#include <inttypes.h>
#include <stdio.h>

uint64_t gen_mask(const uint_fast8_t msb) {
  const uint64_t src = (uint64_t)1  << msb;
  return (src - 1) ^ src;
}

int main() {
  uint_fast8_t msb;
  for (msb = 0; msb < 64; ++msb) {
    printf("%016" PRIx64 "\n", gen_mask(msb));
  }
  return 0;
}
Mike M.
  • 1
    Sorry, that's a misunderstanding: I would have used the width as parameter (like the N that the OP mentioned), but since you use the index of the MSB it's actually consistent. – Ulrich Eckhardt Jun 29 '15 at 19:59
  • What does the consting do in this case? – Arran Cudbard-Bell Aug 27 '15 at 03:40
  • 2
    I think what @UlrichEckhardt meant was that if msb defines the inclusive most-significant bit (which is the typical usage), then your mask is too short by 1 bit. By the inclusive definition, an msb of 1 should select bits 1 and 0, thus a mask of 0x3, but your code produces 0x1. `src = 1LL << (msb + 1)` is better. Or -- change your variable name from "msb" to "num_of_bits" and then you are correct. – Jonathan Mayer Jan 27 '18 at 01:52
  • 1
    Because the input has been shifted by one this solution works for all bits set (`msb == 63`), but you can no longer ask for a mask with no bits set, since `msb == 0` gives you the bottom bit set. – BeeOnRope Sep 27 '18 at 22:20
  • What are some examples of edge cases? – Peter Mortensen Aug 14 '23 at 02:55
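Since the comments go back and forth on whether the parameter is a bit index or a bit count, here is a sketch of both conventions side by side. The names `mask_through_bit` and `mask_of_width` are hypothetical; the first uses the same expression as gen_mask() in the answer, the second is an assumed width-based wrapper around it.

```c
#include <assert.h>
#include <stdint.h>

/* Index convention: set bits 0 through msb_index inclusive.
   msb_index = 0 gives 0x1; msb_index = 63 gives all ones. */
uint64_t mask_through_bit(unsigned msb_index)
{
    assert(msb_index < 64);
    uint64_t src = (uint64_t)1 << msb_index;
    return (src - 1) ^ src;
}

/* Width convention (assumed wrapper): set the `width` lowest bits.
   width = 0 gives 0; width = 64 gives all ones, with no shift by 64. */
uint64_t mask_of_width(unsigned width)
{
    assert(width <= 64);
    return width == 0 ? 0 : mask_through_bit(width - 1);
}
```

The two edge cases are the empty mask, which the index convention cannot express, and the all-ones mask, which a plain (1 << n) - 1 cannot produce without shifting by the full width.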
2

First, for those who only want the code to create the mask:

uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (or 0x3F)

Thanks to @Benni who asked about using bits = 64. If you need the code to support this value as well, you can use:

uint64_t bits = 6;
uint64_t mask = (bits < 64)
  ? ((uint64_t)1 << bits) - 1
  : (uint64_t)0 - 1;

For those who want to know what a mask is:

A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.

Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.

Longer masks are usually represented in hexadecimal, which is easy to read once you get the hang of it.

You can read more about bitwise operations in C here.

Daniel Trugman
0

I believe your first example should be 0x3f.

0x3f is hexadecimal notation for the number 63, which is 111111 in binary, so the last 6 bits (the least significant 6 bits) are set to 1.

The following little C program will calculate the correct mask:

#include <stdarg.h>
#include <stdio.h>

int mask_for_n_bits(int n)
{
    int mask = 0;

    for (int i = 0; i < n; ++i)
        mask |= 1 << i;

    return mask;
}

int main (int argc, char const *argv[])
{
    printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
    return 0;
}
Torsten
  • The return type for `mask_for_n_bits` should be `unsigned`, as should the type of `mask`, and the update expression should use `mask |= 1U << i;` – chqrlie May 15 '22 at 22:20
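A sketch of the loop with the comment's suggestion applied might look like this. The name `mask_for_n_bits_unsigned` and the range check are additions for illustration, not part of the answer or the comment.

```c
#include <assert.h>
#include <limits.h>

/* Loop-based variant using unsigned arithmetic throughout, so setting the
   top bit does not overflow a signed int. */
unsigned mask_for_n_bits_unsigned(unsigned n)
{
    assert(n <= sizeof(unsigned) * CHAR_BIT);  /* keep every shift count in range */
    unsigned mask = 0;

    for (unsigned i = 0; i < n; ++i)
        mask |= 1U << i;                       /* set bit i */

    return mask;
}
```

Because the loop only ever shifts by i < n, asking for every bit of an unsigned is fine here, unlike the single-shift form.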
0

0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.

Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.

A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.

For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.

In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
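The per-bit rules above can be checked with one small experiment. In this sketch the struct `mask_results` and the function `apply_mask` are hypothetical names; the value 0x0C (1100) and the mask 0x0A (1010) are chosen because together they cover all four input-bit/mask-bit combinations.

```c
#include <stdint.h>

/* Hypothetical container for the three results. */
typedef struct {
    uint8_t and_result;
    uint8_t or_result;
    uint8_t xor_result;
} mask_results;

/* Apply the same mask with each of the three operators. */
mask_results apply_mask(uint8_t value, uint8_t mask)
{
    mask_results r;
    r.and_result = (uint8_t)(value & mask);  /* mask 0 clears, mask 1 keeps  */
    r.or_result  = (uint8_t)(value | mask);  /* mask 0 keeps,  mask 1 sets   */
    r.xor_result = (uint8_t)(value ^ mask);  /* mask 0 keeps,  mask 1 flips  */
    return r;
}
```

With value 1100 and mask 1010, AND gives 1000, OR gives 1110, and XOR gives 0110.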

jweyrich
caf