7

I started reading "Programming Pearls" today and while doing it's exercise I came across this question "How would you implement your own bit vector?". When I looked at the solution it was like this:

#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000

int a[1 + N/BITSPERWORD]; 

void set(int i) { a[i >> SHIFT] |= (1 << (i & MASK)); 

Where I am getting confused at is this statement

 1 << (i & MASK)

Could someone please explain me what's going on here?

test123
  • 13,865
  • 9
  • 28
  • 33

2 Answers2

4

Note that MASK is set such that it has the lowest SHIFT bits set, where SHIFT is exactly the base-2 logarithm of BITSPERWORD.

Therefore (i & MASK) will select the lowest 5 bits of i, which is the same as taking the remainder after dividing by 32 (just consider how taking the lowest two digits of a decimal number gives you the remainder after dividing by 100, for example). That gives the number of the bit within a word we're interested in.

1 << (i & MASK)) (which is, by the way, an expression, not a statement) now creates a value where exactly the bit we're interested in is set. Merging this value into the memory word with |= will set the desired bit of the bit vector.

hmakholm left over Monica
  • 23,074
  • 3
  • 51
  • 73
  • Thanks for the reply Henning. Would this be valid if I then replace `(i & MASK)` with `(i % 32)`? If it will be valid but not elegant then could you please shed some light on why `i & MASK` is preferred over `i % 32`? Thanks a lot. – test123 Aug 28 '11 at 03:18
  • Yes -- `i & MASK` and `i % 32` are the same thing as long as you're sure `i` is not negative. The bitwise AND is typically more efficient than a division with remainder, and has therefore become the traditional choice. Or at least it _used to_ be more efficient back when compilers where stupid. Today you can expect even a moderately optimizing compiler to rewrite `i % 32` to `i & 31` internally in this context (either it can prove that `i` is not negative, in which case the rewriting is always safe, or it can reason that a negative result would trigger undefined behavior in the shift anyway). – hmakholm left over Monica Aug 28 '11 at 03:32
2

0x20 is 32, so i & 0x1F takes i modulo 32, so that you never shift by 32 bits. This is a safeguard because shifting by anything that isn't strictly less than the size of the type is undefined behaviour.

Kerrek SB
  • 464,522
  • 92
  • 875
  • 1,084