
I'm making a chess engine, well, attempting to make a chess engine. A few of the confusing things that keep popping up are these seemingly arbitrary arrays of numbers from 0 to 63, the so-called "De Bruijn sequence", and a function that no one on the internet seems to fully understand. A person tried to figure it out here. The code goes like this:

typedef unsigned long long uint64; /* assumed: the engine's 64-bit type, not shown in the original snippet */

const int BitTable[64] = {
  63, 30, 3, 32, 25, 41, 22, 33, 15, 50, 42, 13, 11, 53, 19, 34, 61, 29, 2,
  51, 21, 43, 45, 10, 18, 47, 1, 54, 9, 57, 0, 35, 62, 31, 40, 4, 49, 5, 52,
  26, 60, 6, 23, 44, 46, 27, 56, 16, 7, 39, 48, 24, 59, 14, 12, 55, 38, 28,
  58, 20, 37, 17, 36, 8
};

int pop_1st_bit(uint64 *bb) {
  uint64 b = *bb ^ (*bb - 1);
  unsigned int fold = (unsigned) ((b & 0xffffffff) ^ (b >> 32));
  *bb &= (*bb - 1);
  return BitTable[(fold * 0x783a9b23) >> 26];
}

uint64 index_to_uint64(int index, int bits, uint64 m) {
  int i, j;
  uint64 result = 0ULL;
  for(i = 0; i < bits; i++) {
    j = pop_1st_bit(&m);
    if(index & (1 << i)) result |= (1ULL << j);
  }
  return result;
}

The post I specified above did a great deal of explaining what the code is doing (finding the least significant bit and turning it into a 0..63 array index and vice versa), but how it does it is still a mystery. The origins of the mysterious BitTable[64] array and the 0x783a9b23 magic number are up in the air.

To add further confusion, that is not the only mysterious array and magic number that pops up in chess programming. Here are a few more, taken from the Chess Programming Wiki:

const int index64[64] = {
    0,  1, 48,  2, 57, 49, 28,  3,
   61, 58, 50, 42, 38, 29, 17,  4,
   62, 55, 59, 36, 53, 51, 43, 22,
   45, 39, 33, 30, 24, 18, 12,  5,
   63, 47, 56, 27, 60, 41, 37, 16,
   54, 35, 52, 21, 44, 32, 23, 11,
   46, 26, 40, 15, 34, 20, 31, 10,
   25, 14, 19,  9, 13,  8,  7,  6
};

/**
 * bitScanForward
 * @author Martin Läuter (1997)
 *         Charles E. Leiserson
 *         Harald Prokop
 *         Keith H. Randall
 * "Using de Bruijn Sequences to Index a 1 in a Computer Word"
 * @param bb bitboard to scan
 * @precondition bb != 0
 * @return index (0..63) of least significant one bit
 */
int bitScanForward(U64 bb) {
   const U64 debruijn64 = C64(0x03f79d71b4cb0a89);
   assert (bb != 0);
   return index64[((bb & -bb) * debruijn64) >> 58];
}

That 25-year-old code uses a whole different array of seemingly random numbers and the "magic number" 0x03f79d71b4cb0a89, and shifts the bit expression bb & -bb by 58 of all numbers (why!?). Here's another one:

const int index64[64] = {
    0, 47,  1, 56, 48, 27,  2, 60,
   57, 49, 41, 37, 28, 16,  3, 61,
   54, 58, 35, 52, 50, 42, 21, 44,
   38, 32, 29, 23, 17, 11,  4, 62,
   46, 55, 26, 59, 40, 36, 15, 53,
   34, 51, 20, 43, 31, 22, 10, 45,
   25, 39, 14, 33, 19, 30,  9, 24,
   13, 18,  8, 12,  7,  6,  5, 63
};

/**
 * bitScanForward
 * @author Kim Walisch (2012)
 * @param bb bitboard to scan
 * @precondition bb != 0
 * @return index (0..63) of least significant one bit
 */
int bitScanForward(U64 bb) {
   const U64 debruijn64 = C64(0x03f79d71b4cb0a89);
   assert (bb != 0);
   return index64[((bb ^ (bb-1)) * debruijn64) >> 58];
}

The same confusing "magic number", but a different array and a different bit expression, bb ^ (bb-1). And of course, another one:

const int lsb_64_table[64] =
{
   63, 30,  3, 32, 59, 14, 11, 33,
   60, 24, 50,  9, 55, 19, 21, 34,
   61, 29,  2, 53, 51, 23, 41, 18,
   56, 28,  1, 43, 46, 27,  0, 35,
   62, 31, 58,  4,  5, 49, 54,  6,
   15, 52, 12, 40,  7, 42, 45, 16,
   25, 57, 48, 13, 10, 39,  8, 44,
   20, 47, 38, 22, 17, 37, 36, 26
};

/**
 * bitScanForward
 * @author Matt Taylor (2003)
 * @param bb bitboard to scan
 * @precondition bb != 0
 * @return index (0..63) of least significant one bit
 */
int bitScanForward(U64 bb) {
   unsigned int folded;
   assert (bb != 0);
   bb ^= bb - 1;
   folded = (int) bb ^ (bb >> 32);
   return lsb_64_table[folded * 0x78291ACF >> 26];
}

That one has a different array, a different magic number, and shifts by 26 (again, why!?). The point is, I want to know how people come up with these arrays and numbers; that is the core of my question. I am aware that there is some high-level mathematics involved, but surely you don't need a PhD in computer science to pick up chess programming as a hobby... right?

  • 3
    Um, what is your *question*? – zwol Jun 05 '22 at 03:28
  • 3
    If all you want is the index of the lowest set bit then just use the compiler builtin for it. Modern CPUs have opcodes for doing this much better than the magic function from 25 years ago. – Goswin von Brederlow Jun 05 '22 at 04:00
  • Wikipedia has an [understandable explanation here](https://en.wikipedia.org/wiki/De_Bruijn_sequence#Finding_least-_or_most-significant_set_bit_in_a_word). De Brujin sequences are not unique: I think there are 2^32 choices of magic number and array for the middle two versions and 2^16 choices for other two. Wikipedia also tells you how to calculate them. And you surely don't "need a PhD" to write this function; it *could* have been a bitshifting `for` loop. The PhD is for the speed. (Until [some uppity silicon](https://www.felixcloutier.com/x86/bsf) steals your thunder.) – HTNW Jun 05 '22 at 04:10
  • Yeah, to put it simply, I want to know how people come up with those arrays of random numbers and that elusive magic number; that is my question – Cheesewaffle Jun 05 '22 at 04:55
  • @Cheesewaffle Do you have an understanding of combinatorial mathematics? Because, if you do, the explanation is not trivial, but is relatively straightforward. But, if you are seeking an explanation that doesn't rely on you having a working understanding of combinatorial mathematics, it would be much harder for someone to explain. – Peter Jun 05 '22 at 06:27
  • @Peter surface level, well, at least not enough to figure out this thing on my own – Cheesewaffle Jun 05 '22 at 07:33
  • @GoswinvonBrederlow and C++20 already has [`std::countr_zero`](https://en.cppreference.com/w/cpp/numeric/countr_zero) to do that efficiently – phuclv Jun 05 '22 at 08:38
  • 1
    [Explanation on how DeBruijn sequence works](https://stackoverflow.com/a/757266/995714), [What does (number & -number) mean in bit programming?](https://stackoverflow.com/q/35861228/995714), [Counting number of bits: How does this line work ? n=n&(n-1);](https://stackoverflow.com/q/15370250/995714), [Why is "i & (i ^ (i - 1))" equivalent to "i & (-i)"](https://stackoverflow.com/q/24772669/995714), [Why does n & (n - 1) always clear 1 bit from n?](https://stackoverflow.com/q/59398864/995714) – phuclv Jun 05 '22 at 08:48
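As the comments point out, every routine quoted above computes the same thing a plain loop (or a compiler builtin such as GCC's __builtin_ctzll) would. For reference, here is that naive equivalent as a minimal sketch; the name bitScanForwardNaive is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* The plain-loop equivalent of every bitScanForward variant in this thread:
   walk up from bit 0 until we hit the first set bit. bb must be nonzero. */
int bitScanForwardNaive(uint64_t bb) {
    assert(bb != 0);
    int i = 0;
    while ((bb & 1) == 0) {
        bb >>= 1;
        i++;
    }
    return i;
}
```

The magic-multiply versions exist purely to replace this data-dependent loop with a constant number of instructions.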

2 Answers


The 64-bit and 32-bit cases are very different; the 32-bit case is much harder to understand, and I do not claim to. Interestingly, Matt Taylor didn't seem to fully understand it either when he first came up with the idea (see: https://groups.google.com/g/comp.lang.asm.x86/c/3pVGzQGb1ys). From reading that discussion, I think he came up with the idea through intuition and then brute-force searched for magic numbers that would work.

The reason the 32-bit version is more complicated is that its magics are not De Bruijn sequences. As you correctly identified in the answer you wrote yourself, the 64-bit version works by using such a sequence. This is much easier to see if we deal with smaller integers, so we will look at using 8-bit values to find an index into an array with 8 members:

Isolated     De Bruijn                  Right shift
  LS1B                                    5 bits
0000 0001  *  00010111  =  00010111        000  (dec: 0)
0000 0010  *  00010111  =  00101110        001  (dec: 1)
0000 0100  *  00010111  =  01011100        010  (dec: 2)
0000 1000  *  00010111  =  10111000        101  (dec: 5)
0001 0000  *  00010111  =  01110000        011  (dec: 3)
0010 0000  *  00010111  =  11100000        111  (dec: 7)
0100 0000  *  00010111  =  11000000        110  (dec: 6)
1000 0000  *  00010111  =  10000000        100  (dec: 4)

Now we build the array to match the values the table above gives. This works because an 8-bit De Bruijn sequence is guaranteed to contain each possible log2(8) = 3-bit sequence exactly once. (This article is insightful with regard to the 64-bit version: http://supertech.csail.mit.edu/papers/debruijn.pdf)
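The 8-bit worked example above can be turned directly into code. A minimal sketch (the names index8 and bitScanForward8 are invented here; the structure mirrors the 64-bit routine from the question):

```c
#include <assert.h>
#include <stdint.h>

/* Table read off the multiplication above: LS1B positions 0..7 produce
   the top-3-bit keys 0,1,2,5,3,7,6,4, so we store table[key] = position. */
static const int index8[8] = {0, 1, 2, 4, 7, 3, 6, 5};

/* 8-bit De Bruijn bit scan; bb must be nonzero. */
int bitScanForward8(uint8_t bb) {
    const uint8_t debruijn8 = 0x17; /* the B(2,3) sequence 00010111 */
    assert(bb != 0);
    return index8[(uint8_t)((bb & -bb) * debruijn8) >> 5];
}
```

The cast to uint8_t truncates the product to 8 bits, which is what the table at the top of this answer shows happening implicitly.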

The 32-bit case is a bit different because we are no longer multiplying the magic by a power of two, so much more is going on in the multiplication than just a left shift.

To repeat the 8-bit version from above, this would be the same as performing the "fold" to obtain 4 bits:

 bb            bb^(bb-1)      fold   magic               Right shift
                                                            1 bit
XXXX XXX1  ->  0000 0001  ->  0001 * 0101  =  0101      010  (dec: 2)
XXXX XX10  ->  0000 0011  ->  0011 * 0101  =  1111      111  (dec: 7)
XXXX X100  ->  0000 0111  ->  0111 * 0101  =  0011      001  (dec: 1)
XXXX 1000  ->  0000 1111  ->  1111 * 0101  =  1011      101  (dec: 5)
XXX1 0000  ->  0001 1111  ->  1110 * 0101  =  0110      011  (dec: 3)
XX10 0000  ->  0011 1111  ->  1100 * 0101  =  1100      110  (dec: 6)
X100 0000  ->  0111 1111  ->  1000 * 0101  =  1000      100  (dec: 4)
1000 0000  ->  1111 1111  ->  0000 * 0101  =  0000      000  (dec: 0)

There are a couple of important things to notice here: 1) the De Bruijn and Taylor methods give a different ordering of the 3-bit values we use to index the table, and 2) a different bit shift is required to obtain the 3-bit result.

The different ordering of the 3-bit values means that we need a different lookup table for each method. If the LS1B is in:

  • bit 0, De Bruijn gives 0 and Taylor gives 2
  • bit 1, De Bruijn gives 1 and Taylor gives 7
  • ...
  • bit 6, De Bruijn gives 6 and Taylor gives 4
  • bit 7, De Bruijn gives 4 and Taylor gives 0

The resulting lookup tables for each case are:

De Bruijn  {0, 1, 2, 4, 7, 3, 6, 5}
Taylor     {7, 2, 0, 4, 6, 3, 5, 1}
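The 8-bit Taylor variant can be sketched the same way (names invented here for illustration). One subtlety: the product must be masked back down to 4 bits explicitly, whereas the 32-bit original gets that masking for free because unsigned arithmetic wraps mod 2^32:

```c
#include <assert.h>
#include <stdint.h>

/* Table read off the fold example above: LS1B positions 0..7 produce
   the keys 2, 7, 1, 5, 3, 6, 4, 0, so we store table[key] = position. */
static const int lsb_8_table[8] = {7, 2, 0, 4, 6, 3, 5, 1};

/* 8-bit analogue of Matt Taylor's folding bit scan. bb must be nonzero. */
int bitScanForward8_taylor(uint8_t bb) {
    assert(bb != 0);
    unsigned x = bb ^ (bb - 1u);             /* all bits up to and including the LS1B */
    unsigned folded = (x ^ (x >> 4)) & 0xF;  /* fold 8 bits down to 4 */
    /* keep only 4 bits of the product before shifting out the key */
    return lsb_8_table[((folded * 0x5) & 0xF) >> 1];
}
```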

To calculate the bit shift for the De Bruijn method with a De Bruijn sequence of length n, we use n - log2(n). For the Taylor method, we are using an n-bit number to calculate an index for an array of size 2n, so the bit shift is n - log2(2n), i.e. number of bits in key - log2(size of array). In the above cases, we have an 8-bit De Bruijn sequence, so our shift is 8 - log2(8) = 5; with the Taylor method we have a 4-bit key and an array of size 8, so the shift is 4 - log2(8) = 1.

Finding "Taylor Numbers"

The only way that I know of to find these magic "Taylor numbers" is through brute force. I have found:

  • 1 x 4-bit Taylor number for indexing an array of 8 elements (0b0101, dec: 5, used above)
  • 1 x 8-bit Taylor number for indexing an array of 16 (99)
  • 2 x 16-bit Taylor numbers for indexing an array of 32 (28839, 28883)
  • 4 x 32-bit Taylor numbers for indexing an array of 64 (2015959759, 2015960475, 2016185679, 2017106723)

The method I used to brute-force an N-bit Taylor number is to consider each N-bit unsigned integer in turn (these are the candidate Taylor numbers). For each candidate, multiply it by each of the 2N possible folded bitboards. Right shift each of these 2N products by N - log2(2N), and check that the resulting sequence contains each of the numbers (0, ..., 2N-1) exactly once.

I don't know the general rule for finding these Taylor numbers, but all of the ones I have found are in the first half of the possible values of the integer (i.e., the most significant bit is zero), and when multiple numbers are found they are close together.

Also, another interesting feature is that as the number of bits increases, the values seem to get asymptotically closer to half the maximum value of the integer size:

  • 5 / 2^4 = 0.3125
  • 99 / 2^8 = 0.3867
  • 28839 / 2^16 = 0.4400
  • 2015959759 / 2^32 = 0.4694

Additionally, some people in the other thread you linked suggested that 0x783A9B23 being prime had something to do with the way this works. That cannot be the case, since 99, 28839, 28883, 2015959759, 2015960475, and 2016185679 are not prime and work perfectly well.

Below I have included the 32-bit Taylor numbers I brute-forced, along with their corresponding lookup tables, for reference. Changing the Taylor number changes the value produced when a given folded bitboard is multiplied, so the order of the numbers in the table needs to change.
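Each table below can be regenerated mechanically from its Taylor number. A sketch of that regeneration (build_bit_table is a name invented here; uint32_t/uint64_t stand in for the engine's types):

```c
#include <assert.h>
#include <stdint.h>

/* For each LS1B position b, compute bb ^ (bb - 1), fold it to 32 bits,
   multiply by the Taylor number, and record b at the resulting 6-bit key. */
void build_bit_table(uint32_t magic, int table[64]) {
    for (int b = 0; b < 64; b++) {
        uint64_t x = (2ULL << b) - 1;  /* bb ^ (bb-1); wraps to all ones at b = 63 */
        uint32_t fold = (uint32_t)(x ^ (x >> 32));
        table[(uint32_t)(fold * magic) >> 26] = b;
    }
}
```

Feeding it 0x783A9B23 reproduces the BitTable from the question (e.g. the all-ones fold for bit 63 gives key 0, which is why every table below starts with 63).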

2015959759 = 0x78291ACF

const int BitTable[64] = {
    63, 30, 3, 32, 59, 14, 11, 33, 60, 24, 50, 9, 55, 19, 21, 34, 
    61, 29, 2, 53, 51, 23, 41, 18, 56, 28, 1, 43, 46, 27, 0, 35, 
    62, 31, 58, 4, 5, 49, 54, 6, 15, 52, 12, 40, 7, 42, 45, 16,  
    25, 57, 48, 13, 10, 39, 8, 44, 20, 47, 38, 22, 17, 37, 36, 26
};

2015960475 = 0x78291D9B

const int BitTable[64] = {
    63, 30, 3, 32, 50, 14, 11, 33, 23, 51, 58, 9, 54, 27, 20, 34, 
    61, 29, 2, 52, 59, 22, 41, 26, 19, 55, 1, 43, 46, 18, 0, 35, 
    62, 31, 49, 4, 5, 57, 53, 6, 60, 15, 12, 40, 7, 42, 45, 24, 
    16, 48, 56, 13, 10, 39, 8, 44, 47, 28, 38, 21, 25, 37, 36, 17
};

2016185679 = 0x782C8D4F

const int BitTable[64] = {
    63, 30, 3, 32, 59, 15, 12, 33, 60, 24, 51, 22, 20, 43, 9, 34, 
    61, 29, 2, 41, 11, 52, 54, 19, 56, 28, 1, 44, 47, 27, 0, 35, 
    62, 31, 58, 4, 5, 50, 42, 6, 16, 40, 13, 53, 55, 7, 46, 17, 
    25, 57, 49, 14, 39, 23, 21, 45, 8, 48, 38, 10, 18, 37, 36, 26
};

2017106723 = 0x783A9B23

const int BitTable[64] = {
    63, 30, 3, 32, 25, 41, 22, 33, 15, 50, 42, 13, 11, 53, 19, 34, 
    61, 29, 2, 51, 21, 43, 45, 10, 18, 47, 1, 54, 9, 57, 0, 35, 
    62, 31, 40, 4, 49, 5, 52, 26, 60, 6, 23, 44, 46, 27, 56, 16, 
    7, 39, 48, 24, 59, 14, 12, 55, 38, 28, 58, 20, 37, 17, 36, 8
};
– 44stonelions

OK, I think I got the second one. Here I'm answering my own question, with the help of the comments. Bear with me; I'm going to really dumb my wording down so that even someone with basic knowledge can follow. But before that, some basic bitwise operations and the De Bruijn sequence:

  1. The De Bruijn sequence

Basically, De Bruijn sequences (DBS for short) are cyclic "strings" of base-k digits in which every substring of length n appears exactly once (notated B(k, n)). We only need to concern ourselves with binary (B(2, n)). For example, say you want B(2, 2): there are 4 (2^2, notice the pattern here) possible binary strings of length 2, namely 00, 01, 10, 11. The DBS for that is 0011 (notice again that the length of the DBS is 4), with 00, 01, 10, and 11 each appearing exactly once. Remember, a DBS is cyclic:

{00}11
0{01}1
00{11}
0}01{1

Now, a chessboard contains 64 squares, indexed 0 to 63. To represent 63 in binary we need at least 6 bits (111111 in binary is 63), so we need a B(2, 6) DBS.

  2. Bit manipulation

I'm assuming we all know the basic ^, ~, &, etc. operations. The thing I'm trying to picture here is the expressions used, bb ^ (bb-1) and bb & -bb, and the variations used in the examples I gave, using a more digestible size for bb: take bb as an unsigned 8-bit integer instead of 64-bit:

bb       = 0110 0100 = 100
-bb      = 1001 1100 = 156
bb & -bb = 0000 0100 = 4 //The value is irrelevant, we just need to isolate the least significant set bit

So, what is going on here? Well, since bb is an unsigned 8-bit number, it can store values from 0 to 255 (2^8 different values). What happens when you put a minus sign on an unsigned integer? Think of it this way: if moving forward from 0 leads you to 1, 2, 3, etc., then moving backward from 0 loops you around to the highest value: 255, 254, 253, etc. Thus, if bb has a value of 100 (100 steps forward from 0), then -bb has a value of 156 (100 steps backward from 0).

A similar thing happens when you invert a binary number: bb = 0110 0100, which has a value of 100 (100 steps forward from 0), of course has an inverse of ~bb = 1001 1011, which has a value of 155 (100 steps backward from 255; again, notice the pattern here).

With a bit of logic, one can conclude that -bb = ~bb + 1. The point is we can now isolate the least significant set bit of the number: bb & -bb isolates the least significant one bit (LS1B).

Now onto the next expression, bb ^ (bb-1):

bb            = 0110 0100
bb - 1        = 0110 0011
bb ^ (bb - 1) = 0000 0111

So the expression bb ^ (bb-1) sets all bits up to and including the LS1B.

const int index64[64] = {
    0,  1, 48,  2, 57, 49, 28,  3,
   61, 58, 50, 42, 38, 29, 17,  4,
   62, 55, 59, 36, 53, 51, 43, 22,
   45, 39, 33, 30, 24, 18, 12,  5,
   63, 47, 56, 27, 60, 41, 37, 16,
   54, 35, 52, 21, 44, 32, 23, 11,
   46, 26, 40, 15, 34, 20, 31, 10,
   25, 14, 19,  9, 13,  8,  7,  6
};

/**
 * bitScanForward
 * @author Martin Läuter (1997)
 *         Charles E. Leiserson
 *         Harald Prokop
 *         Keith H. Randall
 * "Using de Bruijn Sequences to Index a 1 in a Computer Word"
 * @param bb bitboard to scan
 * @precondition bb != 0
 * @return index (0..63) of least significant one bit
 */
int bitScanForward(U64 bb) {
   const U64 debruijn64 = C64(0x03f79d71b4cb0a89);
   assert (bb != 0);
   return index64[((bb & -bb) * debruijn64) >> 58];
}

This one is the most straightforward: the array is based on that 0x03f79d71b4cb0a89 number, which is actually a B(2, 6) DBS, 0000 0011 1111 0111 1001 1101 0111 0001 1011 0100 1100 1011 0000 1010 1000 1001 to be precise. The array is generated by locating the numbers 0 to 63 in the DBS: for example, 0 = 000000 is the 0th substring of the DBS, 1 = 000001 is the 1st, 2 = 000010 is the 48th (here 0000 0011 1111 0111 1001 1101 0111 0001 1011 0100 1100 1011 [0000 10]10 1000 1001), and so on.
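That generation step can be sketched as a loop: since multiplying the DBS by 1 << b is just debruijn64 << b, the top 6 bits of the product are the 6-bit substring starting at offset b, and we record b at that substring's value (build_index64 is a name invented here):

```c
#include <stdint.h>

/* Rebuild index64 from the De Bruijn constant: record each bit position b
   at the value of the 6-bit substring that starts at offset b. */
void build_index64(int index64[64]) {
    const uint64_t debruijn64 = 0x03f79d71b4cb0a89ULL;
    for (int b = 0; b < 64; b++)
        index64[(debruijn64 << b) >> 58] = b;
}
```

Spot-checking the result reproduces the quoted table: index64[0] = 0, index64[1] = 1, index64[2] = 48.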

Say we have the number bb = 0x00FF00000000FF00, or 0000 0000 1111 1111 0000 0000 0000 0000 0000 0000 0000 0000 1111 1111 0000 0000. We know the LS1B of bb is at index 8.

We use the bb & -bb operation:

bb       = 0x00FF00000000FF00
debruijn = 0x03f79d71b4cb0a89
         = 0000 0011 1111 0111 1001 1101 0111 0001 1011 0100 1100 1011 0000 1010 1000 1001
bb & -bb = 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 0000 0000

We multiply bb & -bb by the DBS, and since bb & -bb is just 2^(index), this is the same as shifting the DBS left by (index) bits (DBS << index):

(bb & -bb) * debruijn = 1111 0111 1001 1101 0111 0001 1011 0100 1100 1011 0000 1010 1000 1001 0000 0000

Shifting the DBS left by 8 is the same as taking the 8th substring of the DBS; to isolate that substring we shift the number right by 64 - 6 = 58:

eighth substring in the DBS = 0000 0011 [1111 01]11 1001 1101 0111 0001 1011 0100 1100 1011 0000 1010 1000 1001
isolated eighth substring   = 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0011 1101 = 61

If we use 61 to index our array, it spits out 8, which is the number we want. Still no insight on the other three, though.