I am using this code to generate a random permutation of a vector using a variation of Fisher-Yates randomization algorithm (I am going from the first element to the last, not the other way around). I am using a boost::random::mt11213b
RNG globally in a program that is seeded with generator.seed(time(NULL));
when program starts, hence a wrapper singleton RandomNumber
here.
boost::random::uniform_int_distribution<unsigned long>
distribution(0, vec.size()-1);
for (unsigned long i=0;i<vec.size();i++)
std::swap(vec[i], vec[distribution(RandomNumber::getInstance().generator)]);
Some experiments have led me to believe that there may be an issue in this algorithm, in a nutshell. This is what I did
- Created a vector of integer with length 100
- Filled first 75 elements with
0
and the last 25 with1
- Shuffled an array.
- Took the first 5 elements from the list and summed them.
I repeated this procedure a few thousand times (with a loop, not by hand :)) each time starting with a fresh vector. Then I computed the arithmetic mean of the sums and it came about 0.98
instead of the expected 1.25
.
The funny thing is that if I start with a vector that has been shuffled once with the same algorithm instead of an ordered one, the result increases to 1.22
and if I don't discard the vector on each iteration but rather just shuffle it again, the result is around 1.25
that is the expected value.
I am unsure as to what could be wrong. The algorithm looks sound, the only thing that I can think of that could have gone wrong is the seeding phase and the
boost::random::uniform_int_distribution<unsigned long>
distribution(0, vec.size()-1);
line that is called each time before a vector is shuffled (perhaps it should only be called once a program but that doesn't make mush sense)
Any help will be greatly appreciated!