Fast way to avoid modulo bias

Question

I'm doing a shuffle and it gets done very often on a small array. Could be anything from 1 - 10 elements.

I've tried the accepted answer in this question:

Is this C implementation of Fisher-Yates shuffle correct?

Unfortunately it's extremely slow.

I need a faster way of doing this and avoiding modulo bias which I'm seeing. Any suggestions?

EDIT: Sorry I should point out that it's not the shuffle that's slow, it's the method used to generate a random int range. i.e. rand_int(). I'm using a Mersenne twister algorithm and RAND_MAX in my case is UINT_MAX to help out. This of course makes it slower when n is much smaller than RAND_MAX

I've also found 2 implementations of a rand_int type function.

static int rand_int(int n) {
  int limit = RAND_MAX - RAND_MAX % n;
  int rnd;

  do {
    rnd = rand();
  } while (rnd >= limit);
  return rnd % n;
}

The following is much much faster. But, does it avoid the modulo bias problem?

int rand_int(int limit) {

    int divisor = RAND_MAX/(limit);
    int retval;

    do { 
        retval = rand() / divisor;
    } while (retval > limit);

    return retval;
}

How random does the shuffle have to be? You could pre-generate a bunch of shuffles for your average array sizes, store them in a look-up table, and then simply iterate through those randomized indices. Expand the number of pre-generated LUT entries to whatever memory envelope and make sure the number of entries is a power of 2 for cheap modulo operations on the LUT indices. The LUT array would be lut[small array size][0..2**n] -> "random" array indices — Ross Rogers, Nov 29 '11 at 23:46
could you describe how the modulo bias appears, what exactly you're getting? — Cheers and hth. - Alf, Nov 29 '11 at 23:48
Alf. it was simply that some values were appearing too often in the output. Usually in the bottom end. I'm not getting this after implementing some changes from sehe's answer. — hookenz, Nov 30 '11 at 02:23

sehe · Accepted Answer · 2011-11-30T00:06:19.237

7

Edit

To address the basic question on avoiding the modulo bias with rand() see http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx.

In short, you can't get truly uniform other than skipping non-domain random numbers¹; The article lists some formulae to get a smaller bias (int r = rand() / ( RAND_MAX / N + 1 ) eg) without sacrificing more performance.

¹ See Java's implementation of Random.nextInt(int): http://download.oracle.com/javase/1.4.2/docs/api/java/util/Random.html#nextInt(int)

Using C++

You should be able to use std::random_shuffle (from <algorithm> header);

If you must roll your own shuffle implementation, I suggest using std::random (TR1, C++0x or Boost). It comes with a number of generators and distributions, with varying performance characteristics.

#include <random>

std::mt19937 rng(seed);
std::uniform_int_distribution<int> gen(0, N); // uniform, unbiased

int r = gen(rng);

Refer to the boost documentation for a good overview of Boost Random generator and distribution characteristics:

http://www.boost.org/doc/libs/1_47_0/doc/html/boost_random/reference.html#boost_random.reference.generators

Here is a sample of doing std::random_shuffle using Boost Random, directly:

#include <algorithm>
#include <functional>
#include <vector>
#include <boost/random.hpp>

struct Rng
{
    Rng(boost::mt19937 &rng) : _rng(rng) {}

    unsigned operator()(unsigned i) 
    {
        boost::uniform_int<> dist(0, i - 1);
        return dist(_rng);
    }

  private:        
    boost::mt19937 &_rng;
};

boost::mt19937 state;
std::random_shuffle(v.begin(), v.end(), Rng(state));

edited Nov 30 '11 at 00:06

answered Nov 29 '11 at 23:45

sehe

374,641
47
450
633

Sorry I can't use boost or C++0X. – hookenz Nov 29 '11 at 23:58
@MattH: so that leaves TR1 as an option. std::mt19937 is in TR1 too – sehe Nov 30 '11 at 00:02
@MattH: Edited answer with more info regarding modulo bias with `rand` – sehe Nov 30 '11 at 00:06
@MattH: also, on the topic of `std::random_shuffle`, what is your verdict? It has been in STL ever since SGI... – sehe Nov 30 '11 at 00:11
thanks. Didn't know about std::random_shuffle. I had implemented my own, partly because the boss doesn't like me using the std library too much but rather an internal library. I'm changing my mind about that. – hookenz Nov 30 '11 at 00:47
2

Thanks, that first link is priceless!. – hookenz Nov 30 '11 at 00:56
2

As of 2019-04-30 16:30 -07:00, the first link has a 500 error (internal server error). However, you can find the article in the Wayback Machine (Internet Archive) up to 2018-08-01: https://web.archive.org/web/20180801210127/http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx – Jonathan Leffler Apr 30 '19 at 23:41

Fast way to avoid modulo bias

1 Answers1

Using C++