
A poker deck has 52 cards and thus 52! or roughly 2^226 possible permutations.

Now I want to shuffle such a deck of cards perfectly, with truly random results and a uniform distribution, so that you can reach every single one of those possible permutations and each is equally likely to appear.

Why is this actually necessary?

For games, you arguably don't need perfect randomness, unless there's money to be won. Apart from that, humans probably won't even perceive the "differences" in randomness.

But if I'm not mistaken, if you use shuffling functions and RNG components commonly built into popular programming languages, you will often get no more than 32 bits of entropy and 2^32 states. Thus, you will never be able to reach all 52! possible permutations of the deck when shuffling, but only about ...

0.000000000000000000000000000000000000000000000000000000005324900157 %

... of the possible permutations. That means the vast majority of all the games that could in theory be played or simulated will never actually be seen in practice.
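These figures are easy to check with BigInt arithmetic, since 52! far exceeds Number.MAX_SAFE_INTEGER; a quick sketch:

```javascript
// Verify the figures above: 52! needs BigInt, as it far exceeds
// Number.MAX_SAFE_INTEGER.
function factorial(n) {
    var result = 1n;
    for (var i = 2n; i <= BigInt(n); i++) result *= i;
    return result;
}

var perms = factorial(52);                 // 52! ≈ 8.07e67
var bits = perms.toString(2).length;       // binary digits of 52!
console.log(bits);                         // 226, hence "roughly 2^226"

// Fraction of permutations reachable with a 32-bit seed, as a percentage.
// Scale up before dividing, because BigInt division truncates:
var scaled = (2n ** 32n * 10n ** 60n) / perms;
console.log(scaled + "e-58 %");            // 53e-58 %, i.e. ~5.3e-57 %
```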

By the way, you can further improve the results if you don't reset to the default order every time before shuffling but instead start with the order from the last shuffle or keep the "mess" after a game has been played and shuffle from there.

Requirements:

So in order to do what is described above, one needs all of the following three components, as far as I understand:

  1. A good shuffling algorithm that ensures a uniform distribution.
  2. A proper RNG with at least 226 bits of internal state. Since we're on deterministic machines, a PRNG will be all we'll get, and perhaps this should be a CSPRNG.
  3. A random seed with at least 226 bits of entropy.

Solutions:

Now is this achievable? What do we have?

  1. Fisher-Yates shuffle will be fine, as far as I can see.
  2. The xorshift7 RNG has more than the required 226 bits of internal state and should suffice.
  3. Using window.crypto.getRandomValues we can generate the required 226 bits of entropy to be used as our seed. If that still isn't enough, we can add some more entropy from other sources.

Question:

Are the solutions (and also the requirements) mentioned above correct? How can you implement shuffling using these solutions in JavaScript in practice then? How do you combine the three components to a working solution?

I guess I have to replace the usage of Math.random in the example implementation of the Fisher-Yates shuffle with a call to xorshift7. But that RNG outputs a value in the [0, 1) float range, and I need an integer in the [1, n] range instead. When scaling that range, I don't want to lose the uniform distribution. Moreover, I wanted about 226 bits of randomness. If my RNG outputs just a single Number, isn't that randomness effectively reduced to 53 (or 64) bits, because there are no more than 2^53 (or 2^64) possibilities for the output?
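As a side note on the scaling concern: the bias from Math.floor(rng() * n) is real whenever the number of raw generator values isn't a multiple of n. For a 32-bit generator and n = 52, this can be shown without any randomness (a sketch):

```javascript
// Why scaling is slightly biased: 2^32 raw values cannot be split
// evenly into 52 buckets, so some outcomes get one extra raw value.
const raw = 2 ** 32;          // distinct outputs of a 32-bit generator
const buckets = 52;

console.log(raw % buckets);   // 48 -> 48 of the 52 buckets get one extra value

// Bucket k receives the raw values v with floor(v * buckets / raw) == k,
// i.e. v in [ceil(k * raw / buckets), ceil((k + 1) * raw / buckets)).
function bucketSize(k) {
    return Math.ceil((k + 1) * raw / buckets) - Math.ceil(k * raw / buckets);
}

const sizes = new Set();
for (let k = 0; k < buckets; k++) sizes.add(bucketSize(k));
console.log([...sizes]);      // two distinct sizes -> not perfectly uniform
```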

In order to generate the seed for the RNG, I wanted to do something like this:

var randomBytes = generateRandomBytes(226);

function generateRandomBytes(n) {
    // n is a number of bits; allocate enough whole bytes to cover it
    var data = new Uint8Array(
        Math.ceil(n / 8)
    );
    window.crypto.getRandomValues(data);

    return data;
}

Is this correct? I don't see how I could pass randomBytes to the RNG as a seed in any way, and I don't know how I could modify it to accept this.

msanford
caw
  • You don't need all that complexity; just use getRandomValues, as it's going to be better than anything suggested. You can pull a couple hundred or so Uint8s, of which you'll use 52 (one slot per full deck card). Mod each int by 64, and discard it if it's been used or if it's > 52; otherwise add it to the list of good cards. You don't need further shuffling, stretching, conditioning, or anything like that to visit each permutation. – dandavis Nov 09 '17 at 04:58
  • @dandavis Thanks! It's not *that* much complexity. Well, `getRandomValues` is what I have already proposed. So that's a third of the complexity. Then you seem to be saying that I don't need another RNG (that will be seeded with gRV) but I can just use gRV as it is. So that would save one third of the complexity. You then go on saying that I don't even need a shuffling algorithm like Fisher-Yates, which is not something I did expect. So will this really give a uniform distribution? I don't have the competence to judge. By the way, why 100 bytes instead of 29 that I thought would be enough? – caw Nov 09 '17 at 17:15
  • Yes, the distribution will be perfect, even for slots 0 and 51, if you follow the above routine. I do a lot of random stuff (my GitHub is rndme), and I know it's easy to have rounding errors, so I like to keep it simple. The reason for over-drawing is that you discard more and more "cards" as you run out of unused combos. This is not the most CPU-efficient way, but it's plenty fast for most tasks/games. – dandavis Nov 10 '17 at 00:28
  • @dandavis Thank you! Having thought about your solution again, of course it is correct. You suggest the same way of generating random numbers as @squeamishossifrage did in their answer, i.e. using `crypto.getRandomValues` as the best available source of randomness, plus rejection sampling to get values in the correct range without adding bias. But the way you suggest to use the random numbers simply seems to be a method inferior to Fisher-Yates, which has been proven to be effective and efficient. It's just as simple. Nevertheless, your answer has been very helpful for the random numbers! – caw Nov 12 '17 at 07:22
  • Glad to help. When people get started with crypto, I strongly push them to err on the side of simplicity; consider the subtle bugs in the answer as a prime example of why. In fact, [the answer still seems consistently 2% off](https://jsfiddle.net/oty8x74m/1/), which is concerning. – dandavis Nov 12 '17 at 07:45
  • [here's a way to do it with perfect distro](https://jsfiddle.net/xzk8574k/), and without a lot of complexity, and with <1ppm error. – dandavis Nov 12 '17 at 07:55
  • @dandavis First, totally agree with you on the importance of simplicity. It's so easy to get at least *one* small error in your algorithm, e.g. an off-by-one error. Second, thank you so much for doing the test on the answer by @squeamishossifrage. That error is indeed caused by … an off-by-one error. The condition `while (j >= i)` must be changed to `while (j > i)` for Fisher-Yates, as I have pointed out in the comments to the answer. Fixing that, we have a correct solution. Third, your solution does not do what you proposed earlier (e.g. mod 64, then rejection sampling). Might introduce bias. – caw Nov 12 '17 at 19:49
  • Yes, there's a bias on mine: `51 / (2**32)`. You could reject if that's too much, but IMHO, even without rejection it's a good trade-off of simplicity and performance against perfection. – dandavis Nov 12 '17 at 23:48
  • While the technical aspects of this question intrigue me, I must state this - if you are planning on implementing card shuffling on **client** (ie. browser), even a perfect shuffle algo will not help you make the game secure. It's done on client -> it cannot be trusted. If there is money involved do the whole thing on the server (where you can have true RNG even in JavaScript, through Node's `crypto.randomBytes()`). – Robert Rossmann Nov 16 '17 at 07:28
  • @RobertRossmann Thanks, I'm absolutely aware of this, but this comment is definitely helpful in the general context of this discussion. I would only ever use this in cases where the client can be *trusted*, i.e. where that same client who may forge the results is the only party that will ever use the results (e.g. single-player games). – caw Nov 16 '17 at 18:45

3 Answers


Here's a function I wrote that uses Fisher-Yates shuffling based on random bytes sourced from window.crypto. Since Fisher-Yates requires that random numbers be generated over varying ranges, it starts out with a 6-bit mask (mask=0x3f), but gradually reduces the number of bits in this mask as the required range gets smaller (i.e., whenever i falls to one less than a power of 2).

function shuffledeck() {
    var cards = Array("A♣️","2♣️","3♣️","4♣️","5♣️","6♣️","7♣️","8♣️","9♣️","10♣️","J♣️","Q♣️","K♣️",
                      "A♦️","2♦️","3♦️","4♦️","5♦️","6♦️","7♦️","8♦️","9♦️","10♦️","J♦️","Q♦️","K♦️",
                      "A♥️","2♥️","3♥️","4♥️","5♥️","6♥️","7♥️","8♥️","9♥️","10♥️","J♥️","Q♥️","K♥️",
                      "A♠️","2♠️","3♠️","4♠️","5♠️","6♠️","7♠️","8♠️","9♠️","10♠️","J♠️","Q♠️","K♠️");
    var rndbytes = new Uint8Array(100);
    var i, j, r=100, tmp, mask=0x3f;

    /* Fisher-Yates shuffle, using uniform random values from window.crypto */
    for (i=51; i>0; i--) {
        if ((i & (i+1)) == 0) mask >>= 1;
        do {
            /* Fetch random values in 100-byte blocks. (We probably only need to do */
            /* this once.) The `mask` variable extracts the required number of bits */
            /* for efficient discarding of random numbers that are too large. */
            if (r == 100) {
                window.crypto.getRandomValues(rndbytes);
                r = 0;
            }
            j = rndbytes[r++] & mask;
        } while (j > i);

        /* Swap cards[i] and cards[j] */
        tmp = cards[i];
        cards[i] = cards[j];
        cards[j] = tmp;
    }
    return cards;
}

An assessment of window.crypto implementations really deserves its own question, but anyway...

The pseudorandom stream provided by window.crypto.getRandomValues() should be sufficiently random for any purpose, but is generated by different mechanisms in different browsers. According to a 2013 survey:

  • Firefox (v. 21+) uses NIST SP 800-90 with a 440-bit seed. Note: This standard was updated in 2015 to remove the (possibly backdoored) Dual_EC_DRBG elliptic curve PRNG algorithm.

  • Internet Explorer (v. 11+) uses one of the algorithms supported by BCryptGenRandom (seed length = ?)

  • Safari, Chrome and Opera use an ARC4 stream cipher with a 1024-bit seed.


Edit:

A cleaner solution would be to add a generic shuffle() method to JavaScript's Array prototype:

// Add Fisher-Yates shuffle method to Javascript's Array type, using
// window.crypto.getRandomValues as a source of randomness.

if (typeof Uint32Array === "function" && window.crypto && window.crypto.getRandomValues) {
    Array.prototype.shuffle = function() {
        var n = this.length;
    
        // If array has <2 items, there is nothing to do
        if (n < 2) return this;
        // Reject arrays with >= 2**31 items
        if (n > 0x7fffffff) throw "ArrayTooLong";
    
        var i, j, r=n*2, tmp, mask;
        // Fetch (2*length) random values
        var rnd_words = new Uint32Array(r);
        // Create a mask to filter these values
        for (i=n, mask=0; i; i>>=1) mask = (mask << 1) | 1;
    
        // Perform Fisher-Yates shuffle
        for (i=n-1; i>0; i--) {
            if ((i & (i+1)) == 0) mask >>= 1;
            do {
                if (r == n*2) {
                    // Refresh random values if all used up
                    window.crypto.getRandomValues(rnd_words);
                    r = 0;
                }
                j = rnd_words[r++] & mask;
            } while (j > i);
            tmp = this[i];
            this[i] = this[j];
            this[j] = tmp;
        }
        return this;
    }
} else throw "Unsupported";

// Example:
deck = [ "A♣️","2♣️","3♣️","4♣️","5♣️","6♣️","7♣️","8♣️","9♣️","10♣️","J♣️","Q♣️","K♣️",
         "A♦️","2♦️","3♦️","4♦️","5♦️","6♦️","7♦️","8♦️","9♦️","10♦️","J♦️","Q♦️","K♦️",
         "A♥️","2♥️","3♥️","4♥️","5♥️","6♥️","7♥️","8♥️","9♥️","10♥️","J♥️","Q♥️","K♥️",
         "A♠️","2♠️","3♠️","4♠️","5♠️","6♠️","7♠️","8♠️","9♠️","10♠️","J♠️","Q♠️","K♠️"];

deck.shuffle();
r3mainer
  • Thanks a lot! Essentially, the contents of your inner `do { ... } while (j >= i);` loop are the replacement for `j = Math.floor(Math.random() * (i + 1));` from the "default" Fisher-Yates implementation that uses `Math.random`, right? The extra stuff uses `window.crypto.getRandomValues` to generate a random integer in a defined range, which is what we wanted. You do so by generating a random number from a random byte until you find one that is in the required range, optimizing this by discarding bits that are never needed. Is that all correct? – caw Nov 09 '17 at 18:53
  • Yes, that's right. Scaling `Math.random()` to the required range is another way of doing it, but the resulting distribution ends up being slightly non-uniform. For the best results, you need to generate a uniform random number with as many binary digits as the upper limit of your range, then discard the value if it's too large. If you have to discard too many numbers, you might need to refresh `rndbytes`, but this is quite unlikely. – r3mainer Nov 09 '17 at 19:03
  • Thanks! So you just generate unbiased random bytes and reject those that don't fit so as not to introduce bias. The buffer size of 100 is because you have to discard every second byte on average. The `j` being positive is enforced by the usage of `Uint8Array` instead of `Int8Array`. Is that all correct? But don't you have to change `while (j >= i)` to `while (j > i)` for Fisher-Yates because the range is `[0, i]` and not `[0, i)`? Doesn't all this boil down to using standard Fisher-Yates with an RNG built on `window.crypto.getRandomValues` that generates integers in a specified range? – caw Nov 09 '17 at 22:23
  • By the way, `Math.floor(Math.random() * (max - min)) + min` seems to have an equally uniform distribution, but the point is that we need more entropy and more internal state for the RNG than `Math.random` can offer, right? Although we only need numbers in a pretty small range, the internal state of the RNG is important because of the cycle length. Chrome's implementation of `Math.random` seems to have [128 bits](https://v8project.blogspot.de/2015/12/theres-mathrandom-and-then-theres.html) of state. With `getRandomValues`, we could get the 226 bits that we really need. – caw Nov 09 '17 at 22:49
  • Yikes, you found a flaw in my code. I will fix that when I find the time. The average discard rate is a bit lower, about one in 3, I think; every second byte is the worst case. I'm using `Uint8Array` because we're only interested in the lower bits. Apart from the lower entropy, `Math.random() * range` isn't totally uniform because it stretches 2^32 values (or possibly 2^31, I'm not sure) across a range in which the number of intervals isn't necessarily a whole power of 2. – r3mainer Nov 09 '17 at 23:06
  • No problem. Thanks! Though you didn't change the `while (j >= i)` to `while (j > i)` now but instead changed how (and where) the `mask >>= 1` bit shift is done. Is this correct? – caw Nov 12 '17 at 00:05
  • the answer still seems consistently 2% off: https://jsfiddle.net/oty8x74m/1/ should be close to 1 on the bottom number, instead it's close to 1.02, which means the left half weighs more than the right half. if they were truly random, the halves would weigh (about) the same. 2% is actually a lot, and in blackjack would enable a player to rob the house. – dandavis Nov 12 '17 at 08:10
  • @dandavis See my comment right above yours for the reason. – caw Nov 12 '17 at 20:35
  • Thanks for the feedback. Clearly this is not as straightforward as I thought. I'm pretty sure it's working properly now, but perhaps it would be a good idea *not* to take my word for it! – r3mainer Nov 12 '17 at 21:50

Combining this answer from here with this answer from another question, it seems the following could be a more general and modular (though less optimized) version:

// Fisher-Yates
function shuffle(array) {
    var i, j;

    for (i = array.length - 1; i > 0; i--) {
        j = randomInt(0, i + 1);
        swap(array, i, j);
    }
}

// replacement for:
//     Math.floor(Math.random() * (max - min)) + min
function randomInt(min, max) {
    var range = max - min;
    var bytesNeeded = Math.ceil(Math.log2(range) / 8);
    var randomBytes = new Uint8Array(bytesNeeded);
    var maximumRange = Math.pow(Math.pow(2, 8), bytesNeeded);
    var extendedRange = Math.floor(maximumRange / range) * range;
    var i, randomInteger;

    while (true) {
        window.crypto.getRandomValues(randomBytes);
        randomInteger = 0;

        for (i = 0; i < bytesNeeded; i++) {
            randomInteger <<= 8;
            randomInteger += randomBytes[i];
        }

        if (randomInteger < extendedRange) {
            randomInteger %= range;

            return min + randomInteger;
        }
    }
}

function swap(array, first, second) {
    var temp;

    temp = array[first];
    array[first] = array[second];
    array[second] = temp;
}
caw

I personally think you could move outside the box a little bit. If you're that worried about randomness, you could request an API key from random.org ( https://api.random.org/json-rpc/1/ ) or parse the numbers out of a link like this: https://www.random.org/integer-sets/?sets=1&num=52&min=1&max=52&seqnos=on&commas=on&order=index&format=html&rnd=new .

Sure, your data sets could be intercepted, but if you fetched a few hundred thousand sets and then shuffled those sets, you would be fine.

Dylan Brams