Well, you definitely need RNG with 226bits period for all permutation to be covered, @PeterO answer is correct in this regard. But you could extend period using Bays-Durham shuffle, paying by effectively extending state of RNG. There is an estimate of the period of the B-D shuffled RNG and it is
P = sqrt(Pi * N! / (2*O))
where Pi=3.1415..., N
is B-D table size, O
is period of the original generator. If you take log2 of the whole expression, and use Stirling formula for factorial, and assume P=2226 and O=2128, you could get estimate for N, size of the table in B-D algorithm. From back-of-the envelope calculation N=64 would be enough to get all your permutations.
UPDATE
Ok, here is an example implementation of RNG extended with B-D shuffle. First, I implemented in Javascript Xorshift128+, using BigInt, which is apparently default RNG in V8 engine as well. Compared with C++ one, they produced identical output for first couple of dozen calls. 128bits seed as two 64bits words. Windows 10 x64, NodeJS 12.7.
const WIDTH = 2n ** 64n;
const MASK = WIDTH - 1n; // to keep things as 64bit values
class XorShift128Plus { // as described in https://v8.dev/blog/math-random
_state0 = 0n;
_state1 = 0n;
constructor(seed0, seed1) { // 128bit seed as 2 64bit values
this._state0 = BigInt(seed0) & MASK;
this._state1 = BigInt(seed1) & MASK;
if (this._state0 <= 0n)
throw new Error('seed 0 non-positive');
if (this._state1 <= 0n)
throw new Error('seed 1 non-positive');
}
next() {
let s1 = this._state0;
let s0 = this._state1;
this._state0 = s0;
s1 = ((s1 << 23n) ^ s1 ) & MASK;
s1 ^= (s1 >> 17n);
s1 ^= s0;
s1 ^= (s0 >> 26n);
this._state1 = s1;
return (this._state0 + this._state1) & MASK; // modulo WIDTH
}
}
Ok, then on top of XorShift128+ I've implemented B-D shuffle, with table of size 4. For your purpose you'll need table more than 84 entries, and power of two table is much easier to deal with, so let's say 128 entries table (7bit index) shall be good enough. Anyway, even with 4 entries table and 2bit index we need to know which bits to pick to form index. In original paper B-D discussed picking them from the back of rv as well as from front of rv etc. Here is where B-D shuffle needs another seed value - telling algorithm to pick say, bits from position 2 and 6.
class B_D_XSP {
_xsprng;
_seedBD = 0n;
_pos0 = 0n;
_pos1 = 0n;
_t; // B-D table, 4 entries
_Z = 0n;
constructor(seed0, seed1, seed2) { // note third seed for the B-D shuffle
this._xsprng = new XorShift128Plus(seed0, seed1);
this._seedBD = BigInt(seed2) & MASK;
if (this._seedBD <= 0n)
throw new Error('B-D seed non-positive');
this._pos0 = findPosition(this._seedBD); // first non-zero bit position
this._pos1 = findPosition(this._seedBD & (~(1n << this._pos0))); // second non-zero bit position
// filling up table and B-D shuffler
this._t = new Array(this._xsprng.next(), this._xsprng.next(), this._xsprng.next(), this._xsprng.next());
this._Z = this._xsprng.next();
}
index(rv) { // bit at first position plus 2*bit at second position
let idx = ((rv >> this._pos0) & 1n) + (((rv >> this._pos1) & 1n) << 1n);
return idx;
}
next() {
let retval = this._Z;
let j = this.index(this._Z);
this._Z = this._t[j];
this._t[j] = this._xsprng.next();
return retval;
}
}
Use example is as follow.
let rng = new B_D_XSP(1, 2, 4+64); // bits at second and sixth position to make index
console.log(rng._pos0.toString(10));
console.log(rng._pos1.toString(10));
console.log(rng.next());
console.log(rng.next());
console.log(rng.next());
Obviously, third seed value of say 8+128 would produce different permutation from what is shown in the example, you could play with it.
Last step would be to make 226bit random value by calling several (3 of 4) times B-D shuffled rng and combine 64bit values (and potential carry over) to make 226 random bits and then convert them to the deck shuffle.