13

What is the correct way to generate exact value from 0 to 999999 randomly since 1000000 is not a power of 2?

This is my approach:

  1. use crypto.randomBytes to generate 3 bytes and convert to hex
  2. use the first 5 characters to convert to integer (max is fffff == 1048575 > 999999)
  3. if the result > 999999, start from step 1 again

It will somehow create a recursive function. Is it logically correct and will it cause a concern of performance?

wdetac
  • 2,712
  • 3
  • 20
  • 42
  • 2
    Math.floor(Math.random(1000000)*1000000) – skitty jelly Jul 13 '18 at 12:48
  • 2
    I know, but `math.random` is not cryptographically secure. And my question is intended to use `crypto.randomBytes` – wdetac Jul 13 '18 at 12:50
  • You could calculate the fraction `YourRandInt/MaxPossible` and multiply that with your intended range (999.999), then round to the next integer. Not sure though whether that would yield certain numbers more often. – Tobias K. Jul 13 '18 at 13:20
  • @TobiasK Thanks, however I think this method do the same duplicate as `YourRandInt % 1000000` – wdetac Jul 13 '18 at 16:47
  • Using modulo has x2 the chance for the values from 0 to 48575 , I think exactly this would not happen using this method, as every `YourRandInt` gets a unique fraction between 0 and 1. The error I think that may happen is that you end up with e.g. 1.1 1.5 1.9 2.3 2.7 3.1 3.5 and due to rounding get `2` 3x and `3` only 2x. I'm neither a mathematician nor a crypto specialist though, it was just an idea I had, as generally I like the `Math.random()*range` idea, and you'd just have to securely generate the fraction. – Tobias K. Jul 13 '18 at 17:59
  • @skittyjelly besides the fact that it's `Math.random()` isn't *really* random, there is a high possibility that it generates numbers *less* than 6 digits, and padding it with 0 just isn't a good option. The best bet with `Math.random` would be along the lines of `const numbers = (() => { const r = () => (Math.random() * 10) | 0; return () => "" + r() + r() + r() + r() + r() + r(); })();`. – code Mar 16 '22 at 23:37

5 Answers5

10

There are several way to extract random numbers in a range from random bits. Some common ones are described in NIST Special Publication 800-90A revision 1: Recommendation for Random Number Generation Using Deterministic Random Bit Generators

Although this standard is about deterministic random bit generations there is a helpful appendix called A.5 Converting Random Bits into a Random Number which describes three useful methods.

The methods described are:

  • A.5.1 The Simple Discard Method
  • A.5.2 The Complex Discard Method
  • A.5.3 The Simple Modular Method

The first two of them are not deterministic with regards to running time but generate a number with no bias at all. They are based on rejection sampling.

The complex discard method discusses a more optimal scheme for generating large quantities of random numbers in a range. I think it is too complex for almost any normal use; I would look at the Optimized Simple Discard method described below if you require additional efficiency instead.

The Simple Modular Method is time constant and deterministic but has non-zero (but negligible) bias. It requires a relatively large amount of additional randomness to achieve the negligible bias though; basically to have a bias of one out of 2^128 you need 128 bits on top of the bit size of the range required. This is probably not the method to choose for smaller numbers.

Your algorithm is clearly a version of the Simple Discard Method (more generally called "rejection sampling"), so it is fine.


I've myself thought of a very efficient algorithm based on the Simple Discard Method called the "Optimized Simple Discard Method" or RNG-BC where "BC" stands for "binary compare". It is based on the observation that comparison only looks at the most significant bits, which means that the least significant bits should still be considered random and can therefore be reused. Beware that this method has not been officially peer reviewed; I do present an informal proof of equivalence with the Simple Discard Method.


Of course you should rather use a generic method that is efficient given any value of N. In that case the Complex Discard Method or Simple Modular Method should be considered over the Simple Discard Method. There are other, much more complex algorithms that are even more efficient, but generally you're fine when using either of these two.

Note that it is often beneficial to first check if N is a power of two when generating a random in the range [0, N). If N is a power of two then there is no need to use any of these possibly expensive computations; just use the bits you need from the random bit or byte generator.

Maarten Bodewes
  • 90,524
  • 13
  • 150
  • 263
  • Note that you should never have to use hexadecimals within calculations. Hexadecimals are just human readable *representations* of the bits and bytes. All calculations should preferably be performed on the bits and bytes themselves. – Maarten Bodewes Jul 14 '18 at 12:29
  • 1
    Thanks for the great article. In my case, for the complex discard method, `999999^1/(2^20-1) ~ 0.95`, `999999^2/(2^40-1) ~ 0.91` ... so the best case is still t = 1`. If I use the simple modular method, I will need to use 70 bits for one random number!? It sounds much less efficient! Can you also explain more on the bitwise calculation? – wdetac Jul 16 '18 at 04:20
  • I know some of the bitwise operator like `shifts` and logical `and` `or`, but how they related to the algorithm? – wdetac Jul 16 '18 at 04:25
  • In your case you are lucky and the simple discard method works well. For other numbers it may require regeneration for up to 50 percent of the cases, in particular 0 to 2^X where the first bit always needs to be 0. That particular nasty can be avoided by simply taking X - 1 bits. – Maarten Bodewes Jul 16 '18 at 09:45
4

It's a correct algorithm (https://en.wikipedia.org/wiki/Rejection_sampling), though you could consider using bitwise operations instead of converting to hex. It can run forever if the random number generator is malfunctioning -- you could consider trying a fixed number of times and then throwing an exception instead of looping forever.

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
2

The main possible performance problem is that on some platforms, crypto.randomBytes can block if it runs out of entropy. So you don't want to waste any randomness if you're using it.

Therefore instead of your string comparison I would use the following integer operation.

if (random_bytes < 16700000) {
    return random_bytes = random_bytes - 100000 * Math.floor(random_bytes/100000);
}

This has about a 99.54% chance of producing an answer from the first 3 bytes, as opposed to around 76% odds with your approach.

btilly
  • 43,296
  • 3
  • 59
  • 88
  • @wdetac Oops, I was wrong. According to https://github.com/nodejs/node-v0.x-archive/issues/6372 it can block, but it use /dev/urandom under the hood which only blocks near system startup, and not /dev/random where you can use up entropy faster than you generate it. So using fewer random bytes is not an issue. But if you're using a strong cryptographic source at high speed, you want to use every ounce of randomness that you can. So make guesses in a way where you will if you can possibly. – btilly Jul 13 '18 at 17:07
2

I would suggest the following approach:

private generateCode(): string {
    let code: string = "";

    do {
        code += randomBytes(3).readUIntBE(0, 3);
        // code += Number.parseInt(randomBytes(3).toString("hex"), 16);
    } while (code.length < 6);

    return code.slice(0, 6);
}

This returns the numeric code as string, but if it is necessary to get it as a number, then change to return Number.parseInt(code.slice(0, 6))

trincot
  • 317,000
  • 35
  • 244
  • 286
alexojegu
  • 754
  • 7
  • 21
-1

I call it the random_6d algo. Worst case just a single additional loop.

var random_6d = function(n2){
    var n1 = crypto.randomBytes(3).readUIntLE(0, 3) >>> 4;

    if(n1 < 1000000)
        return n1;

    if(typeof n2 === 'undefined')
        return random_6d(n1);

    return Math.abs(n1 - n2);
};

loop version:

var random_6d = function(){
    var n1, n2;

    while(true){
        n1 = crypto.randomBytes(3).readUIntLE(0, 3) >>> 4;

        if(n1 < 1000000)
            return n1;

        if(typeof n2 === 'undefined')
            n2 = n1;
        else
            return Math.abs(n1 - n2);
    };
};