Math.NET CryptoRandomSource Next is Biased

Question

I am developing a gaming platform that is subject to heavy regulatory scrutiny. I chose Math.NET because it seemed like a good fit. However I have just received this comment back from our auditors.

Comments please if this is accurate and how it can be resolved?

In RandomSource(), Next(int, int) is defined as follows:

    public override sealed int Next(int minValue, int maxValue)
    {
        if (minValue > maxValue)
        {
            throw new ArgumentException(Resources.ArgumentMinValueGreaterThanMaxValue);
        }

        if (_threadSafe)
        {
            lock (_lock)
            {
                return (int)(DoSample()*(maxValue - minValue)) + minValue;
            }
        }

        return (int)(DoSample()*(maxValue - minValue)) + minValue;
    }

This creates a bias in the same way as before. Using an un-scaled value from the RNG and multiplying it by the range without previously eliminating the bias (unless the range is a power of 2, there will be a bias ).

What is `DoSample()`? Without knowing exactly what that returns its hard to work out what is happening here. Is it just a uniformaly distributed random number between 0 and 1 (and which of the end points does it include/exclude)? Do you know for a fact that this is actually unbiased? — Chris, Aug 18 '16 at 08:42
Also what is the expected return value of the method? Should it be able to return maxValue or minValue? Also have you done basic testing of calling something like `Next(0,1)` millions (or billions) of times and tallying the results to see if they are significantly different (significance can sometimes be hard to work out when they are close but if they are orders of magnitude different for large sample size then you can be pretty sure something is wrong, especially if the result is repeatable. — Chris, Aug 18 '16 at 08:50
DoSample() method will return a double value between 0 and 1. There are no issues with this code. The bit I've quoted is what the auditors have a problem with, for when the range is not a power of 2 Unfortunately all our ranges are 0-99, 0-999, 0-9999, 0-99999. — IntoNET, Aug 18 '16 at 08:56
What version of Math.NET Numerics are you using? The implementation of recent versions are slightly different, i.e. "maxValue" is clarified as meaning "maxExclusive" and it will never generate this exclusive value. — Christoph Rüegg, Aug 18 '16 at 08:59
Actually, there was a recent fix in this area that could have caused it to generate it, which has been committed two weeks ago but not released yet - I intend to release a new version (3.13) including the fix in a couple days. — Christoph Rüegg, Aug 18 '16 at 09:04
@ChristophRüegg thanks for the response. I am using 3.12.0. from NuGet. Are you saying this issue is fixed in 3.13? — IntoNET, Aug 18 '16 at 09:14
The issue that maxExclusive can technically be generated (which a chance of 1:4'294'967'295) will be fixed in 3.13, yes. I intend to release it today. — Christoph Rüegg, Aug 18 '16 at 09:19
@ChristophRüegg I have overlooked the maxValue is an exclusive value, so my code should be rng.Next(0,100) for a value 0-99, and you can provide some mathematical proof there is no bias? I really like your library and want to use it, but ultimately we have to prove to the gambling commission auditors that there is 100% no bias. — IntoNET, Aug 18 '16 at 09:21

Christoph Rüegg · Answer 1 · 2016-08-19T06:20:50.240

2

Update: The implementation of Next(minInclusive, maxExclusive) has been changed in Math.NET Numerics v3.13 following this discussion. Since v3.13 it is no longer involving floating point numbers, but instead samples integers with as many bits as needed to support the requested range (power of two) and rejects those outside of the actual range. This way it avoids adding any bias on top of the byte sampling itself (as provided e.g. by the crypto RNG)

Assumption: DoSample() returns a uniformly distributed sample in the range [0,1) (double precision floating point number).

Multiplying it with the range R=max-min will result in a uniformly distributed sample in the range [0,R). Casting this to an integer, which is essentially a floor, will result in a uniformly distributed discrete sample of one of 0,1,2,...,R-1. I don't see where the fact that R is even, odd, or a power of two may affect bias in this step.

A few runs to compute 100'000'000 samples also do not indicate obvious bias, but of course this is no proof:

var r = new CryptoRandomSource();
long[] h = new long[8];
for (int i = 0; i < 100000000; i++)
{
    h[r.Next(2,7)]++;
}

0
0
19996313
20001286
19998092
19998328
20005981
0

0
0
20000288
20002035
20006269
19994927
19996481
0

0 
0 
19998296 
19997777 
20001463 
20002759 
19999705 
0

edited Aug 19 '16 at 06:20

answered Aug 18 '16 at 09:11

Christoph Rüegg

4,626
1
20
34

3

The reason R being odd, even or anything else is a sort of pigeonhole argument. a double returned on [0,1) will have 2^52 possible distinct values (the mantissa is 52 bits). If you multiply by R=5 then you still have 2^52 distinct values but now spread over [0,5). You take the floor of that and you have 2^52 values that are now in the set {0,1,2,3,4}. So how many of those 2^52 values are mapped to each number? Clearly 2^52 is not divisible by 5 so there cannot be the same chance of getting each integer in that range. – Chris Aug 18 '16 at 09:34
2

To understand better where the flaw is look at a much smaller floating point number that only has three relevant bits. DoSample now returns 0, 1/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8 (eight values). If we multiple these by five we get 0, 5/8, 10/8, 15/8, 20/8, 25/8, 30/8, 35/8. When floored this gives 0,0,1,1,2,3,3,4 - not uniform! – Chris Aug 18 '16 at 09:40
Given that my ranges are always known, for exmple 0-99, if I were to generate a number between 0-100 and then throw away and re-generate if 100 is returned, wouldn't that solve the power of 2 bias issue? – IntoNET Aug 18 '16 at 09:40
Thanks @Chris, yes, this is a valid point! Thanks for the clarification. – Christoph Rüegg Aug 18 '16 at 09:42
Related topic: http://stackoverflow.com/questions/11758809/what-is-the-optimal-algorithm-for-generating-an-unbiased-random-integer-within-a – Christoph Rüegg Aug 18 '16 at 09:43
@IntoNET: No offence but after that comment I really think you need to find a professional who understand these concepts and hire them. 100 is not a power of 2! But yes, I would imagine if you generated a number between 0-127 and threw away unwanted values and "rerolled" then that should be better. I am no expert though and wouldn't like to guarantee that it would pass close testing. – Chris Aug 18 '16 at 09:44
Duh, yes you're right. I've been staring at this problem for too long. :-\ – IntoNET Aug 18 '16 at 09:59
1

Update: I'm currently working on a bias-free implementation within Math.NET Numerics - essentially byte sampling (power of two) with rejection, not involving any floating point numbers. – Christoph Rüegg Aug 18 '16 at 10:40
Propsal for a bias-free implementation: https://github.com/mathnet/mathnet-numerics/commit/b559ca166e318584fdca8251ad46346f748ebddd (in case of CryptoRandomSource, DoSampleBytes is a direct call to the native implementation) – Christoph Rüegg Aug 18 '16 at 14:24
Released in v3.13. – Christoph Rüegg Aug 18 '16 at 16:47

score 0 · Answer 2 · answered Aug 18 '16 at 10:53

I've come up with this solution for a value between 0 and max inclusive. I'm no maths expert so comments welcome.

It seems to satisfy the regulatory spec I have which says

2b) If a particular random number selected is outside the range of equal distribution of re-scaling values, it is permissible to discard that random number and select the next in sequence for the purpose of re-scaling."

    private readonly CryptoRandomSource _random = new CryptoRandomSource();

    private int GetRandomNumber(int max)
    {
        int number;
        var nextPowerOfTwo = (int)Math.Pow(2, Math.Ceiling(Math.Log(max) / Math.Log(2)));

        do
        {
            // Note: 2nd param of Next is an *exclusive* value. Add 1 to satisfy this
            number = _random.Next(0, nextPowerOfTwo + 1);
        } while (number > max);

        return number;
    }

Math.NET CryptoRandomSource Next is Biased

2 Answers2