30

I have two questions regarding implementation of Random class in .NET Framework 4.6 (code available here):

  1. What is the rationale for setting Seed argument to 1 at the end of the constructor? It seems to be copy-pasted from Numerical Recipes in C (2nd Ed.) where it made some sense, but it doesn't have any in C#.

  2. It is directly stated in the book (Numerical Recipes in C (2nd Ed.)) that inextp field is set to value 31 because:

The constant 31 is special; see Knuth.

However, in the .NET implementation this field is set to value 21. Why? The rest of a code seems to closely follow the code from book except for this detail.

BartoszKP
  • 34,786
  • 15
  • 102
  • 130
houen
  • 931
  • 9
  • 18
  • 6
    Where's @EricLippert if you need him? – Tim Schmelter Aug 10 '15 at 09:54
  • 1
    *However, in the .NET implementation this field is set to value 21. Why?* They just picked a random number ;) – Patrick Hofman Aug 10 '15 at 09:54
  • 2
    Moreover, there's a constant MZ which is not used at all, instead the value `0` is simply used in those places. – konrad.kruczynski Aug 10 '15 at 09:57
  • Just to note, apparently the `31` is special because it's the difference between the two coefficients - 55 and 24. Using 21 instead seems to mean that the .NET implementation has a shorter periodicity than the original 31. As for setting the `Seed`, this indeed has no effect in C#. – Luaan Aug 10 '15 at 10:07

2 Answers2

23

Regarding the intexp issue, this is a bug, one which Microsoft has acknowledged and refused to fix due to backwards compatibility concerns.

Indeed, you have discovered a genuine problem with the Random implementation. We have discussed it within the team and with some of our partners and concluded that we unfortunately cannot fix the problem right now. The reason is that some applications rely on the fact that when initialised with the same seed, the generator produces the same pseudo random sequence. Even if the change is for the better, it will break the applications that made this assumption once they have migrated to the “fixed” version.

DGibbs
  • 14,316
  • 7
  • 44
  • 83
  • I don't understand why the generator no longer produces the same pseudo random sequence with the same seed if you set `intexp` to `31` instead of `21`. – Tim Schmelter Aug 10 '15 at 10:04
  • 5
    The algoritm keeps a buffer of 56 random integers which are initialized based on the seed value. Two indices in the buffer should be kept 31 positions apart (21 in this case), the pseudo random number is calculated as the difference of the values at these indices. You can't change the position of the indices without changing the output since they would reference different values. – DGibbs Aug 10 '15 at 10:19
  • 1
    @TimSchmelter: with the new `31` implementation it will produce a sequence that is different from the `21` implementation. To simplify extremely, Random.next() is a function of the seed and that 31 or 21 value. change the value, change the behavior of `next`. – njzk2 Aug 10 '15 at 14:37
  • @njzk2: So it still produces the same sequence with the same seed but it produces a different sequence compared to the old implementation? Ok, then it would break existing systems which rely on it. Thanks. – Tim Schmelter Aug 10 '15 at 14:42
  • @TimSchmelter exactly. – njzk2 Aug 10 '15 at 14:44
  • Is this still an issue with .Net Core? – Richard Ward Oct 01 '19 at 22:35
7

For some more context:

A while back I fully analysed this implementation. I found a few differences.

A the first one (perfectly fine) is a different large value (MBIG). Numerical Recipies claims that Knuth makes it clear that any large value should work, so that is not an issue, and Microsoft reasonably chose to use the largest value of a 32 bit integer.

The second one was that constant, you mentioned. That one is a big deal. In the minimum it will substantially decrease period. There have been reports that the effects are actually worse than that.

But then comes one other particularly nasty difference. It is literally guarenteed to bias the output (since it does so directly), and will also likely affect the period of the RNG.

So what is this second issue? When .NET first came out, Microsoft did not realize that the RNG they coded was inclusive at both ends, and they documented it as exclusive at the maximum end. To fix this, the security team added a rather evil line of code: if (retVal == MBIG) retVal--;. This is very unfortunately as the correct fix would literally be only 4 added characters (plus whitespace).

The correct fix would have been to change MBIG to int.MaxValue-1, but switch Sample() to use MBIG+1 (i.e. to keep using int.MaxValue). That would guarantee the that Sample has the range [0.0, 1.0) without introducing any bias, and only changes the value of MBIG which Numerical Recipies said Knuth said is perfectly fine.

Kevin Cathcart
  • 9,838
  • 2
  • 36
  • 32