0

Imagine we have two independent pseudo-random number generators using same algorithm but seeded differently. And we are generating numbers of same size using these generators, say 32-bit integers. Provided algorithm gives us uniform distribution, there is 1/2^32 probability (or is it?) of a collision. If a collision just happened, what is the probability the very next pair will also be a collision? It seems for me this probability might be different (higher) from that initial uniform-based collision chance. Most of currently existing pseudo-random number generators hold internal state to maintain own stability, and recently happened collision might signal those internal states are somewhat "entangled" giving modified (higher) chance of a collision to happen again.

The question is probably too broad to give any precise answer, but revealing general directions/trends could also be nice. Here are some interesting aspects:

  • Does size of initial collision matter? Is there a difference after a collision of 8 consecutive bits vs 64 bits? How approximately chance of next collision depends on size of generated sequence?

  • Does pattern of pair generation matter? For example, we could find initial collision by executing first generator only once and "searching" second generator. Or we could invoke each generator on every iteration.

  • I'm particularly interested in default javascript Math.random(). 32-bit integers can be generated of that like this (for example). EDIT: As pointed in comments, conversion of random value from [0; 1) range should be done carefully, as exponent of such values is very likely to repeat (and it takes decent part of result extracted this way).

Vasily Liaskovsky
  • 2,248
  • 1
  • 17
  • 32
  • 1
    The answer is highly dependent on the details of the actual generator. However, for most "good" generators with reasonable internal state the probability of a "collision" in your scenario will be 1/2^32 as you conjectured and the probability of the next collision will also be 1/2^32, and so on. I don't know what javascript uses. – President James K. Polk Jan 28 '22 at 00:03
  • using your linked `ArrayBuffer` answer, one number will not be uniform. the most significant 12 bits (i.e. the sign + 11 exponent bits) of values returned from `Math.random()` will always be constant. hence one number will only have 20 random bits, while the other will likely be uniform. where these bits end up is presumably endian dependent, so little endian processors (e.g. x86) will likely end up with the first number being uniform and the second only being able to represent 0.02% of 32bit values – Sam Mason Jan 28 '22 at 14:14
  • Good point, Sam. I guess this is just a poor example of how to convert floating point number to 32-bit integer, we should do it more carefully. Provided we finally can generate bit sequence without such caveats, the main question is still here. – Vasily Liaskovsky Jan 28 '22 at 19:05
  • For the V8 Chrome / Edge / Node.js engine, the [Math.random source](https://chromium.googlesource.com/v8/v8/+/refs/heads/main/src/numbers/math-random.cc) references the [base::RandomNumberGenerator](https://chromium.googlesource.com/v8/v8/+/refs/heads/main/src/base/utils/random-number-generator.cc) source which seeds entropy from the system time or high resolution time, and appears to employ the [XorShift128 algorithm](https://en.wikipedia.org/wiki/Xorshift)... – Trentium Jan 28 '22 at 21:16
  • @Trentium chrome on most systems will use the kernel for entropy (e.g. urandom on unix), only falling back to using time if this fails – Sam Mason Jan 29 '22 at 12:07
  • @VasilyLiaskovsky what's wrong with `Math.floor(Math.random() * 2**32)`? current versions of Firefox/Chrome have reasonable prngs and hence the first comment should hold. hence the probability of seeing repeats would be `2**-32` as you'd expect – Sam Mason Jan 29 '22 at 12:12

0 Answers0