
I have a loop of VB6 code that executes more than 1,000,000 times. Each iteration generates a 32-bit random number. The loop runs at about 250 iterations per second. The problem is that I end up with about 30,000 duplicate numbers. My understanding is that the Rnd function uses the system's elapsed milliseconds since startup. That should mean the "seed" changes with each iteration, but I am still getting duplicates.
Example:

For i = 1 To 1000000
    'do a bunch of code

    'get a 32-bit random number using Rnd twice in a function with
    'a Randomize statement before each Rnd

    'do another bunch of code
Next i

Any ideas?
Thanks

GSerg
Doug Jones
  • 4
    Generating a random number does not remove it from the pool of random numbers. In other words, it's possible _and likely_ that some numbers will be duplicates. This isn't a bug; it's how unpredictability works. If you want unique numbers, you need to enforce this yourself by discarding duplicates. – cdhowie Aug 16 '19 at 17:09
  • 6
    I'm going to guess that the problem is because you are calling randomize in the loop. If you constantly reseed, and use the same seed multiple times, that may generate large #s of duplicates. The loop will run so fast that you use the same seed time multiple times. Just call randomize once, before the loop - does that help? – StayOnTarget Aug 16 '19 at 17:35
  • 2
    [This answer](https://stackoverflow.com/a/41104557/4996248) (to a question I once asked) has a nice graphic which illustrates some of the odd things which can happen if you call `Randomize` in a loop. – John Coleman Aug 16 '19 at 19:37
  • Think about it this way: suppose instead of choosing a million 32 bit numbers, you chose four 3 bit numbers -- that is, between 0 and 7. What is the probability of getting a duplicate? It's actually easier to work out the probability of getting no duplicate. If you do the math, you will see that duplicates are *extremely common*. 30000 duplicates amongst a million numbers drawn from only four billion possibilities is about right. If you want no duplicates, draw them from 128 bit numbers; there are enough of those that you will not get duplicates. – Eric Lippert Aug 19 '19 at 17:24

1 Answer


The intended use of Randomize and Rnd is that you call Randomize once at the start of the program to initialize the random number generator, and then call Rnd for the sequence of random numbers you need. Calling Randomize before each use of Rnd is counterproductive, since you're reinitializing the generator each time you call it.
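As a minimal sketch of that pattern (not the asker's actual code, which wasn't posted): `Random32` below is a hypothetical helper that combines two Rnd calls into one value, and the 16-bit split is just one assumed way of doing that. The point is only where Randomize sits.

```vb
Option Explicit

' Hypothetical helper: builds a value in 0..4294967295 from two Rnd calls.
' Returned as a Double because VB6 has no unsigned 32-bit integer type.
Private Function Random32() As Double
    Dim hi As Double
    Dim lo As Double

    hi = Int(Rnd * 65536#)      ' upper 16 bits, 0..65535
    lo = Int(Rnd * 65536#)      ' lower 16 bits, 0..65535
    Random32 = hi * 65536# + lo
End Function

Public Sub Main()
    Dim i As Long
    Dim r As Double

    Randomize                   ' seed once, before the loop

    For i = 1 To 1000000
        ' do a bunch of code

        r = Random32()          ' no Randomize in here

        ' do another bunch of code
    Next i
End Sub
```

This would go in a standard module with Sub Main as the startup object. If the helper lives elsewhere in your project, the only change that matters is moving Randomize out of it and calling it once at startup.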

Note that you're still likely to get some duplicate numbers, just because that's how randomness works. Eric Lippert's article "Socks, birthdays, and hash collisions" shows that your chance of at least one collision is pretty close to 100% if you're drawing 1 million numbers from 2^32 choices.
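To put rough numbers on that, here is the usual birthday-problem estimate, assuming n independent draws that are uniform over N equally likely values (n and N are symbols introduced here for illustration, not anything from the question):

$$
E[\text{colliding pairs}] = \binom{n}{2}\frac{1}{N} = \frac{n(n-1)}{2N},
\qquad
P(\text{at least one collision}) \approx 1 - e^{-n(n-1)/(2N)}
$$

$$
n = 10^{6},\ N = 2^{32}: \quad \frac{10^{6}\,(10^{6}-1)}{2 \cdot 2^{32}} \approx 116,
\qquad
1 - e^{-116} \approx 1
$$

So a truly uniform 32-bit source would be expected to give on the order of a hundred duplicates in a million draws, which makes a collision near certain but is well short of 30,000. The same formula with N = 2^24 gives about 29,800, which is close to the reported count. As far as I know, Rnd is a linear congruential generator with a 24-bit state, so that may be the effective pool size however the two calls are combined, but I have not checked that against the asker's actual code.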

Also, be aware that Rnd doesn't give you cryptographic-strength random numbers, so if you need those, you'll have to call the Windows Crypto API or otherwise find a better source.
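For the Crypto API route, a rough sketch for 32-bit VB6 might look like the following. The three advapi32 functions are real Win32 calls, but the Declare statements are my own transcription of them for VB6, and `FillRandomBytes` is a hypothetical wrapper name, so treat this as a starting point rather than tested code.

```vb
Option Explicit

Private Const PROV_RSA_FULL As Long = 1
Private Const CRYPT_VERIFYCONTEXT As Long = &HF0000000

Private Declare Function CryptAcquireContext Lib "advapi32.dll" _
    Alias "CryptAcquireContextA" (ByRef phProv As Long, _
    ByVal pszContainer As String, ByVal pszProvider As String, _
    ByVal dwProvType As Long, ByVal dwFlags As Long) As Long

Private Declare Function CryptGenRandom Lib "advapi32.dll" _
    (ByVal hProv As Long, ByVal dwLen As Long, _
    ByRef pbBuffer As Byte) As Long

Private Declare Function CryptReleaseContext Lib "advapi32.dll" _
    (ByVal hProv As Long, ByVal dwFlags As Long) As Long

' Hypothetical wrapper: fills buf with random bytes from the OS provider.
' Returns True on success, False if either API call fails.
Public Function FillRandomBytes(ByRef buf() As Byte) As Boolean
    Dim hProv As Long
    Dim byteCount As Long

    ' CRYPT_VERIFYCONTEXT: no key container is needed just for random bytes
    If CryptAcquireContext(hProv, vbNullString, vbNullString, _
                           PROV_RSA_FULL, CRYPT_VERIFYCONTEXT) = 0 Then
        Exit Function
    End If

    byteCount = UBound(buf) - LBound(buf) + 1
    FillRandomBytes = (CryptGenRandom(hProv, byteCount, buf(LBound(buf))) <> 0)

    CryptReleaseContext hProv, 0
End Function
```

A caller would declare something like `Dim b(0 To 3) As Byte`, pass it to `FillRandomBytes`, and assemble the four bytes into a 32-bit value. Acquiring the provider once and reusing the handle would be cheaper than doing it on every call inside a million-iteration loop.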