0

I was trying to simulate the 649 lottery draw using VBA subroutine. For the lottery draw, six balls will be selected by ball machine, in the beginning there are 49 balls, each with a 1/49 probability of being selected, and after the first ball was selected, the rest 48 balls will then each have a 1/48 probability of being selected, and so on.

There is no direct VBA function to generate random numbers such that the interval is not consecutive; for instance, the first number selected is 3, and for the second number selection, 3 will not be available! So computer has to choose from 1, 2, 4, ..., 49.

Below is a subroutine I wrote, basically I used Int((UBound(array) - 1 + 1) * Rnd + 1) to first generate random number between integer intervals, but I treat the random number only as index; for example, for the second number selection where I have the above 48 number left: 1, 2, 4, ..., 49, now if the random number is 3 (chosen from between 1 to 48), I actually get 4 for the second number selection because it's the 3rd in the list. And Rnd() provides draw from uniform distribution, so each number is equally likely. This is the method I use to get around.

Then I record all previous selected numbers into s1 to s6, and then make them non-repetitive in the subsequent number selection.

At last I sort using a quicksort algorithm found at VBA array sort function? with slight modification to the input array. And output results on an empty worksheet.

I also used Randomize to increase randomness. So everything seems good, I'm mimicking exactly the ball machine does: select the first number, then the second... and at last the sixth, without putting back (non-repetitive), the only difference I think would be ball machine is True random number, whereas VBA is Pseudo random number.

To my surprise, for 100,000 simulations, I used the Remove Duplicates and then 79994 duplicate values found and removed; 20006 unique values remain. Now I feel it is not reliable. How could most draws have duplicates? Tried many time but same thing, lots of duplicates. I'm not sure where has gone wrong, if something wrong with this design and logic, or it's just because Pseudo random number? Thank you all!

Here is my code:

Public k As Long

Sub RNG()

Dim NUMBER(), SELECTION(1 To 100000, 1 To 6)
Dim i As Integer, j As Integer, n As Integer
Dim s1 As Integer, s2 As Integer, s3 As Integer, s4 As Integer, s5 As Integer, s6 As Integer

For k = 1 To 100000
    Erase NUMBER
    ReDim NUMBER(1 To 49)
    For i = 1 To 49
        NUMBER(i) = i
    Next i

    For j = 1 To 6
        'generate random number as index and select number based on index
        Randomize
        random_number = Int((UBound(NUMBER) - 1 + 1) * Rnd + 1)
        SELECTION(k, j) = NUMBER(random_number)
        'record each selection
        Select Case j
            Case Is = 1
                s1 = SELECTION(k, j)
            Case Is = 2
                s2 = SELECTION(k, j)
            Case Is = 3
                s3 = SELECTION(k, j)
            Case Is = 4
                s4 = SELECTION(k, j)
            Case Is = 5
                s5 = SELECTION(k, j)
            Case Is = 6
                s6 = SELECTION(k, j)
        End Select

        'recreate number 1 to 49 by excluding already-selected numbers
        Erase NUMBER
        ReDim NUMBER(1 To 49 - j)

        n = 0
        For i = 1 To 49
            Select Case j
                Case Is = 1
                    If i <> s1 Then
                        n = n + 1
                        NUMBER(n) = i
                    End If

                Case Is = 2
                    If i <> s1 And i <> s2 Then
                        n = n + 1
                        NUMBER(n) = i
                    End If

                Case Is = 3
                    If i <> s1 And i <> s2 And i <> s3 Then
                        n = n + 1
                        NUMBER(n) = i
                    End If

                Case Is = 4
                    If i <> s1 And i <> s2 And i <> s3 And i <> s4 Then
                        n = n + 1
                        NUMBER(n) = i
                    End If

                Case Is = 5
                    If i <> s1 And i <> s2 And i <> s3 And i <> s4 And i <> s5 Then
                        n = n + 1
                        NUMBER(n) = i
                    End If

            End Select
        Next i
    Next j

    Call QuickSort(SELECTION, 1, 6)

Next k

Range("A1:F" & k - 1).Value = SELECTION

End Sub


Public Sub QuickSort(vArray As Variant, inLow As Long, inHi As Long)
'https://stackoverflow.com/questions/152319/vba-array-sort-function

  Dim pivot   As Variant
  Dim tmpSwap As Variant
  Dim tmpLow  As Long
  Dim tmpHi   As Long

  tmpLow = inLow
  tmpHi = inHi

  pivot = vArray(k, (inLow + inHi) \ 2)

  While (tmpLow <= tmpHi)
  While (vArray(k, tmpLow) < pivot And tmpLow < inHi)
    tmpLow = tmpLow + 1
  Wend

  While (pivot < vArray(k, tmpHi) And tmpHi > inLow)
    tmpHi = tmpHi - 1
  Wend

  If (tmpLow <= tmpHi) Then
    tmpSwap = vArray(k, tmpLow)
    vArray(k, tmpLow) = vArray(k, tmpHi)
    vArray(k, tmpHi) = tmpSwap
    tmpLow = tmpLow + 1
    tmpHi = tmpHi - 1
 End If
Wend

If (inLow < tmpHi) Then QuickSort vArray, inLow, tmpHi
If (tmpLow < inHi) Then QuickSort vArray, tmpLow, inHi
End Sub
Pᴇʜ
  • 56,719
  • 10
  • 49
  • 73
Passer-by
  • 71
  • 2
  • 10
  • What happens if you try moving the `Randomize` call outside the loops? After you declare your variables, and before the first loop. – Josh Eller Feb 27 '19 at 18:03
  • Thanks for your reply, just tried now only get about 300 duplicates, why was that? – Passer-by Feb 27 '19 at 18:11
  • No problem! Easy mistake to make when working with random values. I added the 'why' as an answer. – Josh Eller Feb 27 '19 at 18:18
  • 1
    Thinking object oriented: Put your numbers in a collection, instead of an array. Then you can set your random pick from `Collection.Count` and remove the item without having to `ReDim` etc. Nor will you have to do complicated checks because you know that each pick will be unique. This is not an answer to your OP question, so just added as a comment. – AJD Feb 27 '19 at 20:21

1 Answers1

1

Random numbers in VBA (and in pretty much every language) aren't actually random; they're pseudo-random. Software can't pull random numbers from thin air, they have to come from some algorithm; that algorithm needs some input value. The input value to a random number generator is called a seed. A random number algorithm will always generate the same 'random' value for a given input seed.

If you look at the documentation for Randomize, if you don't supply a seed value, it gets it from the system time. Since the loops are happening very quickly, the system time ends up being the same for some iterations, and Randomize just sets the seed exactly the same as it did in the last loop.

So, you just want to call the Randomize function once, at the start of your program, instead of during every loop. This ensures that your overall outcome is random (if the Randomize function is not called, Rnd uses the value last returned from Rnd as its input seed.)

Josh Eller
  • 2,045
  • 6
  • 11
  • So if seed value is omitted, what format is the system time? For example, if it is 2019-02-27 10:55:28 PM returned by the VBA function Now, if I were to input a time as seed manually, what format would it be? Also other than time, what can be used as seed? Numbers only? And in order to best mimic True random number generator, how to use Randomize with Rnd? Should I assign some seed to Randomize, or just use system time by default? – Passer-by Feb 28 '19 at 04:16
  • @Passer-by Calling Randomize doesn't make your values any more random in the sense that you're thinking. It makes your values more or less _repeatable_. If you got rid of it entirely, your program would behave exactly the same way it does now. The difference being, because Rnd uses the last value generated as its input seed, if someone knew one of your random values, they would be able to generate the same list of every random value after that one. Use Randomize with a given seed if you want to be able to repeat your random sequence, and use it with the default system time if you don't. – Josh Eller Feb 28 '19 at 13:21