0

While emulating dice rolls with the Rnd function I noticed some of the outcomes were more frequent than they were supposed to be.

Example code:

' Note, depending on computer speed this procedure may take about a minute to run
Sub sim3()

    Dim intFirst As Integer, intSecond As Integer, intDie1 As Integer, intDie2 As Integer
    Dim i As Long, j As Long, lngCount As Long, lngExpected As Long, lngLowerCount As Long, lngIterations As Long

    lngIterations = 1000000

    ' select dice roll
    intDie1 = 1 ' any number between 1 and 6
    intDie2 = 3 ' any number between 1 and 6
        
    ' expected frequency 
    ' (= 55,555 if lngIterations = 1,000,000 and intDie1 <> intDie2, = 27777 if lngIterations = 1,000,000 and intDie1 = intDie2)
    If intDie1 = intDie2 Then lngExpected = Int((1 / 36) * CDbl(lngIterations)) Else _
        lngExpected = Int((2 / 36) * CDbl(lngIterations)) 

    For i = 1 To 100
        
        lngCount = 0
        
        For j = 1 To lngIterations
            
            If j Mod 10000 = 0 Then DoEvents ' outcomment for faster execution
            
            intFirst = randomDie
            intSecond = randomDie
            
            ' count occurences of specific outcomes
            If intFirst = intDie1 And intSecond = intDie2 Then ' 1,4
                lngCount = lngCount + 1
            ElseIf intFirst = intDie2 And intSecond = intDie1 Then ' 4, 1
                lngCount = lngCount + 1
            End If
                        
        Next j
        
        If lngCount < lngExpected Then lngLowerCount = lngLowerCount + 1
        
        Debug.Print i & ": #favourable outcomes: " & lngCount ' outcomment for faster execution
        
    Next i
    
    Debug.Print "(" & intDie1 & "," & intDie2 & ") #expected favourable outcomes per iteration (int.): " & lngExpected
    Debug.Print "(" & intDie1 & "," & intDie2 & ") #iterations with lower than expected number of favourable outcomes: " & lngLowerCount
    Debug.Print "(" & intDie1 & "," & intDie2 & ") Prob. of obtaining result or lower, F(x|n,p) : " & WorksheetFunction.Binom_Dist(lngLowerCount, i, 0.5, True)
    
End Sub

The randomDie function used in the procedure is standard code for generating an integer between 1 and 6 (source):

Function randomDie() As Integer
    Randomize
    randomDie = Int((6 * Rnd) + 1)
End Function

Notice the Randomize statement which shifts the seed number of VBA's PRNG algorithm each time the function is called which means the results of the sim3 procedure are not the same each time it is executed.

The results for the 21 combos of dice rolls along with the probability of obtaining that or a lower result:

enter image description here

We would expect the results of the favourable outcomes to be about evenly distributed around the mean (μ = 50, i=100), but these results are absolutely extreme.

Are there flaws in my code, is my computer the problem or is the VBA PRNG biased?

Community
  • 1
  • 1
Miqi180
  • 1,670
  • 1
  • 18
  • 20

2 Answers2

2

Do not call Randomize every time you need a random number. This is where the bug is.

Randomize uses the system timer with a resolution of a 50 milliseconds to change the seed. Call it twice within this time window and it will result in the same random number sequence.

So just run your code with Rnd calls multiple times, and Randomize just once (if any).

Function randomDie() As Long
    randomDie = CLng((6 * Rnd) + 1)
End Function

PS. Avoid using Integer in VBA as it is a 16bit number than can overflow easily. Use Long instead everywhere, which is a nice native 32bit integer.

JAlex
  • 1,486
  • 8
  • 19
  • In addition to your P.S. - I believe using `Integer` in VBA is actually slower than using `Long` because it gets boxed? – jamheadart Jul 09 '20 at 14:47
  • @jamheadart - I am not sure why it would be boxed, but it would use only the lower part of the 32-bit registers and discard the rest. I am not aware of any speed impact to this. Boxing occurs when the value is stored in the heap as opposed to the stack, and that is something internal to VBA and not a function of the data type. – JAlex Jul 09 '20 at 14:50
  • @JAlex: interesting, I'll test and get back. Excatly where do you get this from? It says nothing about that in the [MSDN documentation](https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/randomize-statement) about `Randomize`. – Miqi180 Jul 09 '20 at 14:52
  • @jamheadart open an immediate window and type `? 5000*500` to receive an overflow error. But `? 50000*50` does not error. This is because numbers less than 16384 are `Integer`. In the first case `Integer*Integer = Integer` and the result does not fit. In the second case `Long*Integer = Long` and everything is fine. It is **this simple** to mess up with `Integer` types. – JAlex Jul 09 '20 at 14:54
  • I thought boxing was the term used for conversion between types, like a date to a double or a 16 bit Integer and a 32 bit Long. But I have no idea why I thought that, probably from some thread a long long long time ago. – jamheadart Jul 09 '20 at 14:57
  • 1
    Moving `Randomize` to the `sim3` procedure did the trick. I also want to mention that I normally only use the data type `Long`, except for cases like this where there is absolutely no chance of an overflow error. – Miqi180 Jul 09 '20 at 15:07
  • 1
    @Miqi180 - even if there's no chance of an Overflow, you still should use `Long`, see [this](https://stackoverflow.com/questions/26409117/why-use-integer-instead-of-long). – BigBen Jul 09 '20 at 15:10
  • "Boxing" (in .NET) is when a value type gets "wrapped" and stored as an object on the heap. https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing – Tim Williams Jul 09 '20 at 15:57
0

You could leave the pseudo-random method and go for truly random numbers as described in my project VBA.Random.

It comes with the traditional throw dice demo:

' Simulate trows of dice, and return and list the result.
' Calculates and prints the average pip value and its
' offset from the ideal average.
'
' Example:
'   ThrowDice 10, 7
'
'               Die 1         Die 2         Die 3         Die 4         Die 5         Die 6         Die 7         Die 8         Die 9         Die 10
' Throw 1          3             6             3             2             4             1             5             3             3             2
' Throw 2          1             3             1             6             5             1             1             3             2             2
' Throw 3          4             1             1             5             5             3             2             1             4             4
' Throw 4          3             3             6             6             5             3             1             4             6             4
' Throw 5          5             1             6             6             2             6             6             2             4             6
' Throw 6          6             3             1             5             6             4             2             5             6             5
' Throw 7          4             2             5             3             3             1             6             3             2             1
'
' Average pips: 3.50          0,00% off
'
' Note: Even though this example _is_ real, don't expect the average pips to be exactly 3.50.
'
' 2019-12-26. Gustav Brock, Cactus Data ApS, CPH.
'
Public Function ThrowDice( _
    Optional Throws As Integer = 1, _
    Optional Dice As Integer = 1) _
    As Integer()
    
    ' Array dimensions.
    Const DieDimension      As Long = 1
    Const ThrowDimension    As Long = 2
    
    ' Pip values.
    Const MaximumPip        As Double = 6
    Const MinimumPip        As Double = 1
    ' The average pip equals the median pip.
    Const AveragePip        As Double = (MinimumPip + MaximumPip) / 2
    Const NeutralPip        As Double = 0
    
    Dim DiceTrows()         As Integer
    Dim Die                 As Integer
    Dim Throw               As Integer
    Dim Size                As Long
    Dim Total               As Double
    
    If Dice <= 0 Or Throws <= 0 Then
        ' Return one throw of one die with unknown (neutral) result.
        Throws = 1
        Dice = 1
        Size = 0
    Else
        ' Prepare retrieval of values.
        Size = Throws * Dice
        QrnIntegerSize Size
        QrnIntegerMaximum MaximumPip
        QrnIntegerMinimum MinimumPip
    End If
    
    ReDim DiceTrows(1 To Dice, 1 To Throws)

    If Size > 0 Then
        ' Fill array with results.
        For Throw = LBound(DiceTrows, ThrowDimension) To UBound(DiceTrows, ThrowDimension)
            For Die = LBound(DiceTrows, DieDimension) To UBound(DiceTrows, DieDimension)
                DiceTrows(Die, Throw) = QrnInteger
                Total = Total + DiceTrows(Die, Throw)
            Next
        Next
    End If
    
    ' Print header line.
    Debug.Print , ;
    For Die = LBound(DiceTrows, DieDimension) To UBound(DiceTrows, DieDimension)
        Debug.Print "Die" & Str(Die), ;
    Next
    Debug.Print
    
    ' Print results.
    For Throw = LBound(DiceTrows, ThrowDimension) To UBound(DiceTrows, ThrowDimension)
        Debug.Print "Throw" & Str(Throw);
        For Die = LBound(DiceTrows, DieDimension) To UBound(DiceTrows, DieDimension)
            Debug.Print , "   " & DiceTrows(Die, Throw);
        Next
        Debug.Print
    Next
    Debug.Print
    
    ' Print total.
    If DiceTrows(1, 1) = NeutralPip Then
        ' No total to print.
    Else
        Debug.Print "Average pips:", Format(Total / Size, "0.00"), Format((Total / Size - AveragePip) / AveragePip, "Percent") & " off"
        Debug.Print
    End If
    
    ThrowDice = DiceTrows

End Function
Gustav
  • 53,498
  • 7
  • 29
  • 55
  • Thanks for pointing out this interesting project. There is indeed a difference between PRNGs and truly random numbers. I tried `ThrowDice` in the test file in the link you provided and it appears to work well. I haven't had time to properly analyze the whole code yet, though. Any comments on how to best run really large simulations with billions of iterations with your code? – Miqi180 Aug 18 '20 at 18:57
  • Uh, this demo was just for fun. For that amount of iterations, VBA will most likely be too slow. I ran a test without the debug, and 1000 * 1000 took about 12s, so I guess you will need C# or similar to speed it up. Also, I don't know of an upper limit of numbers per batch to retrieve from the service. Ultimately, for real speed, you may obtain the physical PCI card. – Gustav Aug 19 '20 at 09:03