6

For a game I'm trying to determine the frequency with which a certain number will show up for a given number of dice being rolled. I know... that question seems odd. Let me try to explain it with real numbers.

So, for 1 die, the frequency for each number will be identical: 1-6 will each show up an equal number of times.

Now for 2 dice, things get different. I imagine 5, 6, 7 are going to be the most frequently rolled, while numbers at both ends of the spectrum will show up less or not at all (in the case of 1). I'd like to know how to calculate this list and show the totals in the proper order, from most frequent to least frequent.

Any thoughts?


@duffymo - It would be really nice though to have some sort of an algorithm to come up with it. It seems that the above way is going to require a lot of hand-picking and placing of numbers. If my die count is dynamic up to, say, 10, doing that by hand will be inefficient and troublesome I think. :)

Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
bugfixr
  • 7,997
  • 18
  • 91
  • 144
  • Are you adding the numbers up? Why did you come to the conclusion that 5,6 and 7 would show up more? – JoshFinnie Jan 29 '09 at 20:26
  • 1
    @JoshFinnie - it could be me that is just assuming incorrectly that 5,6,7 would be more frequent- but I base that assumption on the fact that a roll to get 5 could be 2+3 & 4+1 whereas a 3 can only show up with a 2+1; a six is 3+3, 4+2, 5+1; etc. – bugfixr Jan 29 '09 at 20:30

9 Answers

11

There are 6*6 = 36 combinations for two dice.

2 = 1+1 can only appear once, so its frequency is 1/36. 3 = 1+2 or 2+1, so its frequency is 2/36 = 1/18. 4 = 1+3, 2+2, or 3+1, so its frequency is 3/36 = 1/12.

You can do the rest out to twelve.

Any backgammon player knows these well.
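Doing the rest of the table by hand gets tedious; a minimal Python sketch of the same counting argument, tallying all 36 ordered pairs:

```python
from collections import Counter
from itertools import product

# Tally all 36 ordered (die1, die2) pairs by their sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in sorted(counts):
    print(f"{total:2d}: {counts[total]}/36")
```

Sorting by frequency instead (`counts.most_common()`) gives the most-frequent-first order the question asks for.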

duffymo
  • 305,152
  • 44
  • 369
  • 561
5

Rough draft of a recursive way to do it:

using System.Collections.Generic;
using System.Linq;

public static IEnumerable<KeyValuePair<int, int>> GetFrequenciesByOutcome(int nDice, int nSides)
{
    int maxOutcome = nDice * nSides;
    Dictionary<int, int> outcomeCounts = new Dictionary<int, int>();
    for (int i = 0; i <= maxOutcome; i++)
        outcomeCounts[i] = 0;

    // Tally every possible ordered sequence of die faces.
    foreach (int possibleOutcome in GetAllOutcomes(0, nDice, nSides))
        outcomeCounts[possibleOutcome]++;

    // Totals below nDice can never occur; drop their zero entries.
    return outcomeCounts.Where(kvp => kvp.Value > 0);
}

private static IEnumerable<int> GetAllOutcomes(int currentTotal, int nDice, int nSides)
{
    if (nDice == 0) yield return currentTotal;
    else
    {
        for (int i = 1; i <= nSides; i++)
            foreach (int outcome in GetAllOutcomes(currentTotal + i, nDice - 1, nSides))
                yield return outcome;
    }
}

Unless I'm mistaken, this should spit out KeyValuePairs organized like [key, frequency].

EDIT: FYI, after running this, it shows the frequencies for GetFrequenciesByOutcome(2, 6) to be:

2: 1
3: 2
4: 3
5: 4
6: 5
7: 6
8: 5
9: 4
10: 3
11: 2
12: 1

mqp
  • 70,359
  • 14
  • 95
  • 123
  • I wonder why this was downvoted, as nobody else has provided an algorithm. – mqp Jan 29 '09 at 20:47
  • #1 is a calculation, not an algorithm. It has been suggested that the calculation be done manually and then the table hard-coded (ouch) - I'd much rather have a real algorithm like this than have to manually calculate the numbers and hard code them. – bugfixr Jan 29 '09 at 21:04
  • The algorithm is fairly straightforward like bubblesort is...and with similar efficiency problems. It is the *only* algorithm presented so far, but for large numbers it's going to be far more intensive than needed. Better to just code the math behind it. (Some other posts have links to the math). – Beska Jan 29 '09 at 21:22
  • That's right, the math is better. The algorithm is just a straightforward simulation of every possible dice roll, which, as you can imagine, could be a lot of dice rolls. – mqp Jan 29 '09 at 21:24
5

There is no real "algorithm" or simulation necessary - it's a simple calculation based on a formula derived by De Moivre:

http://www.mathpages.com/home/kmath093.htm

And it's not a "bell curve" or normal distribution.
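In sketch form, the standard statement of De Moivre's result (my transcription, not code from the linked page): the number of ways to roll a total of p with n s-sided dice is the inclusion-exclusion sum over k of (-1)^k C(n, k) C(p - sk - 1, n - 1).

```python
from math import comb

def ways(total, n_dice, sides):
    """Ordered rolls of n_dice s-sided dice summing to total,
    via De Moivre's inclusion-exclusion formula."""
    return sum((-1) ** k * comb(n_dice, k) * comb(total - sides * k - 1, n_dice - 1)
               for k in range((total - n_dice) // sides + 1))

print(ways(7, 2, 6))  # 6 ways out of 36
```

Dividing by sides ** n_dice turns the count into a probability.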

Cade Roux
  • 88,164
  • 40
  • 182
  • 265
  • 1
    Not a bell curve? You sure about that? I mean, it's definitely shaped like a bell, although I'll admit that my statistics are weak enough that I could be convinced it wasn't "normal" distribution, and maybe not the classic "bell curve". – Beska Jan 29 '09 at 20:53
  • (BTW, great link...I'll vote it up if you can convince me that it's not really a bell curve.) – Beska Jan 29 '09 at 21:00
  • 1
    It's a discrete distribution, so it couldn't be a true normal distribution. However, like most distributions, it is asymptotically normal - for large numbers of dice, the normal distribution is a better approximation (and computationally easier). – user57368 Jan 29 '09 at 21:01
  • That's fine...that's what I thought and I hedged that way in my post: that it approximates a normal distribution. That was the problem I had with this answer...it implies they're two different beasts, rather than just a matter of stretching the iterations out to infinity. – Beska Jan 29 '09 at 21:18
  • The larger the number of dice, the closer the approximation to a normal distribution. For this reason (the factorials in the combinatorics) for large numbers of dice the normal distribution can be used more quickly. – Cade Roux Jan 29 '09 at 21:41
3

Add up the array of frequencies from the previous roll, 'side number' times, shifting its position each time, and you get the array of frequencies with which each number shows up.

1, 1, 1, 1, 1, 1  # 6 sides, 1 roll

1, 1, 1, 1, 1, 1
   1, 1, 1, 1, 1, 1
      1, 1, 1, 1, 1, 1
         1, 1, 1, 1, 1, 1
            1, 1, 1, 1, 1, 1
+              1, 1, 1, 1, 1, 1
_______________________________
1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1  # 6 sides, 2 rolls

1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
   1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
      1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
         1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
            1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
+              1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1
______________________________________________
1, 3, 6,10,15,21,25,27,27,25,21,15,10, 6, 3, 1  # 6 sides, 3 rolls

This is much faster than brute-force simulation, since the recurrence does far less work than enumerating every roll. Here is my Python 3 implementation.

def dice_frequency(sides:int, rolls:int) -> list:
    if rolls == 1:
        return [1]*sides
    prev = dice_frequency(sides, rolls-1)
    return [sum(prev[i-j] for j in range(sides) if 0 <= i-j < len(prev))
            for i in range(rolls*(sides-1)+1)]

for example,

dice_frequency(6,1) == [1, 1, 1, 1, 1, 1]
dice_frequency(6,2) == [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
dice_frequency(6,3) == [1, 3, 6, 10, 15, 21, 25, 27, 27, 25, 21, 15, 10, 6, 3, 1]

Note that you should use 'target number - roll count' as the index into the list to get the frequency of each number. If you want probabilities, use 'side number' ^ 'roll count' as the denominator.

sides = 6
rolls = 3
freq = dice_frequency(sides,rolls)
freq_sum = sides**rolls
for target in range(rolls,rolls*sides+1):
    index = target-rolls
    if 0 <= index < len(freq):
        print("%2d : %2d, %f" % (target, freq[index], freq[index]/freq_sum))
    else:
        print("%2d : %2d, %f" % (target, 0, 0.0))

This code yields

 3 :  1, 0.004630
 4 :  3, 0.013889
 5 :  6, 0.027778
 6 : 10, 0.046296
 7 : 15, 0.069444
 8 : 21, 0.097222
 9 : 25, 0.115741
10 : 27, 0.125000
11 : 27, 0.125000
12 : 25, 0.115741
13 : 21, 0.097222
14 : 15, 0.069444
15 : 10, 0.046296
16 :  6, 0.027778
17 :  3, 0.013889
18 :  1, 0.004630
jinhwanlazy
  • 3,181
  • 1
  • 11
  • 9
2

There's lots of stuff online about dice probability. Here's one link that helped me out with a Project Euler question:

http://gwydir.demon.co.uk/jo/probability/calcdice.htm

Dana
  • 32,083
  • 17
  • 62
  • 73
1

JavaScript implementation using dynamic function creation:

<script>
var f;
function prob(dice, value)
{
    // Build a function with one nested loop per die, then eval it.
    var f_s = 'f = function(dice, value) {var occur = 0; var a = [];';
    for (var x = 0; x < dice; x++)
    {
        f_s += 'for (a[' + x + '] = 1; a[' + x + '] <= 6; a[' + x + ']++) {';
    }
    f_s += 'if (eval(a.join(\'+\')) == value) {occur++;}';
    for (var x = 0; x < dice; x++)
    {
        f_s += '}';
    }
    f_s += 'return occur;}';
    eval(f_s);
    var occ = f(dice, value);
    return [occ, occ + '/' + Math.pow(6, dice), occ / Math.pow(6, dice)];
}

alert(prob(2, 12)); // 2 dice, seeking 12
                    // returns array [1, 1/36, 0.027777777777777776]
</script>

EDIT: Rather disappointed nobody pointed this out; had to replace 6 * dice with Math.pow(6, dice). No more mistakes like that...

Hexagon Theory
  • 43,627
  • 5
  • 26
  • 30
1

Neat factoid...

Did you know that Pascal's triangle is the probability distribution of the sums of N 2-sided dice?

   1 1    - 1 die, 1 chance at 1, 1 chance at 2
  1 2 1   - 2 dice, 1 chance at 2, 2 chances at 3, 1 chance at 4
 1 3 3 1  - 3 dice, 1 chance at 3, 3 chances at 4, 3 chances at 5, 1 chance at 6 
1 4 6 4 1 - etc.
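The same shift-and-add construction that builds each triangle row generalizes from 2-sided to M-sided dice; a small Python sketch (function name is mine):

```python
def sum_counts(n_dice, sides):
    """Row of outcome counts for n_dice dice, built by repeated
    shift-and-add (convolution), like extending Pascal's triangle."""
    row = [1] * sides
    for _ in range(n_dice - 1):
        new = [0] * (len(row) + sides - 1)
        for i, c in enumerate(row):
            for j in range(sides):
                new[i + j] += c
        row = new
    return row

print(sum_counts(3, 2))  # [1, 3, 3, 1] -- a Pascal's triangle row
```

With sides=2 each pass reproduces the next triangle row; with sides=6 it reproduces the dice tables above.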
Instantsoup
  • 14,825
  • 5
  • 34
  • 41
  • This is neat! Now I'm thinking about how to generalize this iterative thought process (how you create one row after the next in Pascal's triangle) to M-sided dice. – mqp Jan 29 '09 at 21:55
0

After a lot of searching on the Internet and Stack Overflow, I found that Dr. Math explains it well in a working formula (a link in another answer has an incorrect formula). I converted Dr. Math's formula to C# and my NUnit tests (which had been failing before with other attempts at the code) all passed.

First I had to write a few helper functions:

  public static int Factorial(this int x)
  {
     if (x < 0)
     {
        throw new ArgumentOutOfRangeException("Factorial is undefined on negative numbers");
     }
     return x <= 1 ? 1 : x * (x-1).Factorial();
  }

Because of the way choose works in mathematics, I realized I could cut down on the calculations if I had an overloaded Factorial function with a lower bound. This function can bail out when the lower bound is reached.

  public static int Factorial(this int x, int lower)
  {
     if (x < 0)
     {
        throw new ArgumentOutOfRangeException("Factorial is undefined on negative numbers");
     }
     if ((x <= 1) || (x <= lower))
     {
        return 1;
     }
     else
     {
        return x * (x - 1).Factorial(lower);
     }
  }

  public static int Choose(this int n, int r)
  {
     return (int)((double)(n.Factorial(Math.Max(n - r, r))) / ((Math.Min(n - r, r)).Factorial()));
  }

When those were in place, I was able to write

  public static int WaysToGivenSum(int total, int numDice, int sides)
  {
     int ways = 0;
     int upper = (total - numDice) / sides; //we stop on the largest integer less than or equal to this
     for (int r = 0; r <= upper; r++)
     {
        int posNeg = Convert.ToInt32(Math.Pow(-1.0, r)); //even times through the loop are added while odd times through are subtracted
        int top = total - (r * sides) - 1;
        ways += (posNeg * numDice.Choose(r) * top.Choose(numDice - 1));
     }
     return ways;
  }
Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Frank Luke
  • 1,342
  • 3
  • 18
  • 33
0

There seems to be some mystery surrounding exactly "why" this is, and although duffymo has explained part of it, I'm looking at another post that says:

There should be no reason why 5, 6 and 7 should be rolled more [than 2] since the first roll of the die is an independent event from the second roll of the die and both of them have an equal probability of 1-6 of being rolled.

There's a certain appeal to this. But it's incorrect...because although the rolls themselves are independent, the first roll changes which totals the second roll can still reach. The reasoning can most easily be seen through an example.

Say I'm trying to figure out if the probability of rolling 2 or 7 is more likely on two dice. If I roll the first die and get a 3, what are my chances now of rolling a total of 7? Obviously, 1 in 6. What are my chances of rolling a total of 2? 0 in 6...because there's nothing I can roll on the second die to have my total be 2.

For this reason, 7 is very (the most) likely to be rolled...because no matter what I roll on the first die, I can still reach the correct total by rolling the right number on the second die. 6 and 8 are equally slightly less likely, 5 and 9 more so, and so on, until we reach 2 and 12, equally unlikely at 1 in 36 chance apiece.

If you plot this (sum vs. likelihood) you'll get a nice bell curve (or, more precisely, a blocky approximation of one because of the discrete nature of your experiment).
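The argument can be sketched directly in a few lines of Python: after each possible first roll, count how many faces of the second die still complete the target total.

```python
# For each first-die result, how many second-die faces complete the target?
for target in (2, 7):
    per_first = [sum(1 for second in range(1, 7) if first + second == target)
                 for first in range(1, 7)]
    print(target, per_first, sum(per_first))
```

Every first roll leaves exactly one way to finish on 7, but only a first roll of 1 leaves any way to finish on 2, giving 6/36 vs. 1/36.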

Beska
  • 12,445
  • 14
  • 77
  • 112