Random numbers based on a probability

Question

I'm having trouble generating random numbers that do not follow a discrete uniform distribution.

So for example, say I have 5 numbers (to keep it simple), a probability of number k being generated would be k/15. (k = 1 to 5)

My idea is to generate a random number j using rand(), and if this number j is:

1 => then number 1 is generated

2 or 3 => num 2

4 or 5 or 6 => num 3

7 or 8 or 9 or 10 => num 4

11 or 12 or 13 or 14 or 15 => num 5

Now scale this to generating 1-10, 1-100, 1-1000. Does this work the way I intend it to? I've constructed a loop that pretty much does this every time a number needs to be generated, I think it's probably inefficient since it goes up until it finds the j number generated in one of the ranges... What could be a better way to do this?

EDIT: or maybe create an array once with the proper numbers and then pull out with rand() better solution?

http://www.cplusplus.com/reference/random/discrete_distribution/discrete_distribution/ — Mooing Duck, Jun 11 '13 at 22:41
related http://stackoverflow.com/questions/9432226/how-do-i-select-a-range-of-values-in-a-switch-statement — Ciro Santilli OurBigBook.com, Feb 19 '17 at 18:15

score 14 · Answer 1 · answered Jun 08 '12 at 02:32

14

You seem to be on the right track, but C++ already has a specialized random number distribution for that, std::discrete_distribution

#include <iostream>
#include <vector>
#include <map>
#include <random>

int main()
{
    std::random_device rd;
    std::mt19937 gen(rd());

    // list of probabilities    
    std::vector<double> p = {0, 1.0/15, 2.0/15, 3.0/15, 4.0/15, 5.0/15};
    // could also be min, max, and a generating function (n/15 in your case?)
    std::discrete_distribution<> d(p.begin(), p.end());

    // some statistics
    std::map<int, int> m;
    for(int n=0; n<10000; ++n) {
        ++m[d(gen)];
    }
    for(auto p : m) {
        std::cout << p.first << " generated " << p.second << " times\n";
    }
}

online demo: http://ideone.com/jU1ll

answered Jun 08 '12 at 02:32

Cubbi

46,567
13
103
169

The other answers assume the intended distribution follows along with the sequence of triangular numbers but the question also refers to ranges from 1-100 and 1-1000, neither of which are triangular. So the general answer does appear to be more appropriate. – shawnt00 Jul 22 '12 at 22:24
@shawnt00 The other answers make no such assumption about the range of outputs; to output a number in the range $1$ to $100,$ they would use the $100$th triangular number. However, for a fixed distribution that would be used many times, a method like this answer might still be a good choice. – David K Nov 01 '18 at 13:40

rob mayoff · Accepted Answer · 2012-06-08T02:56:29.207

Consider that the sum s of integers from 1 to n is s = n * (n + 1) / 2. Solve for n to get n = (± sqrt(8*s + 1) - 1) / 2. We can ignore the negative square root, as we know n is positive. Thus n = (sqrt(8*s + 1) - 1) / 2.

So, plugging in integers for s between 1 and 15:

s  n
01 1.000000
02 1.561553
03 2.000000
04 2.372281
05 2.701562
06 3.000000
07 3.274917
08 3.531129
09 3.772002
10 4.000000
11 4.216991
12 4.424429
13 4.623475
14 4.815073
15 5.000000

If we take the ceiling of each computed n (the smallest integer not less than n), we get this:

Thus you can go from the uniform distribution to your distribution in constant space and constant time (no iteration and no precomputed tables):

double my_distribution_from_uniform_distribution(double s) {
    return ceil((sqrt(8*s + 1) - 1) / 2);
}

N.B. This relies on sqrt giving an exact result for a perfect square (e.g. returning exactly 7 given exactly 49). This is normally a safe assumption, because IEEE 754 requires exact rounding of square roots.

IEEE 754 doubles can represent all integers from 1 through 2^53 (and many larger integers, but not contiguously after 2^53). So this function should work correctly for all s from 1 to floor((2^53 - 1) / 8) = 1125899906842623.

score 0 · Answer 3 · edited Jun 11 '13 at 22:36

0

You can take advantage of the curious arithmetical fact that:

S(n) = 1 + 2 + 3 + ... + (n - 1) + n

or simplified:

S(n) = n * (n + 1) / 2

This lets you avoid storing the array.

edited Jun 11 '13 at 22:36

Regexident

29,441
10
93
100

answered Jun 08 '12 at 02:32

thb

13,796
3
40
68

Random numbers based on a probability

3 Answers3

Linked

Related