Get random number with larger numbers increasingly unlikely

Question

How can I get a random number in the range k to h such that the closer a number is to h the more unlikely it will come up?

I'm going to need a number between 20 and 1980.

Use magic called math. Figure out the equation for your desired distribution. — Hovercraft Full Of Eels, Jul 04 '14 at 20:02
posible duplicate http://stackoverflow.com/questions/5271598/java-generate-random-number-between-two-given-values — Federico Piazza, Jul 04 '14 at 20:03
Know how in Settlers of Catan you rarely get the low and high numbers? It's because you have two dice, and add their values. Go figure. — MightyPork, Jul 04 '14 at 20:04
@SteveP. I was thinking something among the lines of MAX*abs([RAND+RAND] - 1) — MightyPork, Jul 04 '14 at 20:57
@SteveP.: the most general solution is with a distribution function of some sort, that takes input from 0 to 1 and outputs a desired result. Your current distribution is just one specific linear distribution, but there are many many more that can be used. — Hovercraft Full Of Eels, Jul 04 '14 at 20:57
Btw, considering it's java, just some trick with Random.nextGaussian() should get the desired results. Clamp it, abs it or something to get the right range and values. — MightyPork, Jul 04 '14 at 20:58
Check out my answer, perhaps you'll like some of the distributions I found. — MightyPork, Jul 04 '14 at 21:24
There are an infinite number of possible distributions that meet your request. You'll have to narrow it down a bit. — pjs, Jul 04 '14 at 21:28

score 6 · Accepted Answer · answered Jul 04 '14 at 21:21

I've tried some stuff in Eclipse, here are results.

interface Generator {       
    double generate(double low, double high);
}


abstract class AbstractGenerator implements Generator {         
    protected final Random rand;

    public AbstractGenerator()
    {
        rand = new Random();
    }

    public AbstractGenerator(long seed)
    {
        rand = new Random(seed);
    }
}

Now results for various generator implementations:

I've tried to generate 100k numbers on scale 0 to 9, and here they are shown as bars.

Catan 2 (add two dice)

class Catan2 extends AbstractGenerator {

    @Override
    public double generate(double low, double high)
    {
        return low + (high - low) * Math.abs(-1 + (rand.nextDouble() + rand.nextDouble()));
    }
}

Reusults:

0 : *******************
1 : ******************
2 : ****************
3 : **************
4 : ************
5 : *********
6 : *******
7 : *****
8 : ***
9 : *

Catan 3 (add three dice)

class Catan3 extends AbstractGenerator {

    @Override
    public double generate(double low, double high)
    {
        return low + (high - low) * Math.abs(-1.5 + (rand.nextDouble() + rand.nextDouble() + rand.nextDouble())) / 1.5;
    }
}

Reusults:

0 : ***********************
1 : *********************
2 : *******************
3 : ***************
4 : ***********
5 : *******
6 : *****
7 : ***
8 : *
9 : *

Catan 4 (add four dice)

class Catan4 extends AbstractGenerator {

    @Override
    public double generate(double low, double high)
    {
        return low + (high - low) * Math.abs(-2 + (rand.nextDouble() + rand.nextDouble() + rand.nextDouble() + rand.nextDouble())) / 2D;
    }
}

Results:

0 : ***************************
1 : ************************
2 : ********************
3 : **************
4 : *********
5 : *****
6 : ***
7 : *
8 : *
9 : *

I think "Catan 3" is the best of those.

Formula being: low+(high-low)*abs(-1.5+(RAND+RAND+RAND))/1.5

Basically, I get a "hill" distribution, then I center it and take it's abs value. Then I norm it to the desired values.

rossum · Answer 2 · 2014-07-05T19:16:51.157

3

And yet another option. There are standard methods to produce random numbers on a Gaussian distribution. Set up a Gaussian RNG with an average of k and a standard deviation of h/5. Reject any number below k (about half the numbers generated) and reject all numbers greater than h (5% or less).

You can tweak the standard deviation if you want to optimise the results. Effectively this is a half-Gaussian RNG with a truncated tail, so the numbers are not linear; you will get more closer to k than to h.

ETA: Thanks to @MightyPork's comment, which got me thinking. A Gaussian distribution is symmetric, so there is no need to throw away any raw values less than k. Just shift them from below k to the same distance above k:

if (raw < k)
  raw <- k + (k - raw)
end if

Values above h will still need to be rejected.

edited Jul 05 '14 at 19:16

answered Jul 04 '14 at 22:26

rossum

15,344
1
24
38

You mean Random.nextGaussian()? How'd you do the truncating - a loop that generates numbers until some falls within range? – MightyPork Jul 05 '14 at 18:55
@MightyPork: Yes. Generate a number, test it, and only pass it on if it is within range. Just over half the numbers generated will fail. – rossum Jul 05 '14 at 19:04

Steve P. · Answer 3 · 2014-07-04T20:32:53.207

2

Say our range is [0,4], create an array like this:

[000001111222334]

Now use a standard Random object to draw from the array. By doing this, we have gone from drawing from a uniform distribution to a distribution of our own design. In reality, we're not going to want to use an auxiliary array. You can do the following in lieu of an auxiliary array:

Draw from [0,14]; map [0,4] to 0, [5,8] to 1, [9,11] to 2, [12,13] to 3 and [14] to 4.

It really depends on what your distribution looks like. You can approximate drawing from a non-uniform distribution via drawing multiple times from uniform distributions over varying ranges. Of course, if you know the probability mass function or probability density function of your distribution, then you're golden.

edited Jul 04 '14 at 20:32

answered Jul 04 '14 at 20:08

Steve P.

14,489
8
42
72

I'm going to need a number between 20 and 1980, using your method would create a very large array. – Ava Jul 04 '14 at 20:15
@RedHatter, added that to your question. – Michael Petrotta Jul 04 '14 at 20:19
@RedHatter My second edit will help you the most, I assume. – Steve P. Jul 04 '14 at 20:31

Gene · Answer 4 · 2014-07-04T22:19:52.003

If you need good control over the distribution of numbers, then a good way to go is the method of inverses. Create a sorted table of (x,y) pairs where x and y both increase monotonically: x from 0 to 1 and y from the low to high value of pseudo-random numbers you need. The algorithm is:

x = uniform random float in [0..1)
Search the table to find (x[i],y[i]) such that x[i] <= x < x[i+1]
// Return linearly interpolated y value
return y[i] + (x - x[i]) / (x[i+1] - x[i]) * (y[i+1] - y[i])

You control the distribution of return values with the table entries.

If the table contains only (0,0) and (1,1), then obviously the return value is equal to x, and the distribution is uniform. To get more high numbers, describe a curve that increases more rapidly at the start and is flatter at the higher x values, say:

(0,0) (0.25,0.5) (1,1)

You should be able to see why this works. In the uniform distribution, half the numbers are between 0 and .5. With this table, only a quarter of the numbers are in that range, so the other three-quarters are in 0.5 to 1. The high numbers are more frequent as you require.

You can create as smooth a curve as you like and of any shape as long as it's monotonically increasing. If the table has more than a few pairs, consider binary search for speed.

For a range of 20 to 1980, the corresponding table would be something like:

(0, 20) (0.25, 1000) (1, 1980)

If you need integers, you'd want to use

(0, 20) (0.25, 1000) (1, 1981)

and then truncate the fraction from the result.

Again, you'd probably want more points in the table to make the ICDF smoother. This is for illustration.

The Math

The curve stored in the table is called the inverse cumulative density function (ICDF) for returned pseudo-random numbers. A probability distribution function (PDF) is a non-negative function with area under the curve of 1. Commonly used PFDs are uniform, exponential, and normal. The corresponding CDF is the running integral of the PDF. The ICDF is the inverse of the CDF. It's well known that to generate random numbers with any given PDF, you can find the ICDF and apply the algorithm above.

Get random number with larger numbers increasingly unlikely

4 Answers4

Catan 2 (add two dice)

Catan 3 (add three dice)

Catan 4 (add four dice)