63

I'm trying to devise a (good) way to choose a random number from a range of possible numbers where each number in the range is given a weight. To put it simply: given the range of numbers (0,1,2) choose a number where 0 has an 80% probability of being selected, 1 has a 10% chance and 2 has a 10% chance.

It's been about 8 years since my college stats class, so you can imagine the proper formula for this escapes me at the moment.

Here's the 'cheap and dirty' method that I came up with. This solution uses ColdFusion. Yours may use whatever language you'd like. I'm a programmer, I think I can handle porting it. Ultimately my solution needs to be in Groovy - I wrote this one in ColdFusion because it's easy to quickly write/test in CF.

public function weightedRandom( Struct options ) {

    var tempArr = [];

    for( var o in arguments.options )
    {
        var weight = arguments.options[ o ] * 10;
        for ( var i = 1; i<= weight; i++ )
        {
            arrayAppend( tempArr, o );
        }
    }
    return tempArr[ randRange( 1, arrayLen( tempArr ) ) ];
}

// test it
opts = { 0=.8, 1=.1, 2=.1  };

for( x = 1; x<=10; x++ )
{
    writeDump( weightedRandom( opts ) );    
}

I'm looking for better solutions, please suggest improvements or alternatives.

Daniel A. White
Todd Sharp

16 Answers

95

A lookup table (as in your solution) is the first thing that comes to mind: you build a table whose elements are populated according to their weight distribution, then pick a random location in the table and return it. As an implementation choice, I would write a higher-order function that takes a spec and returns a function that returns values according to the distribution in the spec; this way you avoid rebuilding the table on every call. The downsides are that building the table takes time linear in the number of table entries, and memory usage can be large for big specs (or those with members with very small or precise weights, e.g. {0:0.99999, 1:0.00001}). The upside is that picking a value takes constant time, which might be desirable if performance is critical. In JavaScript:

function weightedRand(spec) {
  var i, j, table=[];
  for (i in spec) {
    // The constant 10 below should be computed based on the
    // weights in the spec for a correct and optimal table size.
    // E.g. the spec {0:0.999, 1:0.001} will break this impl.
    for (j=0; j<spec[i]*10; j++) {
      table.push(i);
    }
  }
  return function() {
    return table[Math.floor(Math.random() * table.length)];
  }
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...
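The comment in the code above notes that the constant 10 should really be computed from the weights. A minimal sketch of one way to do that, assuming the weights are short decimal fractions; `findScale`, `buildTable`, and the `maxScale` cap are hypothetical names introduced here, not part of the answer:

```javascript
// Find the smallest power of ten that turns every weight into a whole number
// (within a small tolerance for floating-point error).
function findScale(spec, maxScale = 1e6) {
  for (let scale = 1; scale <= maxScale; scale *= 10) {
    const allWhole = Object.values(spec).every(
      (w) => Math.abs(w * scale - Math.round(w * scale)) < 1e-9
    );
    if (allWhole) return scale;
  }
  throw new Error('weights need more precision than maxScale allows');
}

// Build the lookup table using the computed scale instead of a fixed 10.
function buildTable(spec) {
  const scale = findScale(spec);
  const table = [];
  for (const [key, weight] of Object.entries(spec)) {
    const copies = Math.round(weight * scale);
    for (let j = 0; j < copies; j++) table.push(key);
  }
  return table;
}

const table = buildTable({ 0: 0.8, 1: 0.1, 2: 0.1 });
console.log(table.length); // 10
```

The table still grows with the precision of the weights, so the memory caveat from the answer applies unchanged.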

Another strategy is to pick a random number in [0,1) and iterate over the weight specification, keeping a running sum of the weights; as soon as the random number is no greater than the running sum, return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front costs, but its average running time is linear in the number of entries in the spec. For example, in JavaScript:

function weightedRand2(spec) {
  var i, sum=0, r=Math.random();
  for (i in spec) {
    sum += spec[i];
    if (r <= sum) return i;
  }
}
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...
doelleri
maerics
  • 2
    Note that you can store an array giving the cumulative sums, ie do it once, and then use a `log n` binary search each time you generate a number. But that only makes sense for large n. – dan-man May 07 '16 at 14:03
  • If I run the function with these parameters arr = {0:0.1, 1:0.7, 2:0.9} 10000 times, it gives me this output: 0 : 983, 1 : 7011 and 2 : 2006, which is all wrong because 2 has a higher probability than 1 while the output suggests something different. – Rüzgar May 06 '17 at 10:47
  • @maerics Hey just a quick check with you, does the sum of the weight need to be exactly 1? I tried this weightedRand({0:0.350, 1:0.200, 2:0.010, 3:0.150 , 4:0.010, 5:0.200, 6:0.150 }); but I realized number 4 often comes up with a very large number – QWERTY Dec 02 '17 at 09:26
  • @hyperfkcb yes, the sum of the weights must be one and for those weights you'll need to use the constant value 1000 instead of 10. – maerics Dec 02 '17 at 15:30
  • @maerics Thanks for the clarification! But may I know what you mean by constant value 1000 instead of 10? – QWERTY Dec 06 '17 at 07:21
  • @hyperfkcb read the code for the `weightedRand(...)` function in the answer and try to figure out where it would make sense to use 1000 instead of 10... – maerics Dec 06 '17 at 15:22
  • @maerics Sorry but I am not that good with numbers. Does that mean I should keep all the weights to one decimal place and the sum of all of them should be 1? – QWERTY Dec 07 '17 at 00:11
  • `weightedRand2` needs to accept a variable `total` and then needs `var r = Math.floor(Math.random() * (total + 1));` – Elijah Mock Aug 04 '21 at 22:13
  • As an alternative, instead of building a table to quickly look up the element, you could build an interval tree from the 'weight ranges' :) This would reduce the memory requirement but increase the lookup time to log(N). – Katona Feb 09 '22 at 14:25
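The idea in dan-man's comment (compute the cumulative sums once, then binary-search them on every draw) could look roughly like this. It is only a sketch: `makeWeightedSampler` and the injectable `rng` parameter are names made up for this illustration, and the weights are scaled by their total so they do not have to sum to one.

```javascript
// Precompute cumulative sums once; each draw is then an O(log n) binary search.
function makeWeightedSampler(values, weights, rng = Math.random) {
  const cumulative = [];
  let total = 0;
  for (const w of weights) {
    total += w;
    cumulative.push(total);
  }
  return function sample() {
    const target = rng() * total; // uniform in [0, total)
    let lo = 0;
    let hi = cumulative.length - 1;
    // Find the first index whose cumulative sum is greater than target.
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (target < cumulative[mid]) hi = mid;
      else lo = mid + 1;
    }
    return values[lo];
  };
}

const rand012 = makeWeightedSampler([0, 1, 2], [8, 1, 1]);
console.log(rand012()); // 0 about 80% of the time
```

As the comment says, this only pays off for large n; for three entries a linear scan is just as good.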
19

Generate a random number R between 0 and 1.

If R in [0, 0.1) -> 1

If R in [0.1, 0.2) -> 2

If R in [0.2, 1] -> 3

If you can't directly get a number between 0 and 1, generate a number in a range that will produce as much precision as you want. For example, if you have the weights (1, 83.7%) and (2, 16.3%), roll a number from 1 to 1000. 1-837 is a 1. 838-1000 is 2.
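The integer-roll variant above can be sketched as follows. The helper names `rollInteger` and `pickByRoll` are hypothetical, and the thresholds are hard-coded for the 83.7% / 16.3% example only.

```javascript
// Roll a whole number in [low, high], both ends included.
function rollInteger(low, high, rng = Math.random) {
  return low + Math.floor(rng() * (high - low + 1));
}

// Map a roll in 1..1000 to an outcome: 1-837 is a 1, 838-1000 is a 2.
function pickByRoll(roll) {
  return roll <= 837 ? 1 : 2;
}

console.log(pickByRoll(rollInteger(1, 1000)));
```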

Thomas Eding
  • 1
    A friend of mine came up with this variation on this approach: return Math.random() < 0.8 ? 0 : ( Math.random() < 0.9 ? 1 : 2 ); – Todd Sharp Dec 08 '11 at 19:13
  • I would not recommend that unless you are dealing with conditional probabilities, which that models best. – Thomas Eding Dec 08 '11 at 19:40
  • 6
    @ToddSharp I know it's ancient, but ... you'd actually want to use the same random number, or you'll get a bias: r = Math.random(); return (r < 0.8) ? 0 : (r<.9) ? 1 : 2. In your code, '2' would only be returned if r1>=.8 AND r2>=.9, which is 10% of 20% or 2% of the cases. – jimm101 Jul 06 '16 at 20:37
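To make the bias described in the last comment concrete, here is a small sketch that walks an evenly spaced grid of "random" values instead of calling Math.random(), so the counts are exact. The function names are invented for this illustration.

```javascript
// Two independent draws, as in the one-liner from the first comment.
function twoDraws(r1, r2) {
  return r1 < 0.8 ? 0 : (r2 < 0.9 ? 1 : 2);
}

// One draw reused for both comparisons, as suggested in the last comment.
function oneDraw(r) {
  return r < 0.8 ? 0 : (r < 0.9 ? 1 : 2);
}

// Tally outcomes over a grid of midpoints (i + 0.5) / steps.
function tally(steps) {
  const one = [0, 0, 0];
  const two = [0, 0, 0];
  for (let i = 0; i < steps; i++) {
    const r1 = (i + 0.5) / steps;
    one[oneDraw(r1)]++;
    for (let j = 0; j < steps; j++) {
      two[twoDraws(r1, (j + 0.5) / steps)]++;
    }
  }
  return { one, two };
}

const result = tally(100);
console.log(result.one); // [ 80, 10, 10 ]
console.log(result.two); // [ 8000, 1800, 200 ]
```

With two draws the split is 80% / 18% / 2%, which matches the "10% of 20%" reasoning in the comment; reusing one draw gives the intended 80% / 10% / 10%.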
16

I use the following

function weightedRandom(min, max) {
  return Math.round(max / (Math.random() * max + min));
}

This is my go-to "weighted" random: I use an inverse function of x (where x is a random number between min and max + min) to generate a weighted result in which the minimum is the heaviest element and the maximum the lightest (the least likely result).

So basically, using weightedRandom(1, 5) means the chances of getting a 1 are higher than a 2 which are higher than a 3, which are higher than a 4, which are higher than a 5.

Might not be useful for your use case but probably useful for people googling this same question.

After a trial of 100 iterations, it gave me:

==================
| Result | Times |
==================
|      1 |    55 |
|      2 |    28 |
|      3 |     8 |
|      4 |     7 |
|      5 |     2 |
==================
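The tally above can be compared with the exact probabilities implied by the formula. This is a rough sketch, assuming Math.random() is uniform on [0, 1) and min > 0; the helper name `inverseWeightProbabilities` is made up for this illustration.

```javascript
// Exact probability of each result of Math.round(max / (U * max + min)),
// where U is uniform on [0, 1) and x = U * max + min.
function inverseWeightProbabilities(min, max) {
  const probs = {};
  const lowest = Math.round(max / (max + min)); // result as U approaches 1
  const highest = Math.round(max / min);        // result at U = 0
  for (let k = lowest; k <= highest; k++) {
    // Math.round(v) === k  <=>  k - 0.5 <= v < k + 0.5, with v = max / x
    const xHigh = k > 0.5 ? max / (k - 0.5) : Infinity;
    const xLow = max / (k + 0.5);
    // Convert the x interval back to a U interval and clip it to [0, 1].
    const uHigh = Math.min(1, (xHigh - min) / max);
    const uLow = Math.max(0, (xLow - min) / max);
    probs[k] = Math.max(0, uHigh - uLow);
  }
  return probs;
}

console.log(inverseWeightProbabilities(1, 5));
```

For weightedRandom(1, 5) this works out to roughly 53%, 27%, 11%, 6% and 2% for the results 1 through 5, which is in line with the 100-iteration tally.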
Tom Roggero
  • What are the use cases for this? I tried `weightedRandom(50, 100)` but still received 1s and such, I obviously missed the point. – Solo Apr 15 '19 at 22:41
  • 2
    @Solo a couple things: (1) this approach is very specific, since it gives a huge weight (priority) to the lowest numbers, close to `f(x)=1/x `... (2) given it uses random, there's no guarantee it will use at least once every number... and (3) last but not least, you should use `49 + weightedRandom(1, 51)` if you want to get numbers between 50 and 100 – Tom Roggero Apr 18 '19 at 20:16
  • Duh, `49 + weightedRandom(1, 51)` is so obvious solution. Thank you. – Solo Apr 18 '19 at 20:19
  • this is a top solution! – Emmanuel N K Jan 19 '20 at 07:01
  • The perfect solution to make some test data look a bit more convincing in graphs. Thanks so much for this clever little snippet. – counterbeing Feb 13 '20 at 02:54
  • While it works please note, it only works for `min > 0`. Otherwise you will receive number outside the requested range. – JimmyBlu Jul 18 '22 at 13:40
15

8 years late but here's my solution in 4 lines.

  1. Prepare an array of probability mass function such that

    pmf[array_index] = P(X=array_index):

    var pmf = [0.8, 0.1, 0.1];
    

    (Ensure they add up to 1.)

  2. Prepare an array for the corresponding cumulative distribution function such that

    cdf[array_index] = F(X=array_index):

    var cdf = pmf.map((sum => value => sum += value)(0));
    // [0.8, 0.9, 1]
    
  3. Generate a random number.

    var rand = Math.random();
    
  4. Get the index of the first element which is greater than or equal to the random number.

    cdf.findIndex(el => rand <= el);
    

(Updated thanks to @Joonas's comment.)
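Put together, the four steps could be wrapped in a single function along these lines. This is only a sketch: `sampleIndex` and the injectable `rng` parameter are names invented here, and the fallback to the last index is an added guard for the case where floating-point rounding leaves the final cumulative sum a hair below 1.

```javascript
// Draw an index from a probability mass function given as an array.
function sampleIndex(pmf, rng = Math.random) {
  let sum = 0;
  const cdf = pmf.map((p) => (sum += p)); // running totals
  const rand = rng();
  // First element of the CDF that is greater than or equal to rand.
  const index = cdf.findIndex((el) => rand <= el);
  // Guard: if rounding left every entry below rand, use the last index.
  return index === -1 ? pmf.length - 1 : index;
}

console.log(sampleIndex([0.8, 0.1, 0.1])); // 0 about 80% of the time
```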

remykarem
  • the "index of the element which is more than or is equal to the random number" would be `rand <= el`, not `rand >= el`. But otherwise, this solution is very clever. – Braden.Biz Jun 19 '23 at 17:56
13

Here are 3 solutions in JavaScript since I'm not sure which language you want it in. Depending on your needs one of the first two might work, but the third one is probably the easiest to implement with large sets of numbers.

function randomSimple(){
  return [0,0,0,0,0,0,0,0,1,2][Math.floor(Math.random()*10)];
}

function randomCase(){
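  var n=Math.floor(Math.random()*100);
  // switch(true) so that each case is evaluated as a boolean condition;
  // switch(n) would compare n to true/false and never match
  switch(true){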
    case n<80:
      return 0;
    case n<90:
      return 1;
    case n<100:
      return 2;
  }
}

function randomLoop(weight,num){
  var n=Math.floor(Math.random()*100),amt=0;
  for(var i=0;i<weight.length;i++){
    //amt+=weight[i]; *alternative method
    //if(n<amt){
    if(n<weight[i]){
      return num[i];
    }
  }
}

weight=[80,90,100];
//weight=[80,10,10]; *alternative method
num=[0,1,2]
qw3n
10

This is more or less a generic-ized version of what @trinithis wrote, in Java: I did it with ints rather than floats to avoid messy rounding errors.

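// Requires: import java.util.ArrayList; import java.util.List; import java.util.Random;
// (the members below are assumed to live inside an enclosing class)
static class Weighting {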

    int value;
    int weighting;

    public Weighting(int v, int w) {
        this.value = v;
        this.weighting = w;
    }

}

public static int weightedRandom(List<Weighting> weightingOptions) {

    //determine sum of all weightings
    int total = 0;
    for (Weighting w : weightingOptions) {
        total += w.weighting;
    }

    //select a random value between 0 and our total
    int random = new Random().nextInt(total);

    //loop thru our weightings until we arrive at the correct one
    int current = 0;
    for (Weighting w : weightingOptions) {
        current += w.weighting;
        if (random < current)
            return w.value;
    }

    //shouldn't happen.
    return -1;
}

public static void main(String[] args) {

    List<Weighting> weightings = new ArrayList<Weighting>();
    weightings.add(new Weighting(0, 8));
    weightings.add(new Weighting(1, 1));
    weightings.add(new Weighting(2, 1));

    for (int i = 0; i < 100; i++) {
        System.out.println(weightedRandom(weightings));
    }
}
Greg Case
4

How about

int [ ] numbers = { 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 1 , 2 } ;

then you can randomly select from numbers and 0 will have an 80% chance, 1 10%, and 2 10%

emory
  • 3
    This works, but there's no need to allocate an array. What if you have to deal with very precise weights, such as 4.68342%? You need to allocate an array of size at least 10000000. – Thomas Eding Dec 08 '11 at 18:15
1

This one is in Mathematica, but it's easy to copy to another language, I use it in my games and it can handle decimal weights:

weights = {0.5, 1, 2}; (* The weights *)
weights = N@weights/Total@weights; (* Normalize weights so that the list's sum is always 1. *)
min = 0; (* First min value should be 0 *)
max = weights[[1]]; (* First max value should be the first element of the newly created weights list. Note that in Mathematica the first element has index 1, not 0. *)
random = RandomReal[]; (* Generate a random float from 0 to 1 *)
For[i = 1, i <= Length@weights, i++,
    If[random >= min && random < max,
        Print["Chosen index number: " <> ToString@i]
    ];
    min += weights[[i]];
    If[i == Length@weights,
        max = 1,
        max += weights[[i + 1]]
    ]
]

(From here on I assume the list's first element has index 0.) The idea is that, with a normalized list weights, index n should be returned with probability weights[n], so the distance between min and max at step n should be weights[n]. The total distance from the smallest min (which we set to 0) to the largest max is the sum of the list weights.

The good thing about this is that you don't append to any array or nest for loops, which greatly reduces the execution time.

Here is the code in C#. It does not need the weights list to be normalized and drops some of the code above (Random.Range here appears to be Unity's UnityEngine.Random.Range):

int WeightedRandom(List<float> weights) {
    float total = 0f;
    foreach (float weight in weights) {
        total += weight;
    }

    float max = weights [0],
    random = Random.Range(0f, total);

    for (int index = 0; index < weights.Count; index++) {
        if (random < max) {
            return index;
        } else if (index == weights.Count - 1) {
            return weights.Count-1;
        }
        max += weights[index+1];
    }
    return -1;
}
Garmekain
1

Here are the inputs and ratios: 0 (80%), 1 (10%), 2 (10%).

Let's draw them out so it's easy to visualize.

                0                       1        2
-------------------------------------________+++++++++

Let's add up the total weight and call it TR, for total ratio; in this case it is 100. Then let's pick a random number between 0 and TR (0 to 100 in this case) and call it RN, for random number.

So now we have TR as the total weight and RN as the random number between 0 and TR.

Let's imagine we picked a random number from 0 to 100, say 21. That is effectively 21%.

WE MUST CONVERT/MATCH THIS TO OUR INPUT NUMBERS, BUT HOW?

Let's loop over each weight (80, 10, 10) and keep the sum of the weights we have already visited. The moment that running sum is greater than the random number RN (21 in this case), we stop the loop and return that element's position.

double sum = 0;
int position = -1;
for (double weight : weights) { // weights = {80, 10, 10}
    position++;
    sum = sum + weight;
    if (sum > 21) // (80 > 21) so break on first pass
        break;
}
// position will be 0 so we return array[0] --> 0

lets say the random number (between 0 and 100) is 83. Lets do it again:

double sum = 0;
int position = -1;
for (double weight : weights) { // weights = {80, 10, 10}
    position++;
    sum = sum + weight;
    if (sum > 83) // (90 > 83) so break
        break;
}

// we did two passes in the loop, so position is 1 and we return array[1] --> 1
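The same walk-through as a small reusable function, written here in JavaScript as a sketch. The names `pickPosition` and `weightedPick` are hypothetical, and RN is passed in explicitly so the two worked examples (21 and 83) can be checked directly.

```javascript
// Return the position of the first weight at which the running sum exceeds rn.
function pickPosition(weights, rn) {
  let sum = 0;
  for (let position = 0; position < weights.length; position++) {
    sum += weights[position];
    if (sum > rn) return position;
  }
  return weights.length - 1; // only reached if rn is not below the total
}

// Draw RN in [0, TR) and map it back to one of the input values.
function weightedPick(values, weights, rng = Math.random) {
  const tr = weights.reduce((a, b) => a + b, 0); // TR, the total ratio
  return values[pickPosition(weights, rng() * tr)];
}

console.log(pickPosition([80, 10, 10], 21)); // 0
console.log(pickPosition([80, 10, 10], 83)); // 1
```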
Community
j2emanue
  • Great algorithm for array with values as weight. I did it in javascript. I needed to randomly select an id in collection (coming from database), but I need to give more weight for recent records (with biggest id). – KeitelDOG Feb 25 '23 at 21:56
1

I suggest to use a continuous check of the probability and the rest of the random number.

This function sets first the return value to the last possible index and iterates until the rest of the random value is smaller than the actual probability.

The probabilities have to sum to one.

function getRandomIndexByProbability(probabilities) {
    var r = Math.random(),
        index = probabilities.length - 1;

    probabilities.some(function (probability, i) {
        if (r < probability) {
            index = i;
            return true;
        }
        r -= probability;
    });
    return index;
}

var i,
    probabilities = [0.8, 0.1, 0.1],
    count = probabilities.map(function () { return 0; });

for (i = 0; i < 1e6; i++) {
    count[getRandomIndexByProbability(probabilities)]++;
}

console.log(count);
Nina Scholz
1

Thanks all, this was a helpful thread. I encapsulated it into a convenience function (Typescript). Tests below (sinon, jest). Could definitely be a bit tighter, but hopefully it's readable.

export type WeightedOptions = {
    [option: string]: number;
};

// Pass in an object like { a: 10, b: 4, c: 400 } and it'll return either "a", "b", or "c", factoring in their respective
// weight. So in this example, "c" is likely to be returned 400 times out of 414
export const getRandomWeightedValue = (options: WeightedOptions) => {
    const keys = Object.keys(options);
    const totalSum = keys.reduce((acc, item) => acc + options[item], 0);

    let runningTotal = 0;
    const cumulativeValues = keys.map((key) => {
        const relativeValue = options[key]/totalSum;
        const cv = {
            key,
            value: relativeValue + runningTotal
        };
        runningTotal += relativeValue;
        return cv;
    });

    const r = Math.random();
    return cumulativeValues.find(({ key, value }) => r <= value)!.key;
};

Tests:

describe('getRandomWeightedValue', () => {
    // Out of 1, the relative and cumulative values for these are:
    //      a: 0.1666   -> 0.16666
    //      b: 0.3333   -> 0.5
    //      c: 0.5      -> 1
    const values = { a: 10, b: 20, c: 30 };

    it('returns appropriate values for particular random value', () => {
        // any random number under 0.166666 should return "a"
        const stub1 = sinon.stub(Math, 'random').returns(0);
        const result1 = randomUtils.getRandomWeightedValue(values);
        expect(result1).toEqual('a');
        stub1.restore();

        const stub2 = sinon.stub(Math, 'random').returns(0.1666);
        const result2 = randomUtils.getRandomWeightedValue(values);
        expect(result2).toEqual('a');
        stub2.restore();

        // any random number between 0.166666 and 0.5 should return "b"
        const stub3 = sinon.stub(Math, 'random').returns(0.17);
        const result3 = randomUtils.getRandomWeightedValue(values);
        expect(result3).toEqual('b');
        stub3.restore();

        const stub4 = sinon.stub(Math, 'random').returns(0.3333);
        const result4 = randomUtils.getRandomWeightedValue(values);
        expect(result4).toEqual('b');
        stub4.restore();

        const stub5 = sinon.stub(Math, 'random').returns(0.5);
        const result5 = randomUtils.getRandomWeightedValue(values);
        expect(result5).toEqual('b');
        stub5.restore();

        // any random number above 0.5 should return "c"
        const stub6 = sinon.stub(Math, 'random').returns(0.500001);
        const result6 = randomUtils.getRandomWeightedValue(values);
        expect(result6).toEqual('c');
        stub6.restore();

        const stub7 = sinon.stub(Math, 'random').returns(1);
        const result7 = randomUtils.getRandomWeightedValue(values);
        expect(result7).toEqual('c');
        stub7.restore();
    });
});
benjamin.keen
0

I have a slotmachine and I used the code below to generate random numbers. In probabilitiesSlotMachine the keys are the output in the slotmachine, and the values represent the weight.

const probabilitiesSlotMachine         = [{0 : 1000}, {1 : 100}, {2 : 50}, {3 : 30}, {4 : 20}, {5 : 10}, {6 : 5}, {7 : 4}, {8 : 2}, {9 : 1}]
var allSlotMachineResults              = []

probabilitiesSlotMachine.forEach(function(obj, index){
    for (var key in obj){
        for (var loop = 0; loop < obj[key]; loop ++){
            allSlotMachineResults.push(key)
        }
    }
});

Now to generate a random output, I use this code:

const random = allSlotMachineResults[Math.floor(Math.random() * allSlotMachineResults.length)]
J. Doe
0

Shortest solution in modern JavaScript

Note: all weights need to be integers

function weightedRandom(items){

  let table = Object.entries(items)
    .flatMap(([item, weight]) => Array(weight).fill(item)) // repeat each key `weight` times

  return table[Math.floor(Math.random() * table.length)]
}

const key = weightedRandom({
  "key1": 1,
  "key2": 4,
  "key3": 8
}) // returns e.g. "key1"
0

Enjoy the O(1) (constant time) solution for your problem.

If the input array is small, it can be easily implemented.

    const number = Math.floor(Math.random() * 100); // Generate a random integer from 0 to 99
    let element;
    
    if (number >= 0 && number <= 79) {
    /*
        In the range of 0 to 99, every number has equal probability 
        of occurring. Therefore, if you gather 80 numbers (0 to 79) and
        make a "sub-group" of them, then their probabilities will get added.
    
        Hence, what you get is an 80% chance that the number will fall in this 
        range.
    
        So, quite naturally, there is 80% probability that this code will run.

        Now, manually choose / assign element of your array to this variable.

    */
        element = 0;
    }
    
    else if (number >= 80 && number <= 89) {
        // 10% chance that this code runs.
        element = 1;
    }
    
    else if (number >= 90 && number <= 99) {
        // 10% chance that this code runs.
        element = 2;
    }
John Doe
  • 1
    You might profit from [requesting a code review](https://codereview.stackexchange.com/) – Moritz Ringler May 19 '23 at 09:52
  • Feel free to review it and share with me what I might have gotten wrong there, @MoritzRingler – John Doe May 30 '23 at 00:54
  • In a nutshell: The range is off (it's only numbers from 0 to 98). You turn probabilities into what looks like percentages, even though they aren't, which is confusing - why not work with probabilities (i.e. `number < 0.8`)? "number" is a bad name for a variable, "element" should be "index". As a tip, if you need a lot of text to explain what you are doing, you have not taken the intuitive route. Due to the hard-coded values, this code is hard to maintain - if probabilities change you have to adjust several positions. You say the code runs in O(1), but it is O(n) (one `if` for every range). – Moritz Ringler May 30 '23 at 08:22
0

I needed something like @tom roggero's solution. So I necro'd this post with some stuff I did which might be useful to someone somewhere.

I modified it to work with any positive, whole min and max, and wrapped it in a tester function, so you can call it to repeat as many times as you like to test the function, or call it once (repetitions==1) to get a single weighted random number within the min-max range.

This will accept any min, max and generate a roughly logarithmic response from min to max. I added a 'minHeavy' boolean, so one can 'flip' the response to be weighted at the max end (by setting this as false) with a small chance of getting the minimum number.

function weightedRandom(min, max, minHeavy) {
    return minHeavy ? Math.round(max / (Math.random() * max + min))
                : (max+1) - Math.round(max / (Math.random() * max + min))
}

function countRandoms(min, max, repetitions, minHeavy) {
    // one counter per whole number in [min, max]
    let arr = [];
    for (let i = 0; i < max - min + 1; i++) { arr.push(0); }
    for (let i = 0; i < repetitions; i++) {
        let x = (min - 1) + weightedRandom(1, max - min + 1, minHeavy);
        if (repetitions == 1) return x;
        for (let j = 0, k = min; j < (max + 1 - min); j++, k++) {
            if (x >= k && x < (k + 1)) arr[j]++;
        }
    }
    return arr;
}

Using this:

console.log(countRandoms(47,69,10000,true));

I got a response:

[3738, 2698, 1152, 614, 411, 266, 180, 164, 140, 97, 88, 78, 59, 58, 52, 41, 26, 32, 26, 28, 18, 29, 5]
GeeDubz
0
const array = ['react', 'svelte', 'solid', 'qwik']
const weights = [10, 60, 10, 10] // relative weights; they need not total 100, since the code scales by their sum
const weightedRandom = (array, weights) => {
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  let random = Math.random() * totalWeight;
  return array.find((_, i) => (random -= weights[i]) <= 0);
};
weightedRandom(array, weights)

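Here are the results after 1000 iterations:

=========================
| Array element | Times |
=========================
| svelte        |   665 |
| solid         |   116 |
| react         |   112 |
| qwik          |   107 |
=========================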
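As a sanity check on those counts: the weights in the snippet add up to 90, not 100, so the expected share of each element is its weight divided by 90. A minimal sketch of that arithmetic, with `expectedCounts` being a name invented here:

```javascript
// Expected number of times each element is picked in `iterations` draws.
function expectedCounts(weights, iterations) {
  const total = weights.reduce((a, b) => a + b, 0);
  return weights.map((w) => (iterations * w) / total);
}

const expected = expectedCounts([10, 60, 10, 10], 1000).map(Math.round);
console.log(expected); // [ 111, 667, 111, 111 ]
```

That is close to the observed 665 for svelte and roughly 110 for each of the others.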
Ankit