Here is an efficient method of achieving this. It uses binary search (slightly modified to suit the problem). A summary of how it works:
- You represent the probabilities of the elements getting picked in an array. So if you have the probabilities 50% for "A", 20% for "B", 10% for "C", 5% for "D", 5% for "E", 0.1% for "F" and 9.9% for "G", the array would be [.5, .2, .1, .05, .05, .001, .099]. However, this is no good for binary searching because it is not sorted, and if we sort it the probabilities will no longer correspond to our letter array ([A, B, C, D, E, F, G]). So instead we take a running sum of the probabilities until we end up with 1. Now the probability array looks like [.5, .7, .8, .85, .9, .901, 1]: it is sorted, and still corresponds to the array of letters above.
- Now we generate a random fraction between 0 and the highest value in the probability array. Math.random(), which returns a number in [0, 1), is perfect for that.
- Then we look for the value in the probability array closest to this fraction, with one catch: the "closest" value must not be smaller than the fraction.
- Once we have the index of this "closest" value, we can use the same index to pick a value from the letter array. Here is an example in JavaScript:
```javascript
// Upper-bound binary search: returns the index of the first element in
// the sorted array arr that is strictly greater than x.
function find(arr, x, start = 0, end = arr.length) {
  if (end < start) return -1;
  if (end === start) return end;
  const mid = Math.floor((start + end) / 2);
  if (arr[mid] === x) return mid + 1;
  if (arr[mid] < x) return find(arr, x, mid + 1, end);
  return find(arr, x, start, mid);
}

const table_of_corresponding_probabilities = [.5, .7, .8, .85, .9, .901, 1];
const values_to_pick_from = ["A", "B", "C", "D", "E", "F", "G"];

function weighted_random_pick(items, weights) {
  return items[find(weights, Math.random())];
}

console.log(weighted_random_pick(values_to_pick_from, table_of_corresponding_probabilities));
```
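The snippet above assumes the cumulative table is precomputed. If you start from the raw probabilities instead, a small helper (the name `cumulative` is my own, not part of the snippet above) can build it with a running sum:

```javascript
// Build the cumulative probability table from raw probabilities.
// The last entry should be 1 (up to floating-point rounding).
function cumulative(probabilities) {
  let sum = 0;
  return probabilities.map(p => (sum += p));
}

const raw = [.5, .2, .1, .05, .05, .001, .099];
console.log(cumulative(raw)); // ≈ [.5, .7, .8, .85, .9, .901, 1]
```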
So, with these probabilities we should get "A" about 50% of the time, and the other letters in proportion to their weights. Here is a test of the randomness of the algorithm above:
```javascript
// Upper-bound binary search (same as above).
function find(arr, x, start = 0, end = arr.length) {
  if (end < start) return -1;
  if (end === start) return end;
  const mid = Math.floor((start + end) / 2);
  if (arr[mid] === x) return mid + 1;
  if (arr[mid] < x) return find(arr, x, mid + 1, end);
  return find(arr, x, start, mid);
}

const prob = [.5, .7, .8, .85, .9, .901, 1];
const vals = ["A", "B", "C", "D", "E", "F", "G"];
const results = { A: 0, B: 0, C: 0, D: 0, E: 0, F: 0, G: 0 };
const times_it_ran = 160000;

for (let i = 0; i < times_it_ran; i++) {
  results[vals[find(prob, Math.random())]]++;
}

for (const letter in results) {
  console.log(letter + ":", (results[letter] / (times_it_ran / 100)).toFixed(3), "%");
}
```
When you run the snippet above, you should find that the percentage of times each letter was picked is close to that letter's intended probability. Of course it will never be exactly equal, because after all it is random (or at least pseudo-random).
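How close should "close" be? A rough sanity check (my own addition, not part of the original test): assuming the picks are independent, the observed percentage for a letter with probability p over n trials has a binomial standard error of sqrt(p(1 − p)/n).

```javascript
// Expected standard deviation (in percentage points) of the observed
// share of a letter picked with probability p over n independent trials.
const stderrPercent = (p, n) => Math.sqrt(p * (1 - p) / n) * 100;

console.log(stderrPercent(0.5, 160000).toFixed(3));   // "A": about 0.125
console.log(stderrPercent(0.001, 160000).toFixed(3)); // "F": about 0.008
```

So over 160,000 runs, "A" should typically land within a couple of tenths of a percentage point of 50%.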
Ok, what about speed and efficiency? Let's test that too:
```javascript
// Upper-bound binary search (same as above).
function find(arr, x, start = 0, end = arr.length) {
  if (end < start) return -1;
  if (end === start) return end;
  const mid = Math.floor((start + end) / 2);
  if (arr[mid] === x) return mid + 1;
  if (arr[mid] < x) return find(arr, x, mid + 1, end);
  return find(arr, x, start, mid);
}

const array_length = 330000;
// Evenly spaced cumulative probabilities: each value has an equal chance
// of getting picked, but the array is still very long.
const probs = Array.apply(null, { length: array_length }).map((x, i) => i / (array_length - 1));
const vals = Array.apply(null, { length: array_length }).map(Function.call, String);

const time = func => {
  console.time("timer");
  func();
  console.timeEnd("timer");
};

// Now time how long it takes to search within this LONG array:
function button_click() {
  time(() => {
    vals[find(probs, Math.random())];
  });
}
```
```html
<button onclick="button_click();">Run test</button>
```
As you can see, the tests are pretty fast: mine averaged about 2 ms. However, this only searches an array of length 3.3e5. I chose this value because anything much larger gives me a range error (a limitation of the built-in Array.apply). So here is the same test, but using a different method of generating massive arrays: a for loop. I know it's probably the worst method, but it does the job.
```javascript
// Upper-bound binary search (same as above).
function find(arr, x, start = 0, end = arr.length) {
  if (end < start) return -1;
  if (end === start) return end;
  const mid = Math.floor((start + end) / 2);
  if (arr[mid] === x) return mid + 1;
  if (arr[mid] < x) return find(arr, x, mid + 1, end);
  return find(arr, x, start, mid);
}

const len = 75e6; // 75 million elements in this array!
let probs = [];
for (let i = 0; i < 1; i += 1 / len) {
  probs.push(i);
}

const time = func => {
  console.time("timer");
  func();
  console.timeEnd("timer");
};

// Now time how long it takes to search within this LONG array:
function button_click() {
  time(() => {
    find(probs, Math.random());
  });
}
```
```html
<button onclick="button_click();">Run test</button>
```
So after running this test with 75 million elements, what do we find? The first run is marginally slower than the tests we previously ran (with 3.3e5 elements), and the rest average at about 2 ms to 2.25 ms. Taking the midpoint, (2 + 2.25) / 2 = 2.125 ms, and subtracting the average from the last tests gives 2.125 − 2 = 0.125 ms. That is: only 0.125 ms slower than searching an array with roughly 227 times fewer elements (75e6 / 3.3e5 ≈ 227). That is the extent to which binary search is efficient. And I'd suggest part of that 0.125 ms delay could be down to the CPU still being busy after that bad method of building the array; yes, I'm talking about the 75 million iterations we had to complete just to create it!
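The flat timings make sense when you count the probes: binary search does about log2(n) comparisons in the worst case, so growing the array 227-fold only adds a handful of steps. A quick back-of-the-envelope check:

```javascript
// Worst-case number of binary-search probes for an array of n elements.
const probes = n => Math.ceil(Math.log2(n));

console.log(probes(3.3e5)); // 19
console.log(probes(75e6));  // 27
```

So 75 million elements need only about 8 more comparisons than 330 thousand, which is why the timings barely move.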
Hope you found this helpful! If you would like to use this algorithm, just use the first snippet I gave you; everything is much more readable there than in the later test snippets.