Random selection, weighted by rank

Question

Let's say I have a collection of n objects, and that each object has a ranking associated with it, and these rankings correspond to the integer values 1 through n.

Now suppose that I want to select an object at random from the collection. But I don't want to just pick a number from 1 to n at random; instead I want to make it so that I am more likely to pick a number higher up the list (with a ranking closer to 1).

Proposed solution: Instead of picking from 1 to n, pick from 1 to m, where m is some number significantly larger than n; then use some mapping function f : [1,m] → [1,n] that maps more numbers to the higher rankings than to the lower rankings. For instance, f(1), f(2), f(3) might all return 1, while f(m) is the only one which maps to n, so it is three times more likely to get 1 than n. Hopefully that makes sense.

So my question is: If this seems like a reasonable algorithm, what's a reasonable function f that accomplishes this, and what ratio m/n would be large enough that integer rounding doesn't prevent numbers from never getting picked?

[In my particular scenario, n can be pretty large (in the thousands), so solutions like the one presented here aren't very practical for this situation. Also, the selection is "with replacement"; i.e. I pick an object once and then return; I don't care if I pick it again immediately the next time.]

See this answer: http://stackoverflow.com/questions/1761626/weighted-random-numbers/1761646#1761646 — Gabe Moothart, May 03 '11 at 16:00

score 2 · Accepted Answer · answered May 03 '11 at 16:10

2

You can do something like the following:

double bias = 1.5; 
int index = (int)(n * (bias - sqrt(bias*bias -4.0*(bias-1.0)* random()))
                  / 2.0 / (bias-1));

Changing the bias parameter lets you control how strongly you favor the higher ranked individuals.

Edit: Here's some python code for it.

def pick(n, bias):
    return int(n * (bias - sqrt(bias*bias -4.0*(bias-1.0)*random())) / 2.0 / (bias-1))

vals = [0]*10
for i in xrange(1000):
    vals[pick(10,1.5)] += 1
print vals
[153, 151, 115, 116, 97, 96, 87, 69, 66, 50]

answered May 03 '11 at 16:10

deong

3,820
21
18

1

Out of curiosity, how did you come up with this formula? I don't doubt it works, but it's not immediately obvious to me as to *why* it works. – Andrew May 03 '11 at 16:19
1

I didn't. My advisor in grad school had developed an algorithm that needed to do this years ago (1989 to be exact, and the algorithm is a GA called Genitor). I recognized the same requirements and pulled up the code to that algorithm. :) – deong May 03 '11 at 16:25

score 1 · Answer 2 · answered May 03 '11 at 16:00

1

I'd try using the normal random approach (random.uniform(0, 1)), but squaring the probability.

Since P{x} ranges from 0 -> 1, P{x^2}also ranges from0 -> 1`.

But the weight is uneven, as a small number squared is still small, and a larger number squared becomes small.

Just a thought.

answered May 03 '11 at 16:00

Blender

289,723
53
439
496

This is a great idea... in fact, any function x^y would work on the range from 0 to 1 since 0^y=0 and 1^y=1 for all y. – Andrew May 03 '11 at 16:06
You have quite a lot of choices ;) – Blender May 03 '11 at 16:07

score 1 · Answer 3 · answered May 03 '11 at 16:10

I think what you really want is function f : [1,n] → N (the natural numbers 0, 1, 2, ...). This would assign a weighting to each rank. Then you want to pick rank k with probability f(*k*) / (Σ f(*i*)), in other words, that rank's weight over the sum of the weights of all ranks. To do that, you can simply pick an integer uniformly at random over the interval [1, Σ f(*i*)] and figure out what rank you are in based on your position; if you are in 1 ... f(1), pick 1, if you are in f(1)+1 ... f(1)+f(2), pick 2, and so forth.

One possible choice for f which weights small ranks higher than big ranks is f(*i*) = n - i + 1. There are many other possible choices.

Antti Huima · Answer 4 · 2011-05-03T17:34:21.210

1

Generate K random numbers from the interval 1..n (K > 1) and choose the minimum!

This has the properties you want, check out a demonstration of the distributions at http://www.sjsu.edu/faculty/watkins/samplemin.htm

To make this work with fractional values of K (1 < K < 2) you can do it like this:

int m = random_int(1..n)
if (random_double(0..1) <= K - 1):
     m = min(m, random_int(1..n))

So when K approaches 1 from above the distribution becomes more and more flat.

edited May 03 '11 at 17:34

answered May 03 '11 at 17:20

Antti Huima

25,136
3
52
71

Not a bad suggestion, although I think the curve for K = 2 might even be too steep for my needs. – Andrew May 03 '11 at 17:26
Ok, added an explanation how to do it for fractional values 1 < K < 2 :) – Antti Huima May 03 '11 at 17:34

score 0 · Answer 5 · answered May 03 '11 at 15:52

0

Normalize by rank, then build a binary tree. Select a random double and find the corresponding value.

answered May 03 '11 at 15:52

dave

12,406
10
42
59

Random selection, weighted by rank

5 Answers5