9

I have a list of approximately 10,000 items. Every item currently has an associated weight (priority or importance). The smallest weight is -100 (negative and zero values can be removed) and the highest is 1500. The weights are assigned by people's intuition (how important somebody thinks the item is to the community). Because it's not easy to determine the most important item, I'd like to add a random factor, so that items with lower weight have less chance of being chosen, and their weights can be adjusted in the future (a mix of common sense and randomness).

Do you know how to code a function getItem?

def getItem(dict):
  # This function should return a random item from
  # the dictionary of item-weight pairs (or list of tuples).
  # Normally I would just return a random item from the dictionary,
  # but now I'd like this: the item with weight 1500 should
  # have a much higher chance of being returned than the item with weight 10.
  # My idea is to sum the weights of all items and then compute
  # some ratios. But maybe you have a better idea.
  return randomItem

Thank you

xralf

5 Answers

14

Have a look at this; I think it's what you need, and it includes a nice comparison between different methods: Weighted random generation in Python

The simplest approach suggested is:

import random

def weighted_choice(weights):
    totals = []
    running_total = 0

    for w in weights:
        running_total += w
        totals.append(running_total)

    rnd = random.random() * running_total
    for i, total in enumerate(totals):
        if rnd < total:
            return i

You can find more details and possible improvements as well as some different approaches in the link above.
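As a rough sketch of how the function above might be applied to the question's item-weight dictionary: the helper name `get_item` and the sample data are assumptions made for illustration, and non-positive weights are dropped first, as the question allows. The order of the weights does not matter for this approach; only the running totals do.

```python
import random

def weighted_choice(weights):
    """Return an index, chosen with probability proportional to its weight."""
    totals = []
    running_total = 0
    for w in weights:
        running_total += w
        totals.append(running_total)

    rnd = random.random() * running_total
    for i, total in enumerate(totals):
        if rnd < total:
            return i

def get_item(item_weights):
    """Hypothetical wrapper: pick a key from a dict of item -> weight."""
    # Keep only strictly positive weights (the question says the rest can go).
    positive = [(item, w) for item, w in item_weights.items() if w > 0]
    if not positive:
        raise ValueError("no item has a positive weight")
    items, weights = zip(*positive)
    return items[weighted_choice(weights)]

# Illustrative data, not taken from the question.
sample = {"a": 1500, "b": 10, "c": -100, "d": 0}
print(get_item(sample))
```

With these sample weights, "a" should come back far more often than "b", and "c" and "d" should never be returned.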

KL-7
Bogdan
  • [Answers on SO should be self-contained](http://meta.stackexchange.com/questions/18669/should-posts-be-self-contained), so please consider incorporating the essence of the linked article in your answer. – Sven Marnach Feb 13 '12 at 12:36
  • `weights` needs to be sorted? – critrange Mar 03 '22 at 15:10
10

Python 3.6 introduced random.choices(), which accepts weights directly:

import random

def get_item(items, items_weights):
    return random.choices(items, weights=items_weights)[0]
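A minimal usage sketch, assuming the items live in a dict as in the question: `random.choices` also takes a `k` argument and samples with replacement, so several draws can be made in one call. The helper name `get_items` and the sample data are made up for illustration.

```python
import random
from collections import Counter

def get_items(item_weights, k=1):
    """Hypothetical helper: draw k items (with replacement) from a dict."""
    # Drop non-positive weights first; random.choices expects non-negative ones.
    positive = {item: w for item, w in item_weights.items() if w > 0}
    return random.choices(list(positive), weights=list(positive.values()), k=k)

# Illustrative data, not taken from the question.
sample = {"a": 1500, "b": 10, "c": 300, "d": -100}
draws = get_items(sample, k=1000)
print(Counter(draws).most_common())
```

Tallying the draws with `Counter` is a quick way to check that the observed frequencies roughly follow the weights.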
Will Da Silva
m.elahi
3

You should draw a random number between 0 and the sum of the weights (which must all be positive). Then you look up the item in a list using bisect: http://docs.python.org/library/bisect.html (the standard bisect module).

import random
import bisect

weight = {'a': 0.3, 'b': 3.2, 'c': 2.4}
items = list(weight)  # list() so the keys can be indexed in Python 3
mysum = 0
breakpoints = []  # cumulative sums of the weights
for i in items:
    mysum += weight[i]
    breakpoints.append(mysum)

def getitem(breakpoints, items):
    # Draw a score in [0, total) and find the first breakpoint above it
    score = random.random() * breakpoints[-1]
    i = bisect.bisect(breakpoints, score)
    return items[i]

print(getitem(breakpoints, items))
jimifiki
2

It's easier if the weights are non-negative. If you must have negative weights, you'll have to offset every weight by the lowest possible weight. In your case, offsetted_weight = itemweight + 100

In pseudocode, it goes like this:

Calculate the sum of all the weights.
Draw a random number from 0 (inclusive) to the sum of the weights (exclusive)
Set i to 0
Repeat
    Subtract the weight of the item at index i from the random number
    If the random number is < 0 return item[i]
    Add 1 to i
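One possible Python rendering of the steps above, as a sketch only: the function name `pick_by_subtraction` and the `offset` parameter are assumptions introduced here, with `offset=100` mirroring the shift suggested for the question's data. Note that an item whose shifted weight is exactly 0 can never be chosen.

```python
import random

def pick_by_subtraction(items, weights, offset=0):
    """Pick an item with probability proportional to (weight + offset)."""
    shifted = [w + offset for w in weights]
    total = sum(shifted)
    if total <= 0 or any(w < 0 for w in shifted):
        raise ValueError("shifted weights must be non-negative with a positive sum")

    remaining = random.random() * total  # lies in [0, total)
    last_positive = None
    for item, w in zip(items, shifted):
        if w > 0:
            last_positive = item
        remaining -= w
        if remaining < 0:
            return item
    # Only reachable through floating-point rounding; fall back to the
    # last item that had a positive weight.
    return last_positive

# Illustrative data: the lowest weight is -100, so shift everything by 100.
print(pick_by_subtraction(["x", "y", "z"], [-100, 10, 1500], offset=100))
```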
Dervall
  • Negative weights came into existence at the beginning, when every item had weight 1. But it's not necessary to have negative weights; I can remove the negative ones. – xralf Feb 13 '12 at 12:19
-2

If you're storing your data in a database, you can use SQL:

SELECT * FROM table ORDER BY weight*random() DESC LIMIT 1
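A small Python simulation can show how this ordering behaves. This is only a sketch under assumptions: `pick_scaled` mimics the query by taking the largest `weight * random()`, while `pick_powered` uses the key `random() ** (1 / weight)`, a different technique (often attributed to Efraimidis and Spirakis) that is not part of this answer. All function names are hypothetical. For weights 1 and 2, a proportional pick should return the first item about 1/3 of the time; the scaled key returns it about 1/4 of the time, so the selection is weighted but not proportional to the weights.

```python
import random

def pick_scaled(weights, rng):
    """Mimic ORDER BY weight*random() DESC LIMIT 1: largest scaled key wins."""
    return max(range(len(weights)), key=lambda i: weights[i] * rng.random())

def pick_powered(weights, rng):
    """Alternative key random() ** (1 / weight); weights must be positive."""
    return max(range(len(weights)), key=lambda i: rng.random() ** (1.0 / weights[i]))

def frequency_of_first(pick, weights, trials=100_000, seed=0):
    """Estimate how often index 0 is selected by the given pick function."""
    rng = random.Random(seed)
    hits = sum(pick(weights, rng) == 0 for _ in range(trials))
    return hits / trials

weights = [1, 2]
print("scaled key: ", frequency_of_first(pick_scaled, weights))
print("powered key:", frequency_of_first(pick_powered, weights))
```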
Tim Pietzcker
groovekiller
  • Neat, but it's SQL and the question is marked with python tag. Anyway, I like the idea. – KL-7 Feb 13 '12 at 11:56
  • @copperttim Actually I like it because I use `sql` and your solution looks quite good and usable at the first sight. – xralf Feb 13 '12 at 12:28
  • From your description I thought you were using SQL. Hope it works as you need. – groovekiller Feb 13 '12 at 12:33
  • I removed my downvote, but your answer seems to have the same flaw as mine (as pointed out by @SvenMarnach). – Tim Pietzcker Feb 13 '12 at 13:35
  • I see it too, but flawed answers are good for learning here. It's a pity to delete them. A `warning` before them would be sufficient. – xralf Feb 13 '12 at 13:52