Choosing items from a list on a percentual basis

Question

I have a list of around 40 strings and want to assign every item a weight / percentage. At runtime I want a randomizer to pick an item from the list according to its percentage, meaning that over a large enough sample size the number of times an item gets picked would correspond to its assigned percentage. A problem I'm facing is that in the future I might want to extend the list and would then have to assign new percentages to the other items. What would be the best way to save this list and assign weights to individual items?

I can think of some ways to implement this but they are all rather quick & dirty so I was hoping somebody has a design pattern in mind. I'm working in Python but since this is conceptual I'm not really fishing for explicit examples.

Thank you so much for your help.

Does the percentage of each item depend on the rest of the list? It's not exactly clear what you are trying to do. Perhaps one of your ideas would clarify it more? — jmccarthy, Apr 08 '11 at 17:23
Do you need a 'pure' python solution, or could it be based on, for example `scipy` and/or `numpy` (see http://scipy.org/)? Thanks — eat, Apr 08 '11 at 17:24
Yes the percentage of each item would depend on the rest of the list. Ideally I would add a new item with a 'weight' and the weight that I assign to this item gets deducted equally from all other items in the list. But I guess that would be an advanced solution. — Daniel Richter, Apr 08 '11 at 17:30
Pure python is preferred but if a numpy / scipy solution produces significantly better results (without the need to dust off my calculus books) I'd give it a shot — Daniel Richter, Apr 08 '11 at 17:31
I seem to be late (since you accepted already) but, yes, `scipy/numpy` would indeed let you work with your `empirical distribution function` in a very straightforward manner. Far less coding and very good performance (if that counts). Thanks — eat, Apr 08 '11 at 18:13
Very interesting eat, I will definitely look into it! Thanks — Daniel Richter, Apr 09 '11 at 08:15
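
The "deduct equally from all other items" idea from the comments above can be sketched in pure Python. This is only an illustration under stated assumptions: the percentages live in a dict that sums to 100, and the function name `add_item` and the item names are hypothetical, not something from the thread.

```python
def add_item(percentages, name, pct):
    """Return a new dict with `name` at `pct`, taking pct / n from each existing item.

    Assumes `percentages` is a non-empty dict of item -> percentage.
    """
    if not percentages:
        raise ValueError("need at least one existing item to deduct from")
    if name in percentages:
        raise ValueError(f"{name!r} is already in the list")
    share = pct / len(percentages)  # equal deduction per existing item
    if any(p < share for p in percentages.values()):
        raise ValueError("an existing item is too small to give up an equal share")
    updated = {item: p - share for item, p in percentages.items()}
    updated[name] = pct
    return updated


if __name__ == "__main__":
    # Hypothetical starting percentages.
    current = {"a": 50, "b": 30, "c": 20}
    print(add_item(current, "d", 15))
```

One caveat with equal deduction is that small items can run out of percentage to give, which is why the sketch raises an error in that case. Storing relative weights and normalizing at pick time avoids the problem.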

score 3 · Accepted Answer · edited May 23 '17 at 11:47


Check out this page: Weighted random generation in Python

Edit: See this also (on SO): A weighted version of random.choice
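
For readers who cannot follow the links, here is a minimal sketch of weighted selection with the standard library. It assumes Python 3.6 or later, where `random.choices` accepts a `weights` argument. The item names, the weights and the helper name `pick` are made up for illustration.

```python
import random
from collections import Counter

# Hypothetical items with relative weights; they do not have to sum to 100.
weights = {"apple": 50, "banana": 30, "cherry": 20}


def pick(weighted_items, rng=random):
    """Return one key, chosen with probability proportional to its weight."""
    items = list(weighted_items)
    item_weights = [weighted_items[item] for item in items]
    return rng.choices(items, weights=item_weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    counts = Counter(pick(weights, rng) for _ in range(10_000))
    print(counts)
```

Because the weights are relative, adding a new item only means adding one entry to the dict. The other entries keep their values and their effective percentages shrink proportionally.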

answered Apr 08 '11 at 17:26 by dusan

How perfect is that. Thank you dusan! – Daniel Richter Apr 08 '11 at 17:27

score 1 · Answer 2 · answered Apr 08 '11 at 17:29

One way to do it is to use a range as a dictionary key (perhaps as a 2-tuple) and the string as the value. Then you can use random.randint() to generate an integer in the range described by all the dictionary key values. Adding a new string is easy and its range shoves the others' ranges aside (shrinks their weights). If you don't want that to happen, then you have to re-weight everything anyway.

{
    (0,10): "First string",
    (11,50): "Second string",
    (51,73): "Third string"
}
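
A rough sketch of how the lookup against such a dictionary might work, assuming the ranges are inclusive and contiguous as in the dictionary above. The function name `pick_from_ranges` is hypothetical.

```python
import random

# Inclusive (low, high) integer ranges, copied from the dictionary above.
ranges = {
    (0, 10): "First string",
    (11, 50): "Second string",
    (51, 73): "Third string",
}


def pick_from_ranges(range_map, rng=random):
    """Draw an integer over the full span and return the string whose range holds it."""
    low = min(lo for lo, _ in range_map)
    high = max(hi for _, hi in range_map)
    n = rng.randint(low, high)  # randint includes both endpoints
    for (lo, hi), value in range_map.items():
        if lo <= n <= hi:
            return value
    raise ValueError(f"no range covers {n}; the ranges must be contiguous")


if __name__ == "__main__":
    print(pick_from_ranges(ranges))
```

Since both endpoints are included, the effective weights here are 11, 40 and 23 out of 74. A linear scan is fine for around 40 items.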

Thank you nmichaels that was the solution I initially thought of. Maybe it's the best approach. — Daniel Richter, Apr 08 '11 at 17:32
