1

I have a dictionary that looks like this.

mychoice = {0.7: 2, 0.2: 1, 0.1:3}

I will use the following to select which value to use. In the above, value 2 will be selected 70% of the time and value 1 will be selected 20% of the time and 3, 10% of the time.

What is the best method to use the following to generate a random number between 0 and 1 and to select randomly the value to use?

from random import random
ran = random()
if ran>.10 and <.30 then select value 1 with a key of .20

Thanks

Tampa
  • 75,446
  • 119
  • 278
  • 425
  • 3
    A dictionary isn't an appropriate data structure for this because it prevents you from assigning the same probability to two values. – Karl Knechtel Jun 03 '12 at 08:18
  • 2
    @KarlKnechtel I'd say a dict is okay, but the "weights" should be **values**, not **keys**. Anyway, good point. – Lev Levitsky Jun 03 '12 at 08:19
  • Similar to http://stackoverflow.com/q/3679694/222914 with a good answer in http://stackoverflow.com/a/4322940/222914 – Janne Karila Jun 03 '12 at 08:35

5 Answers5

4

Taking your example, with some modifications (swap key/value in the dict):

mychoice = {1: 0.2, 2: 0.7, 3:0.1}
current = 0
limits = {}

for key in mychoice:
    limits[key] = (current,current + mychoice[key])
    current = current + mychoice[key] #Next range should start at the end of current

#This should give a new dictionary: {1:(0,0.2),2:(0.2,0.9),3;(0.9,1)}

r = random.random() # float between 0 and 1

for key in limits:
    range = limits[key]
    if r >= range[0] and r < range[1]:
          return key
return None

This can be optimized, but you get the idea.

Samy Arous
  • 6,794
  • 13
  • 20
2

The first thing that comes to my mind: sort them and add them up.

Let's assume you've followed my advice and changed the structure of your dict like this:

mychoice = {2: 0.7, 1: 0.2, 3: 0.1}

Let's build a dict with accumulated weights:

temp = sorted(((v, w) for v, w in mychoice.items()), key = lambda x: x[1], reverse = True)
accum = [(val[0], sum(_[1] for _ in temp[:i+1])) for i, val in enumerate(temp)]

(that's a little messy, can someone optimize?)

Anyway, now you have accum as [(2, 0.7), (1, 0.9), (3, 1)]

So:

r = random.random()

for vw in accum:
    if vw[1] > r:
        print vw[0]
        break

EDIT: As astynax cleverly points out, there's no need to sort the weights, as the list of accumulated probabilities will be sorted anyway.

So we only need:

accum = ((k, sum(mychoice.values()[:i]))
    for i, k in enumerate(mychoice.keys(), 1))

And then generate a random value and get the result the same way as before.

Lev Levitsky
  • 63,701
  • 20
  • 147
  • 175
  • I probably should have looked at the link Janne Karila posted in the [comment](http://stackoverflow.com/questions/10868839/given-a-dictionary-of-weights-that-sum-to-1-how-to-use-random-to-select-a-weig#comment14161298_10868839). It's essentially the same answer. – Lev Levitsky Jun 03 '12 at 08:54
1
>>> d = {0.7: 2, 0.2: 1, 0.1:3}
>>> keys = [[k] * int(round(10*k)) for k in d.keys()]
>>> keys
[[0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7], [0.1], [0.2, 0.2]]
>>> import itertools
>>> keys = list(itertools.chain(*keys))
[0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.1, 0.2, 0.2]
>>> import random
>>> d[random.choice(keys)]
2
>>> d[random.choice(keys)]
2
>>> d[random.choice(keys)]
3

Alternative: To express probability of selection to a resolution of, say, 1 in a 1000:

>>> keys = [[k] * int(round(1000*k)) for k in d.keys()]
daedalus
  • 10,873
  • 5
  • 50
  • 71
0

This is a nice way to do it using numpys digitize and accumulate:

from random import random
import numpy as np

mychoice = {0.7: 2, 0.2: 1, 0.1: 3}

bins = np.add.accumulate(mychoice.keys())
for i in xrange(100):
    print mychoice.values()[np.digitize([random()], bins)[0]],

#Output:
1 2 3 2 2 2 2 2 2 1 1 3 3 2 2 2 2 2 2 2 1 3 2 2 3 2 1 2 1 2 2 2 2 2
1 2 2 2 2 3 3 2 1 1 2 2 1 1 3 2 2 2 2 2 1 2 2 2 1 1 1 2 2 2 2 2 2 2 
3 2 1 2 2 2 3 1 1 1 2 2 2 2 2 2 3 2 1 2 2 2 2 1 1 2 1 2 2 2 2 1 

As @Karl Knechtel points out a dict is not a suitable structure for doing this as you can't have repeated weights, but we'll use this as a starting point regardless. How to do it:

  1. First create bins using accumulate (using bins allows you to use repeated weights).
  2. Then we use digitize to see which bins random numbers fall into, and use this index for mychoice.values() (although mychoice is a dict the keys and values retain their order provided there are no insertions or deletions..).
fraxel
  • 34,470
  • 11
  • 98
  • 102
0

With d = {2: 0.7, 1: 0.2, 3: 0.1} which is more logical (different choices and their respective weights, which can repeat), you can use this random_weighter function, which accepts also weights that do not sum to 1.0.

import random

def random_weighted(d):
    r = random.random() * sum(d.itervalues())
    for k, v in d.iteritems():
        r = r - v
        if r <= 0.:
            return k 

d = {2: 0.7, 1: 0.2, 3: 0.1}
for i in xrange(10):
    print random_weighted(d),

prints (for example):

3 1 2 2 2 2 2 2 3 2
eumiro
  • 207,213
  • 34
  • 299
  • 261