2

I need to pick several random items from a weighted set. Items with a higher weight are more likely to get picked. I decided to model this after a lottery. I feel that my solution makes good C++, but I don't think it makes for good python.

What's the pythonic way of doing this?

def _lottery_winners_by_participants_and_ticket_counts(participants_and_ticket_counts, number_of_winners):
    """
    Returns a list of winning participants in a lottery. In this lottery,
    participant can have multiple tickets, and participants can only win
    once.
    participants_and_ticket_counts is a list of (participant, ticket_count)
    number_of_winners is the maximum number of lottery winners
    """

    if len(participants_and_ticket_counts) <= number_of_winners:
        return [p for (p, _) in participants_and_ticket_counts]

    winners = []

    for _ in range(number_of_winners):
        total_tickets = sum(tc for (_, tc) in participants_and_ticket_counts)
        winner = random.randrange(0, total_tickets)

        ticket_count_offset = 0
        for participant_ticket_count in participants_and_ticket_counts:
            (participant, ticket_count) = participant_ticket_count

            if winner < ticket_count + ticket_count_offset:
                winners.append(participant)
                participants_and_ticket_counts.remove(participant_ticket_count)
                break

            ticket_count_offset += ticket_count

    return winners

Edit: Sorry I forgot this earlier, but weight is an integer which could be in the thousands.


Edit: I think I have my final solution based on the comment of @Flo

Notes

  • I'm working in Python 2.7, so I created my own accumulate(). It works differently (and I think better) than accumulate() in Python 3. My version can accumulate from an iterable of tuples based on an add function.

  • I also have special knowledge that participants_and_ticket_counts is a mutable list and will not be used after _lottery_winners_by_participants_and_ticket_counts() is called. That's why I can pop() it.

Here's my solution:

def _lottery_winners_by_participants_and_ticket_counts(participants_and_ticket_counts, number_of_winners):
    """
    Returns a list of winning participants in a lottery. In this lottery,
    participant can have multiple tickets, and participants can only win once.
    participants_and_ticket_counts is a list of (participant, ticket_count)
    number_of_winners is the maximum number of lottery winners
    """
    def _accumulate(iterable, func):
        total = 0
        for element in iterable:
            total = func(total, element)
            yield total

    if len(participants_and_ticket_counts) <= number_of_winners:
        return list(winner for (winner, _) in participants_and_ticket_counts)

    winners = list()
    for _ in range(number_of_winners):
        accumulation = list(_accumulate(participants_and_ticket_counts, lambda total, ptc: total + ptc[1]))
        winning_number = random.randrange(0, accumulation[-1])
        index_of_winner = bisect.bisect(accumulation, winning_number)
        (winner, _) = participants_and_ticket_counts.pop(index_of_winner)
        winners.append(winner)
    return winners

Thanks to everyone for their help!

Jeffery Thomas
  • 42,202
  • 8
  • 92
  • 117
  • 5
    Check this bad boy out. Python has a random generator to which you can assign weights to. http://docs.python.org/py3k/library/random.html#examples-and-recipes – Florin Stingaciu Jun 05 '12 at 15:41
  • Take a look at [this question](http://stackoverflow.com/questions/10868839/given-a-dictionary-of-weights-that-sum-to-1-how-to-use-random-to-select-a-weig/10869010#10869010) (it also has links to older versions) – Lev Levitsky Jun 05 '12 at 15:42
  • 1
    @Flo I think the accumulate()/bisect() method will work for me. Thanks for the help. – Jeffery Thomas Jun 06 '12 at 12:28

3 Answers3

4

numpy.random.choice has a nice solution to this. Here's how you can use it:

>>> import numpy as np
>>> from numpy.random import choice
>>> names = ['Harry', 'Sally', 'Joe', 'Bob', 'Angela', 'Jack', 'Jill', 'Jeff']
>>> weights = [1,4,6,3,5,7,10,14]
>>> p = np.array(weights, dtype=float) / sum(weights)
>>> p
array([ 0.02,  0.08,  0.12,  0.06,  0.1 ,  0.14,  0.2 ,  0.28])

>>> choice(names, size=5, p=p)
array(['Jill', 'Jack', 'Jeff', 'Jeff', 'Angela'], 
      dtype='|S6')
>>> choice(names, size=5, p=p)
array(['Jill', 'Jack', 'Joe', 'Jill', 'Sally'], 
      dtype='|S6')
>>> choice(names, size=5, p=p)
array(['Jack', 'Angela', 'Joe', 'Sally', 'Jill'], 
      dtype='|S6')

However, this function was added in numpy 1.7. If you have an older version, you can just copy the function: http://pastebin.com/F5gti0qJ

jterrace
  • 64,866
  • 22
  • 157
  • 202
2

How's this?

def lottery(participant_and_ticket_count, number_of_winners):
    # Creates list where each person is represented multiple times based on the number of tickets they have.
    population = [person for (person, count) in participant_and_ticket_count for i in range(count)]

    winners = []

    for i in range(number_of_winners):
        try:
            winner = random.choice(population)
        except IndexError:
            # There aren't enough people in the lottery, so return the results early.
            return winners
        winners.append(winner)

        # Remove the winner from the lottery to prevent duplication.
        population = [person for person in population if person != winner]

    return winners

Sample run:

>>> foo = [('Alex', 5),
           ('Betty', 1),
           ('Carl', 2),
           ('Daniella', 10)]
>>> lottery(foo, 2)
['Daniella', 'Alex']
>>> lottery(foo, 2)
['Alex', 'Daniella']
>>> lottery(foo, 2)
['Daniella', 'Betty']
>>> lottery(foo, 9)
['Daniella', 'Alex', 'Carl', 'Betty']
Michael0x2a
  • 58,192
  • 30
  • 175
  • 224
0
>>> from random import shuffle, choice
>>> 
>>> def lottery_winners(players, win_number):
    choosefrom = sum(([name] * count for name, count in players), [])
    shuffle(choosefrom)
    winners = []
    while len(winners) < win_number:
        choice = choosefrom.pop()
        if choice not in winners:
            winners.append(choice)
    return winners

>>> players = [('Alex', 5),
           ('Betty', 1),
           ('Carl', 2),
           ('Daniella', 10)]
>>> lottery_winners(players, 3)
['Alex', 'Carl', 'Daniella']
>>> lottery_winners(players, 3)
['Daniella', 'Alex', 'Carl']
>>> lottery_winners(players, 3)
['Carl', 'Betty', 'Daniella']
>>> lottery_winners(players, 2)
['Alex', 'Daniella']
>>> lottery_winners(players, 2)
['Carl', 'Daniella']
>>> 
Paddy3118
  • 4,704
  • 27
  • 38