I need to pick several random items from a weighted set. Items with a higher weight are more likely to get picked. I decided to model this after a lottery. I feel that my solution makes good C++, but I don't think it makes for good python.
What's the pythonic way of doing this?
def _lottery_winners_by_participants_and_ticket_counts(participants_and_ticket_counts, number_of_winners):
"""
Returns a list of winning participants in a lottery. In this lottery,
participant can have multiple tickets, and participants can only win
once.
participants_and_ticket_counts is a list of (participant, ticket_count)
number_of_winners is the maximum number of lottery winners
"""
if len(participants_and_ticket_counts) <= number_of_winners:
return [p for (p, _) in participants_and_ticket_counts]
winners = []
for _ in range(number_of_winners):
total_tickets = sum(tc for (_, tc) in participants_and_ticket_counts)
winner = random.randrange(0, total_tickets)
ticket_count_offset = 0
for participant_ticket_count in participants_and_ticket_counts:
(participant, ticket_count) = participant_ticket_count
if winner < ticket_count + ticket_count_offset:
winners.append(participant)
participants_and_ticket_counts.remove(participant_ticket_count)
break
ticket_count_offset += ticket_count
return winners
Edit: Sorry I forgot this earlier, but weight is an integer which could be in the thousands.
Edit: I think I have my final solution based on the comment of @Flo
Notes
I'm working in Python 2.7, so I created my own accumulate(). It works differently (and I think better) than accumulate() in Python 3. My version can accumulate from an iterable of tuples based on an add function.
I also have special knowledge that participants_and_ticket_counts is a mutable list and will not be used after _lottery_winners_by_participants_and_ticket_counts() is called. That's why I can pop() it.
Here's my solution:
def _lottery_winners_by_participants_and_ticket_counts(participants_and_ticket_counts, number_of_winners):
"""
Returns a list of winning participants in a lottery. In this lottery,
participant can have multiple tickets, and participants can only win once.
participants_and_ticket_counts is a list of (participant, ticket_count)
number_of_winners is the maximum number of lottery winners
"""
def _accumulate(iterable, func):
total = 0
for element in iterable:
total = func(total, element)
yield total
if len(participants_and_ticket_counts) <= number_of_winners:
return list(winner for (winner, _) in participants_and_ticket_counts)
winners = list()
for _ in range(number_of_winners):
accumulation = list(_accumulate(participants_and_ticket_counts, lambda total, ptc: total + ptc[1]))
winning_number = random.randrange(0, accumulation[-1])
index_of_winner = bisect.bisect(accumulation, winning_number)
(winner, _) = participants_and_ticket_counts.pop(index_of_winner)
winners.append(winner)
return winners
Thanks to everyone for their help!