143

I'm working on an AI portion of a guessing game. I want the AI to select a random letter from this list. I'm doing it as a set so I can easily remove letters from the list as they are guessed in the game and are therefore no longer available to be guessed again.

it says set object isn't indexable. How can I work around this?

import random 
aiTurn=True

while aiTurn == True:
    allLetters = set(list('abcdefghijklmnopqrstuvwxyz'))
    aiGuess=random.choice(allLetters)



    print (aiGuess) 
Alec
  • 8,529
  • 8
  • 37
  • 63
jamyn
  • 1,683
  • 3
  • 15
  • 13
  • 4
    Incidentally you don't need to use set(list('string')) to get a set of letters since strings are iterable by themselves -- set('abc') will do what you want. – Scott Ritchie Dec 29 '14 at 16:42
  • 9
    For others encountering this problem, it's worth looking at this question about how to create a set-like object which allows efficient random selection. The options given here are all O(N). http://stackoverflow.com/q/15993447/2966723 – Joel Aug 18 '16 at 02:34

6 Answers6

145

Note (Oct. 2020): as of v3.9, Python has officially deprecated random.sample() working on sets, with the official guidance being to explicitly convert the set to a list or tuple before passing it in, though this doesn't solve the efficiency problems.


>>> random.sample(set('abcdefghijklmnopqrstuvwxyz'), 1)
['f']

Documentation: https://docs.python.org/3/library/random.html#random.sample

Note that choosing random elements from a set is extremely inefficient no matter how you do it - it takes time proportional to the size of the set, or worse if the set's underlying hash table is sparse due to removed elements.

Instead, you should probably use a different data structure that supports this operation efficiently.

user2357112
  • 260,549
  • 28
  • 431
  • 505
NPE
  • 486,780
  • 108
  • 951
  • 1,012
  • 12
    Tack on a `[0]` at the end so it's basically identical to `random.choice` (which doesn't return it's values in the form of a list) – Nick T Feb 19 '14 at 20:57
  • 37
    `random.sample` does `tuple(population)` internally, so `random.choice(tuple(allLetters))` may be better. – utapyngo Apr 15 '14 at 12:42
  • 30
    It should be highlighted that this process is O(N). – Joel Aug 18 '16 at 02:33
  • @Joel Why is this process is O(N)? – ManuelSchneid3r Mar 14 '17 at 19:25
  • @ManuelSchneid3r 2 reasons. 1) `random.sample` converts the set into a list or tuple and then randomly selects an element from it, so to convert a set to a list takes O(N) operations. This occurs any time you call it with a set. 2) in this particular case, since this starts from a string, converting it to a set also takes O(N). – Joel Mar 14 '17 at 20:00
  • 3
    I think it's really inefficient... As you can see https://github.com/python/cpython/blob/2.7/Lib/random.py#L332-L339 the sample function creates a list from the set every time you make the above call and takes a random element from it. Suppose you have a large set and you want to make a lot of samples. If the set doesn't change it's better to convert it to a list and use `random.choice`. If the set also changes while you sample it then probably you shouldn't use a set at all. If you would know the occupied hashes in the set and the bucket sizes it would be easy to write a sampling function... – jakab922 Mar 21 '17 at 16:56
  • 3
    Note that as of 3.9 Python has [officially deprecated](https://docs.python.org/3.9/library/random.html#random.sample) `random.sample()` working on sets, with the official guidance being to explicitly convert the set to a list or tuple before passing it in. – Amber Dec 08 '20 at 01:52
  • Sampling from a set is not "inefficient no matter how you do it". Python sets use linear probing, so you can simply repeatedly pick random positions in the array until you find one that is set. This completes in expected constant time. – Thomas Ahle Mar 05 '21 at 11:37
  • [This issue](https://bugs.python.org/issue40325) led to the mentioned deprecation, if you're interested in details. – Jeyekomon Mar 03 '22 at 15:29
76

You should use random.choice(tuple(myset)), because it's faster and arguably cleaner looking than random.sample. I wrote the following to test:

import random
import timeit

bigset = set(random.uniform(0,10000) for x in range(10000))

def choose():
    random.choice(tuple(bigset))

def sample():
    random.sample(bigset,1)[0]

print("random.choice:", timeit.timeit(choose, setup="global bigset", number=10000)) # 1.1082136780023575
print("random.sample:", timeit.timeit(sample, setup="global bigset", number=10000)) # 1.1889629259821959

From the numbers it seems that random.sample takes 7% longer.

a_guest
  • 34,165
  • 12
  • 64
  • 118
Scott Ritchie
  • 1,786
  • 1
  • 13
  • 12
  • 3
    On my machine, random.choice is 7 times faster. – noɥʇʎԀʎzɐɹƆ Jun 19 '16 at 19:42
  • 7
    There's no way to select directly from set, without having to copy it into tuple? – Youda008 Dec 19 '17 at 09:20
  • I get sample being roughly 12% (250 ms) slower than choose on a set of 5000 elements. – Simon Feb 03 '18 at 11:55
  • 1
    On my machine, `random.sample` goes from being slower than `random.choice` to being faster than it as the set size grows (the crossover point is somewhere between set size 100k-500k). That is, the larger the set, the more likely it is for `random.sample` to be faster. – jakee Jan 31 '19 at 16:11
1

Since the choice list is not very long, you can use random.shuffle the list first. Then iterate each element from the list. This avoid removing element from a list one by one and make your code cleaner.

0

You could work around this by using a list instead of a set. You will still be able to remove letters "easily" from the list. Try this, for example:

allLetters = list('abcdefghijklmnopqrstuvwxyz')
aiGuess = random.choice(allLetters)
allLetters.remove(aiGuess)

Another option is to randomly choose the index instead of the letter, which might be slightly faster since we don't need to search for the element to delete (but I'd question if speed actually matters here?):

allLetters = list('abcdefghijklmnopqrstuvwxyz')
index = random.randint(0, len(allLetters)-1) # Top is inclusive, unlike slices
aiGuess = allLetters[index]
del allLetters[index]
Moot
  • 2,195
  • 2
  • 17
  • 14
-1

You can combine a doubly-linked list and a dictionary, to create a set with O(1) random choice.

serg06
  • 2,027
  • 1
  • 18
  • 26
-4

If you want to get a random element from the set.

a = set()
for i in range(10):
    a.add(i)
a.pop() // gives a random element from a set
Sai
  • 25
  • 1
  • You need to add it back tho, `sample = a.pop()` then `a.add(sample)` – rj487 Nov 14 '21 at 22:47
  • 8
    This does not give random element from the set. – BrainDead Feb 09 '22 at 15:32
  • Yes it does. From the API: https://docs.python.org/3/c-api/set.html#c.PySet_Pop "Return a new reference to an arbitrary object in the set, and removes the object from the set. Return NULL on failure. Raise KeyError if the set is empty. Raise a SystemError if set is not an instance of set or its subtype." I'm assuming your beef is that there's no guarantees of pseudo-indeterminism, which you should specify. It's a perfectly good API method that will fulfill "random" requirement for various usecases. – user3724404 Mar 26 '22 at 21:11
  • 6
    @user3724404 You cannot assume 'arbitrary' as 'random'. In earlier version of python, the order of pop is definite, in recent python version, the set object(also for dict) is insertion ordered. in both case, the result of pop is predictable, thus not random. I can be 100% sure the popped element in your answer is 0. – imkzh Apr 13 '22 at 07:58