1

I was wondering which of shuffle() and choice() was more efficient. For example, I have a N element list long_list and need a random item from the list, is it quicker to do:

shuffle(long_list)
return long_list[0]

or

return choice(long_list)
Alan Kavanagh
  • 9,425
  • 7
  • 41
  • 65
Ollu_
  • 101
  • 1
  • 10
  • 1
    [Based on the code](https://hg.python.org/cpython/file/3.5/Lib/random.py#l259) it looks like `random.choice()` – Alan Kavanagh Mar 03 '18 at 19:44
  • @AK47 - depends on the situation. if you need consecutive 1000 randoms *somewhen* I would shuffle the whole list and yield myself a new value whenever I need one. If I need only one, I choose choice. Very cool link though. – Patrick Artner Mar 03 '18 at 19:45
  • `random.choice()` will always generate a random item in constant time according to [the comments/answers on this question](https://stackoverflow.com/questions/40143157/q-what-is-big-o-complexity-of-random-choicelist-in-python3?rq=1).. so even if you shuffle the list and then choose items, the shuffle will add extra time and all of the choices will be constant time? – Alan Kavanagh Mar 03 '18 at 19:48
  • @AK47 I would not _choose_ one, I would yield one whenever I need one :) thats constant as well. And I would not get any duplicates from it. 10 times choice might get me 10 times the same thing (if youre very very unlucky) – Patrick Artner Mar 03 '18 at 19:48
  • If the list is large, then `random.shuffle()` will take a lot of extra time to execute - and then all of the lookups will be constant. So `random.choice()` will always be faster on it's own, however, for smaller lists this probably wont be noticible – Alan Kavanagh Mar 03 '18 at 19:53

2 Answers2

4

random.choice() will always be faster as it does lookups in constant time O(1)

random.shuffle() requires iterating over the full sequence to execute the shuffle and this would be O(n), and then your lookup which is constant time O(1)

The source code is available to read here

Alan Kavanagh
  • 9,425
  • 7
  • 41
  • 65
  • The official code repository for CPython these days is [on github](https://github.com/python/cpython). The site you linked to is the older mercurial repository, which doesn't appear to be getting updates any more. – Blckknght Mar 03 '18 at 21:04
3

It will surely be faster to use random.choice.

You can figure this out pretty easily by looking at the code. (The random module in cpython is mostly implemented in Python, only the low-level random number generation is in C.) Here are the relevant parts, which happen to be right next to each other:

def choice(self, seq):
    """Choose a random element from a non-empty sequence."""
    try:
        i = self._randbelow(len(seq))
    except ValueError:
        raise IndexError('Cannot choose from an empty sequence') from None
    return seq[i]

def shuffle(self, x, random=None):
    """Shuffle list x in place, and return None.
    Optional argument random is a 0-argument function returning a
    random float in [0.0, 1.0); if it is the default None, the
    standard random.random will be used.
    """

    if random is None:
        randbelow = self._randbelow
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = randbelow(i+1)
            x[i], x[j] = x[j], x[i]
    else:
        _int = int
        for i in reversed(range(1, len(x))):
            # pick an element in x[:i+1] with which to exchange x[i]
            j = _int(random() * (i+1))
            x[i], x[j] = x[j], x[i]

The choice method only generates one random number, and uses it to index into the sequence it was given. The shuffle method, on the other hand, loops over the length of the sequence and swaps elements around as it goes. So choice will take O(1) time, while shuffle takes O(N) time.

In most cases, if you just need one value, you want to use random.choice. If you want several non-repeating selections, use random.sample. Only if you're going to use all the values should you use random.shuffle. Note that even if you do expect to select many items, you should still use random.choice if you want it to be possible for the same item to be picked more than once. Sampling and shuffling will never repeat items (unless the items were repeated in the input list).

Blckknght
  • 100,903
  • 11
  • 120
  • 169