3

I want to choose a random element (uniform distribution) from a python set in O(1) time. Is this possible? I've seen it suggested that one first convert the set to a list then select the random element from the list however this will take O(n) time where n is the size of the set. If this is not possible, what is a reasonably fast alternative?

martineau
  • 119,623
  • 25
  • 170
  • 301
Mathew
  • 1,116
  • 5
  • 27
  • 59
  • 1
    You can use a dict and keep the keys and vals the same, then call `random.choice` on that. What's your use case--is a set required? As far as I know, I don't think there's a way to do better than O(n) but maybe I'm mistaken since intuitively it seems like it should be no problem. – ggorlen Sep 13 '20 at 23:21
  • I don't think you can randomly select from a set in O(1). Using a list seems like a reasonable approach. Unless your set is huge, it doesn't seem like it should be a problem. – khelwood Sep 13 '20 at 23:40
  • 2
    [Time complexity](https://wiki.python.org/moin/TimeComplexity), from the Python wiki – jsmart Sep 14 '20 at 00:02
  • 1
    If you really need a set data-structure, there are a number of third-party `orderedset` implementations available — see [PyPi](https://pypi.org), and it should be possible to remove a random element from one that supports indexing. Also see [Does Python have an ordered set?](https://stackoverflow.com/questions/1653970/does-python-have-an-ordered-set) – martineau Sep 14 '20 at 00:12
  • Are you trying to have one data structure that exposes set-like operations as well as a `random()` one? – Alexandru Barbarosie Sep 14 '20 at 00:29

1 Answers1

3

I think it is impossible to get a random element from a hashtable (how a set is implemented) because it does not support random access. random.choice needs random access for this reason. You have a good alternative in set.pop, but it doesn't appear to be uniform (see https://github.com/python/cpython/blob/master/Objects/setobject.c#L616).

As long as it's not performance critical in a tight loop, converting to a list should be fine. However, if it does matter a lot, maybe you can consider using a different data structure in the first place.

iz_
  • 15,923
  • 3
  • 25
  • 40