116

Say I have a list x with unkown length from which I want to randomly pop one element so that the list does not contain the element afterwards. What is the most pythonic way to do this?

I can do it using a rather unhandy combincation of pop, random.randint, and len, and would like to see shorter or nicer solutions:

import random
x = [1,2,3,4,5,6]
x.pop(random.randint(0,len(x)-1))

What I am trying to achieve is consecutively pop random elements from a list. (i.e., randomly pop one element and move it to a dictionary, randomly pop another element and move it to another dictionary, ...)

Note that I am using Python 2.6 and did not find any solutions via the search function.

4b0
  • 21,981
  • 30
  • 95
  • 142
Henrik
  • 14,202
  • 10
  • 68
  • 91
  • 4
    I'm not much of a Pythonista, but that sure looks pretty good to me. – Matt Ball Apr 06 '12 at 19:12
  • a detailed time complexity analysis has been performed by me, see my answer somewhere down the road. SHUFFLE is NOT EFFICIENT ! but you can still use if you need to change order of items somehow. if pop(0) concerns you, use dequeue, mentioned in my analysis. – nikhil swami Nov 12 '20 at 00:12
  • O(2) time complexity for the answer ive written. wrap it in a function for quick use. please note that any list.pop(n) other than list.pop(-1) takes O(n). – nikhil swami Nov 13 '20 at 21:57

9 Answers9

130

What you seem to be up to doesn't look very Pythonic in the first place. You shouldn't remove stuff from the middle of a list, because lists are implemented as arrays in all Python implementations I know of, so this is an O(n) operation.

If you really need this functionality as part of an algorithm, you should check out a data structure like the blist that supports efficient deletion from the middle.

In pure Python, what you can do if you don't need access to the remaining elements is just shuffle the list first and then iterate over it:

lst = [1,2,3]
random.shuffle(lst)
for x in lst:
  # ...

If you really need the remainder (which is a bit of a code smell, IMHO), at least you can pop() from the end of the list now (which is fast!):

while lst:
  x = lst.pop()
  # do something with the element      

In general, you can often express your programs more elegantly if you use a more functional style, instead of mutating state (like you do with the list).

Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • 4
    So a better (faster) idea would be to use `random.shuffle(x)` and then `x.pop()`? I do not understand how to do this "functional"? – Henrik Apr 06 '12 at 19:21
  • @Henrik: I don't know what you're trying to do, so I can't really tell. You should add more information to the question, or just comment here what you want to achieve :) This seems to be a case of the [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem)... – Niklas B. Apr 06 '12 at 19:22
  • I have a list of elements from which I want to consecutively pop random elements. – Henrik Apr 06 '12 at 19:22
  • @Henrik: But why? Is the remaining list important or only the popped item? We need more context to give good advise ;) If you only need the popped elements and the remaining list is not needed, you should just use `random.shuffle(x)` and `x.pop()` afterwards (which is fast) – Niklas B. Apr 06 '12 at 19:23
  • I need the element for something (in another dictionary), then I need another random element (but not the first one) to move it to another dictionary, and so on. – Henrik Apr 06 '12 at 19:26
  • @Henrik: Please have a look at my edit and tell me if that's what you're trying to do. Your comment is still a bit unclear to me :) – Niklas B. Apr 06 '12 at 19:28
  • Thanks. I think the idea with `shuffle` and `pop` is what I actually need. – Henrik Apr 06 '12 at 19:31
  • @Henrik: Probably it would be better to just iterate over the list after shuffling it. By the way, the `zip` stuff is what I consider functional style (and seems quite appropriate in your case). – Niklas B. Apr 06 '12 at 19:32
  • Sorry. You are right. What do you mean with `zip` stuff? Never heard of that. – Henrik Apr 06 '12 at 19:37
  • 1
    @Henrik: If you have two collections (for example, a list of dictionaries and a list of random numbers) and you want to iterate them at the same time, you can `zip` them to get a list of (dict, number) pairs. You said something about multiple dictionaries of which you want to associate each with a random number. `zip` is perfect for this – Niklas B. Apr 06 '12 at 19:48
  • 4
    I'm supposed to add a post when I down-vote. There are times when you need to remove an item from the middle of a list...I have to do it right now. No choice: I have an ordered list, I have to remove an item in the middle. It sucks, but the only other choice is to do a heavy code refactoring for one semi-rare operation. The issue is one of implementation of [ ], which SHOULD be efficient for such operations, but is not. – Mark Gerolimatos Nov 30 '16 at 17:58
  • @MarkGerolimatos Your frustration with Python should have nothing to do with how you assess the quality of this post. Sure, if you *need* to remove something from the middle of an array, you can do it, and whether it is a performance bottleneck or not depends on a whole bunch of factors. This answer obviously does not apply if there is literally no other way to solve a task than to pop from the middle of an array. – Niklas B. Nov 30 '16 at 19:47
  • 6
    @NiklasB. The OP was using random as an example (frankly, it should have been left off, it clouded the issue). "Don't do that" is insufficient. A better answer would have been to suggest a Python data structure that DOES support such operations while providing SUFFICIENT access speed (clearly not as good as arra...er...list). In python 2, I could not find one. If I do, I will answer with that. Note that due to a browser mishap, I was unable to add that in to my original comment, I should have added a secondary comment. Thank you for keeping me honest :) – Mark Gerolimatos Nov 30 '16 at 20:36
  • 1
    @MarkGerolimatos There is no data structure with both efficient random access and insert/delete in the standard library. You probably want to use something like https://pypi.python.org/pypi/blist I would still argue that in a lot of use cases this can be avoided – Niklas B. Nov 30 '16 at 20:50
  • @MarkGerolimatos `del L[i]` works for ordinary lists too. It is ˋO(n)ˋ operation (slow for large lists). Measure, if it matters in your case. You can [remove by value too](https://stackoverflow.com/a/2794519/4279). Beware, [there may be issues with removing an item while iterating over the list at the same time](https://stackoverflow.com/questions/6260089/strange-result-when-removing-item-from-a-list-while-iterating-over-it) – jfs Apr 15 '23 at 03:21
65

You won't get much better than that, but here is a slight improvement:

x.pop(random.randrange(len(x)))

Documentation on random.randrange():

random.randrange([start], stop[, step])
Return a randomly selected element from range(start, stop, step). This is equivalent to choice(range(start, stop, step)), but doesn’t actually build a range object.

Andrew Clark
  • 202,379
  • 35
  • 273
  • 306
21

To remove a single element at random index from a list if the order of the rest of list elements doesn't matter:

import random

L = [1,2,3,4,5,6]
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i]    # swap with the last element
x = L.pop()                  # pop last element O(1)

The swap is used to avoid O(n) behavior on deletion from a middle of a list.

jfs
  • 399,953
  • 195
  • 994
  • 1,670
13

despite many answers suggesting use random.shuffle(x) and x.pop() its very slow on large data. and time required on a list of 10000 elements took about 6 seconds when shuffle is enabled. when shuffle is disabled speed was 0.2s

the fastest method after testing all the given methods above was turned out to be written by @jfs

import random

L = [1,"2",[3],(4),{5:"6"},'etc'] #you can take mixed or pure list
i = random.randrange(len(L)) # get random index
L[i], L[-1] = L[-1], L[i]    # swap with the last element
x = L.pop()                  # pop last element O(1)

in support of my claim here is the time complexity chart from this source enter image description here


IF there are no duplicates in list,

you can achieve your purpose using sets too. once list made into set duplicates will be removed. remove by value and remove random cost O(1), ie very effecient. this is the cleanest method i could come up with.

L=set([1,2,3,4,5,6...]) #directly input the list to inbuilt function set()
while 1:
    r=L.pop()
    #do something with r , r is random element of initial list L.

Unlike lists which support A+B option, sets also support A-B (A minus B) along with A+B (A union B)and A.intersection(B,C,D). super useful when you want to perform logical operations on the data.


OPTIONAL

IF you want speed when operations performed on head and tail of list, use python dequeue (double ended queue) in support of my claim here is the image. an image is thousand words.

enter image description here

nikhil swami
  • 2,360
  • 5
  • 15
10

Here's another alternative: why don't you shuffle the list first, and then start popping elements of it until no more elements remain? like this:

import random

x = [1,2,3,4,5,6]
random.shuffle(x)

while x:
    p = x.pop()
    # do your stuff with p
Óscar López
  • 232,561
  • 37
  • 312
  • 386
  • 3
    @NiklasB. because we're removing elements from the list. If it's not absolutely necessary to remove elements, yes I agree with you: `[for p in x]` – Óscar López Apr 06 '12 at 19:34
  • Because it alters the list and if you just want to select half of the elements now and the other half later, you will have the remaining set later. – Henrik Apr 06 '12 at 19:34
  • @Henrik: Okay, that's why I asked you if you need the remaining list. You didn't answer that. – Niklas B. Apr 06 '12 at 19:35
5

I know this is an old question, but just for documentation's sake:

If you (the person googling the same question) are doing what I think you are doing, which is selecting k number of items randomly from a list (where k<=len(yourlist)), but making sure each item is never selected more than one time (=sampling without replacement), you could use random.sample like @j-f-sebastian suggests. But without knowing more about the use case, I don't know if this is what you need.

Dolf Andringa
  • 2,010
  • 1
  • 22
  • 36
3

One way to do it is:

x.remove(random.choice(x))
Simeon Visser
  • 118,920
  • 18
  • 185
  • 180
2

While not popping from the list, I encountered this question on Google while trying to get X random items from a list without duplicates. Here's what I eventually used:

items = [1, 2, 3, 4, 5]
items_needed = 2
from random import shuffle
shuffle(items)
for item in items[:items_needed]:
    print(item)

This may be slightly inefficient as you're shuffling an entire list but only using a small portion of it, but I'm not an optimisation expert so I could be wrong.

Noah McIlraith
  • 14,122
  • 7
  • 31
  • 35
1

This answer comes courtesy of @niklas-b:

"You probably want to use something like pypi.python.org/pypi/blist "

To quote the PYPI page:

...a list-like type with better asymptotic performance and similar performance on small lists

The blist is a drop-in replacement for the Python list that provides better performance when modifying large lists. The blist package also provides sortedlist, sortedset, weaksortedlist, weaksortedset, sorteddict, and btuple types.

One would assume lowered performance on the random access/random run end, as it is a "copy on write" data structure. This violates many use case assumptions on Python lists, so use it with care.

HOWEVER, if your main use case is to do something weird and unnatural with a list (as in the forced example given by @OP, or my Python 2.6 FIFO queue-with-pass-over issue), then this will fit the bill nicely.

Community
  • 1
  • 1
Mark Gerolimatos
  • 2,424
  • 1
  • 23
  • 33