In python, let's say we have the following list:
my_list = ['a', 'b', 'c', 'd', ...]
Would my_list.pop(3)
be more efficient than my_list.remove('d')
?
In python, let's say we have the following list:
my_list = ['a', 'b', 'c', 'd', ...]
Would my_list.pop(3)
be more efficient than my_list.remove('d')
?
According to this answer, the complexity of these two methods is:
pop O(n - i)
remove O(n)
with the list length n
and index of the element i
.
So, depending on the size of your list, pop
might be faster on the long run.
It doesn't matter much for small lists:
[wander@box ~]$ python -m timeit '[1, 2, 3].pop(1)'
10000000 loops, best of 3: 0.167 usec per loop
[wander@box ~]$ python -m timeit '[1, 2, 3].remove(2)'
10000000 loops, best of 3: 0.129 usec per loop
If your lists are really large, or if it takes a long time to compare elements in your lists, there might be more of a difference. Remove will be slower, as it has to go through (and compare) all the elements:
[wander@box ~]$ python -m timeit -n 100000 'list(range(1, 100)).pop(98)'
100000 loops, best of 3: 0.916 usec per loop
[wander@box ~]$ python -m timeit -n 100000 'list(range(1, 100)).remove(98)'
100000 loops, best of 3: 2.05 usec per loop
And that's with integers, which are quite fast to compare. If the elements in your lists have more interesting __eq__
methods, remove
can take a long time:
class Foo:
def __eq__(self, other):
time.sleep(1)
return False
[Foo(), Foo(), Foo(), Foo(), 20].remove(20)
So, if you know the index, go with pop
.
Looking at the actual C code (you meant for CPython, right?) for listpop
and listremove
, you can see that:
.remove
needs to iterate over the list (and therefore scales by O(i)
where i
is the index of the item);
.pop
takes shortcuts, as:
it's trivial to figure out whether the index is out of bounds (line 928), but .remove
has to check through the whole list to find its target (or fail to do so); and
it special-cases the index being the last item in the list (line 933);
listremove
has a call to PyObject_RichCompareBool
(line 2200), which is relatively expensive, because it needs to check whether the object at the current index is equal to the item; and
both (except in the special case for .pop
mentioned above) ultimately delegate to list_ass_slice
(lines 941 and 2202 - and no giggling at the back) once the appropriate slicing location is found; this has to shuffle the items in the rest of the array along, so will be O(n - i)
.
On that basis .pop
will be faster, especially for the last item in the list; however, if you've started with the item itself and have already had a O(n)
operation with a rich comparison to find its .index
, what you gain on the swings you already lost on the roundabout.
Also, the operation of rearranging everything that's left in the array (i.e. list_ass_slice
) to make up for what you've removed, which both .pop
and .remove
need to do, is likely to be far more expensive than figuring out which item to remove in the first place.
Note that, without diving into the source code, pretty much all of the above could be figured out just by thinking logically about what each operation involves.