33

I would like to know what is the best way/efficient way to remove element(s) from the list.

There are few functions provided by Python:

  1. some_list.remove(value), but it throws error if value is not found.
  2. some_list.pop(some_list[index]), removes the item at the given position in the list, and return it.
  3. del (some_list[index]), it removes element from the given index, it's different from pop as it doesn't return value.

Scenarios:

  • If you have few items to remove say one element or between 1 to 5.
  • If you have to remove multiple items in a sequence.
  • If you have to remove different items based on a condition.
  • How about if you have a list of lists and want to remove elements in sequence.
Jens
  • 8,423
  • 9
  • 58
  • 78
user3247054
  • 431
  • 1
  • 5
  • 9

4 Answers4

33

My answer is not exactly to your question but after you read this, I hope you can decide which type you need to choose for your needs.

Python’s lists are variable-length arrays, not Lisp-style linked lists. The implementation uses a contiguous array of references to other objects, and keeps a pointer to this array.

This makes indexing a list a[i] an operation whose cost is independent of the size of the list or the value of the index.

When items are appended or inserted, the array of references is resized. Some algorithm is applied to improve the performance of appending items repeatedly; when the array must be grown, some extra space is allocated so the next few times don’t require an actual resize i.e over-allocation. More Information

Removing vs Pop vs Delete:

At first glance it looks like all of them are doing the same thing.

Under the hood its behaving different.

removing : remove an element from the list by iterating from 0 index till the first match of the element is found. taking more time to iterate if the element is at the end.

pop : removing element from the list by using the index. taking less time.

del : is a python statement that removes a name from a namespace, or an item from a dictionary, or an item from a list by using the index.

REMOVE:

  • it removes the first occurence of value.
  • raises ValueError if the value is not present.
  • it takes only one argument, so you can't remove multiple value in one shot.

POP:

  • remove and return item at index (default last).
  • Raises IndexError if list is empty or index is out of range.
  • it takes only one argument, so you can't remove multiple value in one shot.

DEL:

  • remove the item at index and return nothing.
  • it can remove slices from a list or can clear the whole list.

Benchmark:

Worst case : deleting from the end of the list.

yopy:-> python -m timeit "x=range(1000)" "x.pop(999)"
100000 loops, best of 3: 10 usec per loop
yopy:-> python -m timeit "x=range(1000)" "x.remove(999)"
10000 loops, best of 3: 31.3 usec per loop
yopy:-> python -m timeit "x=range(1000)" "del x[999]"
100000 loops, best of 3: 9.86 usec per loop
yopy:->

Best case: begining of the list.

yopy:-> python -m timeit "x=range(1000)" "x.remove(1)"
100000 loops, best of 3: 10.3 usec per loop
yopy:-> python -m timeit "x=range(1000)" "x.pop(1)"
100000 loops, best of 3: 10.4 usec per loop
yopy:-> python -m timeit "x=range(1000)" "del x[1]"
100000 loops, best of 3: 10.4 usec per loop
yopy:->

Point to be noted:

if array grows or shrinks in the middle

  • Realloc still depends on total length.
  • But, All the trailing elements have to be copied

So, now I hope you can decide what you need to choose for your needs.

James Sapam
  • 16,036
  • 12
  • 50
  • 73
  • 2
    Note that the above code snippets work only for Python 2.x, where [`range()`](https://docs.python.org/2/library/functions.html#range) produces an actual list of integers. For Python 3.x, the function returns a [`Range`](https://docs.python.org/3.6/library/stdtypes.html#typesseq-range) object which is different than an actual sequence. – Jens Sep 13 '17 at 00:12
21

Use a list comprehension:

Scenario 1:

[item for item in my_list if 1 <= item <=5 ]

Scenario 2:

to_be_removed = {'a', '1', 2}
[item for item in my_list if item not in to_be_removed ]

Scenario 3:

[item for item in my_list if some_condition()]

Scenario 4(Nested list comprehension):

[[item for item in seq if some_condition] for seq in my_list]

Note that if you want to remove just one item then list.remove, list.pop and del are definitely going to be very fast, but using these methods while iterating over the the list can result in unexpected output.

Related: Loop “Forgets” to Remove Some Items

Community
  • 1
  • 1
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
  • Thanks for your answer. So for only removing one item are list.remove, list.pop better than del (list[index]). list.remove can throw value not found error etc? – user3247054 Feb 02 '14 at 11:54
  • 1
    @user3247054 You can use try-except to handle such errors. `list.pop` and `del` are almost the same, the only difference being that pop returns the value as well. – Ashwini Chaudhary Feb 02 '14 at 11:57
  • thanks for your reply. What about filters mentioned below are they better than list comprehension? – user3247054 Feb 02 '14 at 11:59
  • @user3247054 Another problem with filter is that it returns an iterator in Python3, so an additional `list()` call will be required there. `map` and `filter` can beat LC by a whisker only when used with a built-in function, not lambda. – Ashwini Chaudhary Feb 02 '14 at 12:04
  • Thanks. Lastly when should one use filter then? – user3247054 Feb 02 '14 at 12:08
  • 2
    @user3247054 With built-in functions, for example: `filter(None, my_list)`, `filter(str.isupper, my_list)` etc, but in Python 3 you'll have to use an additional list call as it returns an iterator there. To get an iterator in Python 2 you can use `itertools.ifilter`. – Ashwini Chaudhary Feb 02 '14 at 12:13
7

Use filter instead of list comprehension:

Scenario 1:

filter(lambda item: 1 <= item <= 5, my_list)

Scenario 2:

to_be_removed = {'a', '1', 2}
filter(lambda item: item not in to_be_removed, my_list)

Scenario 3:

filter(lambda item: some_condition(), my_list)

Scenario 4(Nested filtered list):

filter(lambda seq: filter(lambda item: some_condition(), seq), my_list) 

For some reason, it's the same thing as a list comprhension, but it's quite clear that we are filtering things instead of generating them.

Loïc Faure-Lacroix
  • 13,220
  • 6
  • 67
  • 99
  • Thanks for your answer? Are filters better than list comprehension in terms of efficiency, less error prone(error throw by code) etc. – user3247054 Feb 02 '14 at 11:57
  • 1
    @user3247054 No, `filter` with lambda is actually slower than a LC. – Ashwini Chaudhary Feb 02 '14 at 11:59
  • 1
    I'd say that the filter is less error prone, because it cannot modify items in the lists. The list comprehension is more like a swiss knife. – Loïc Faure-Lacroix Feb 02 '14 at 11:59
  • So when should one use filter then? – user3247054 Feb 02 '14 at 12:01
  • Accepted Ashwin's answer as he answered first, was something i was looking for and gave more explanation to my questions. Thanks to both of you, for your time and help. – user3247054 Feb 02 '14 at 12:15
  • 1
    @user3247054 In my test, filter is slowlier by a margin of 2 using lambdas. Just mentioning that list comprehension can be replaced by "map/filter/reduce" sometimes. I guess the speed problem could be improved much if the python dev really cared about it. – Loïc Faure-Lacroix Feb 02 '14 at 12:18
  • Also note that [`filter()`](https://docs.python.org/3/library/functions.html#filter) for Python 3.x returns an iterable, not a list. – Jens Sep 13 '17 at 01:49
7

Good question, and James’ answer is the only one with actual performance data for Python 2.x for some of the suggested approaches. (See also my comment on that question.)

To complete the picture for Python 3.x, here are a few more tests. Because a single test may modify its list, we need N lists to modify for N tests; therefore I’ve created the set of lists before running a test.

# Python 3.6.2 (default, Jul 18 2017, 14:13:41) 
>>> import timeit
>>> number = 10000   # Number of tests.
>>> # Generate `number` lists of 1000 integer elements.
>>> setup = """
... lists=[[_ for _ in range(1000)] for _ in range(10000)]
... i = 0
... """
>>>

All tests, whether they modify a list in place of generate a new one, iterate over that set of lists, to ensure that the conditions for the tests are the same. For simplicity’s sake, all tests remove a single element from the middle of the list.

Let’s start with the examples from the question using the built-in list() functions:

# remove()
>>> stmt = """
... l = lists[i]     # Get the current work list.
... l.remove(500)    # Remove element.
... i += 1           # On to the next list.
... """
>>> timeit.timeit(stmt, setup=setup, number=number)
0.08474616194143891

# pop()
>>> stmt = "l = lists[i]; l.pop(500); i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.01088976499158889

# index() and pop()
>>> stmt = "l = lists[i]; l.pop(l.index(500)); i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.08841867197770625

# del
>>> stmt = "l = lists[i]; del l[500]; i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.008702976978383958

# index() and del
>>> stmt = "l = lists[i]; del l[l.index(500)]; i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.08238211390562356

List comprehensions as outlined in Ashwini Chaudhary’s answer:

>>> stmt = "l = lists[i]; [_ for _ in l if _ != 500]; i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.44951551605481654

Using filter() as outlined in Loïc Faure-Lacroix’s answer. Note, however, that the examples in the above answer return a filter object for Python 3.x, and not a list like they do for Python 2.x!

# Generate a filter object.
>>> stmt = "l=lists[i]; filter(lambda _: _ != 500, l); i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
0.0031418869039043784

# Generate a list from the filter object.
>>> stmt = "l=lists[i]; list(filter(lambda _: _ != 500, l)); i += 1"
>>> timeit.timeit(stmt, setup=setup, number=number)
1.1863253980409354

Removing an element that does not exist using Python’s built-in functions requires an additional test; the list comprehension and the filter solution handle non-existing list elements gracefully.

# Catch a resulting exception.
>>> stmt = """
... l = lists[i]
... try:
...     del l[l.index(1234)]
... except ValueError:
...     pass
... i += 1
... """
>>> timeit.timeit(stmt, setup=setup, number=number)
0.1451275929575786

# Test if the element exists, then delete.
>>> stmt = """
... l = lists[i]
... if 1234 in l:
...     del l[l.index[1234]]
... i += 1
... """
>>> timeit.timeit(stmt, setup=setup, number=number)
0.13344507792498916

I hope I got this right.

Asclepius
  • 57,944
  • 17
  • 167
  • 143
Jens
  • 8,423
  • 9
  • 58
  • 78
  • This is awesome. Thank you. `del` seems like the overwhelming winner here! – Akaisteph7 Mar 04 '20 at 03:41
  • I will like to point out that your test don't appropriately compare list comprehension with `del` statement. For `del` statement, the test is probably fine as you are removing an item from separate lists 1000 times(which will be similar to deleting 1000 times from one very long list, complexity wise), but in case of list comprehension, in your test, each time a new list is created, while in actual/real scenario of deleting 1000 items from a very long list, a new list would be needed to be created just once. – amsquareb May 05 '21 at 19:37
  • it seems kinda counter-intuitive that for removal of non-existent elements, the `if` conditional performs better (albeit slightly) than `try`. As by logic, it seems that in conditional, the checking is done twice: once by the conditional, other by actual removal... whereas in trial, it's checked once, & removed directly if exists. But i guess that it's raising the exception which consumes the time. Is there some way to do both these in single step? Or is it optimised in some way under the hood in `if` as well? – user8395964 May 29 '23 at 09:44
  • also, this seems to be a real good example of how to use `timeit` module's functions like `timeit()` or `repeat()` properly and their args like `stmt, setup, number` ; thanks for it :) – user8395964 May 29 '23 at 11:17