205

Is it possible to delete multiple elements from a list at the same time? If I want to delete elements at index 0 and 2, and try something like del somelist[0], followed by del somelist[2], the second statement will actually delete somelist[3].

I suppose I could always delete the higher numbered elements first but I'm hoping there is a better way.

Løiten
  • 3,185
  • 4
  • 24
  • 36

32 Answers

221

For some reason I don't like any of the answers here. Yes, they work, but strictly speaking most of them aren't deleting elements in a list, are they? (But making a copy and then replacing the original one with the edited copy).

Why not just delete the higher index first?

Is there a reason for this? I would just do:

for i in sorted(indices, reverse=True):
    del somelist[i]

If you really don't want to delete items backwards, then I guess you should just decrement the index values that are greater than the last deleted index (you can't reuse the same index, since the list has changed) or use a copy of the list (which wouldn't be 'deleting' but replacing the original with an edited copy).

Am I missing something here, any reason to NOT delete in the reverse order?
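
A small sketch of why the order matters, using made-up sample data (the lists `forward` and `backward` are illustrative and not from the question). Deleting in ascending order shifts the later positions, while descending order leaves the remaining targets where they were:

```python
# Illustrative data: each value names its original index.
indices = [0, 2]

# Ascending order: after the first del, everything shifts left by one,
# so index 2 now refers to what was originally at index 3.
forward = ["i0", "i1", "i2", "i3", "i4"]
for i in sorted(indices):
    del forward[i]
print(forward)   # ['i1', 'i2', 'i4']  (original index 3 was removed, not 2)

# Descending order: deleting index 2 first does not move index 0.
backward = ["i0", "i1", "i2", "i3", "i4"]
for i in sorted(indices, reverse=True):
    del backward[i]
print(backward)  # ['i1', 'i3', 'i4']
```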

tglaria
  • 5,678
  • 2
  • 13
  • 17
  • 10
    There are two reasons. (a) For a list, the time complexity would be higher than the "make a copy" method (using a set of indices) on average (assuming random indices) because some elements need to be shifted forward multiple times. (b) At least for me, it's difficult to read, because there is a sort function that doesn't correspond to any actual program logic, and exists solely for technical reasons. Even though by now I already understand the logic thoroughly, I still _feel_ it would be difficult to read. – Imperishable Night Oct 27 '18 at 03:20
  • 1
    @ImperishableNight could you elaborate on (a)? I don't understand the "some elements need to be shifted". For (b) you could just define a function if you need reading clarity. – tglaria Oct 29 '18 at 16:00
  • worth mentioning that you can also use `somelist.pop(i)` instead of `del` if you want to do something with the item you are removing. – user5359531 Sep 14 '20 at 20:07
  • @tglaria Lists occupy a contiguous amount of memory; therefore if you remove any element other than the last one, the elements "to the right" must be shifted "left". (I'm using 'left' as the start of the list, and 'right' as the end of the list here) – luizfls Apr 09 '21 at 22:42
  • @luizfls that's interesting. I wonder if that takes any resources. Wouldn't be easier to relocate the address to the next element of the list instead of shifting the elements of the list? – tglaria Jun 15 '21 at 19:39
  • 1
    @tglaria That is possible, and that is how [linked lists](https://en.wikipedia.org/wiki/Linked_list) are implemented. You lose the contiguity (single block) aspect of the "Python list", which gives you random access, i.e., the ability to access any element in constant time. In linked lists, the elements are scattered in memory and you store addresses to point to the next element. On the other hand, as you suggested, it is not necessary to shift elements after deletion. So it's a matter of tradeoff. – luizfls Jun 16 '21 at 00:24
141

You can use enumerate and remove the values whose index matches the indices you want to remove:

indices = 0, 2
somelist = [i for j, i in enumerate(somelist) if j not in indices]
Nicolas Gervais
  • 33,817
  • 13
  • 115
  • 143
SilentGhost
  • 307,395
  • 66
  • 306
  • 293
  • 3
    Almost, only if you delete the entire list. it'll be len(indices) * len(somelist). It also creates a copy, which may or may not be desired – Richard Levasseur Jan 31 '09 at 02:26
  • if you're checking for a value in a list, it is. the 'in' operator works on the values of a list, whereas it works on the keys of a dict. If i'm wrong, please point me to the pep/reference – Richard Levasseur Jan 31 '09 at 02:42
  • 5
    the reason that i've chosen tuple for indices was only simplicity of the record. it would be a perfect job for set() giving O(n) – SilentGhost Jan 31 '09 at 03:57
  • 35
    This is not deleting items from somelist at all, but rather creating a brand new list. If anything is holding a reference to the original list, it will still have all the items in it. – Tom Future Jul 31 '11 at 16:09
  • @SilentGhost I originally avoided using this solution, because "j not in indices" sounded slow, when many indices. As you acknowledge "set()" would fix that. Surprised you didn't edit your post. So, the performance fix is to replace `in indices` with `in set(indices)`? – ToolmakerSteve Dec 14 '13 at 22:14
  • @TomFuture. True. It is good practice, when creating a method to perform a task, to create a new object, rather than modifying the original. Leave it to the user to modify the original if they need to. Do so with `[:]`. E.g. `somelist[:] = ...` will replace the elements of the object pointed to by somelist, affecting all references to that object. – ToolmakerSteve Dec 14 '13 at 22:19
  • 2
    @SilentGhost Not necessary to make an enumeration. How about this: `somelist = [ lst[i] for i in xrange(len(lst)) if i not in set(indices) ]`? – ToolmakerSteve Dec 14 '13 at 22:33
  • (I've added an answer, far below, that shows this in action.) – ToolmakerSteve Dec 14 '13 at 22:46
  • 3
    Changed my mind. By giving the results of the enumerate more descriptive names, this approach is easy to read. It also helps me if parens are added. That is: `[ value for (i, value) in enumerate(lst) if i not in set(indices) ]`. – ToolmakerSteve Dec 15 '13 at 00:02
  • Brilliant! this method really help me – Aditya Kresna Permana Sep 20 '15 at 06:09
  • I like better having it indices = {0, 2} to make indices a set from the get-go :) – amohr Feb 07 '17 at 23:34
  • As others have said, this is NOT deleting anything from a list – it's creating a new list with those elements excluded. For some applications both ways are fine, but the OP specifically mentions deleting and del. – Arru Feb 22 '19 at 14:48
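
To illustrate the point raised in these comments about references, here is a rough sketch of the difference between rebinding a name and assigning to the full slice. The names `alias` and `rebound` are made up for the example:

```python
somelist = ["a", "b", "c", "d"]
alias = somelist            # a second reference to the same list object
indices = {0, 2}

# Rebinding: builds a new list; `alias` still sees the old contents.
rebound = [v for i, v in enumerate(somelist) if i not in indices]
print(alias)                # ['a', 'b', 'c', 'd']

# Slice assignment: replaces the contents of the existing object in place.
somelist[:] = [v for i, v in enumerate(somelist) if i not in indices]
print(alias)                # ['b', 'd']
print(alias is somelist)    # True
```

Which of the two is wanted depends on whether other code holds a reference to the original list.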
130

If you're deleting multiple non-adjacent items, then what you describe is the best way (and yes, be sure to start from the highest index).

If your items are adjacent, you can use the slice assignment syntax:

a[2:10] = []
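
As a quick check (with sample lists invented for the illustration), assigning an empty list to a slice and using `del` on the same slice should leave the list in the same state:

```python
a = list(range(12))
b = list(range(12))

a[2:10] = []     # slice assignment with an empty list
del b[2:10]      # del on the same slice

print(a)         # [0, 1, 10, 11]
print(a == b)    # True
```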
Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
36

You can use numpy.delete as follows:

import numpy as np
a = ['a', 'l', 3.14, 42, 'u']
I = [0, 2]
np.delete(a, I).tolist()
# Returns: ['l', '42', 'u']

If you don't mind ending up with a numpy array at the end, you can leave out the .tolist(). You should see some pretty major speed improvements, too, making this a more scalable solution. I haven't benchmarked it, but numpy operations are compiled code written in either C or Fortran.

philE
  • 1,655
  • 15
  • 20
18

As a specialisation of Greg's answer, you can even use extended slice syntax. eg. If you wanted to delete items 0 and 2:

>>> a= [0, 1, 2, 3, 4]
>>> del a[0:3:2]
>>> a
[1, 3, 4]

This doesn't cover any arbitrary selection, of course, but it can certainly work for deleting any two items.
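
A short sketch of how far this stretches, with sample values picked for the example: any evenly spaced run of indices fits one extended slice, and any pair `i < j` can be written as `i:j+1:j-i`.

```python
a = list(range(10))
del a[1:8:3]             # removes indices 1, 4 and 7
print(a)                 # [0, 2, 3, 5, 6, 8, 9]

b = list(range(10))
i, j = 2, 7              # any two positions with i < j
del b[i:j + 1:j - i]     # the slice covers exactly i and j
print(b)                 # [0, 1, 3, 4, 5, 6, 8, 9]
```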

bobince
  • 528,062
  • 107
  • 651
  • 834
16

As a function:

def multi_delete(list_, *args):
    indexes = sorted(list(args), reverse=True)
    for index in indexes:
        del list_[index]
    return list_

Runs in n log(n) time, which should make it the fastest correct solution yet.

Nikhil
  • 5,705
  • 1
  • 32
  • 30
  • 1
    The version with args.sort().reverse() is definitely better. It also happens to work with dicts instead of throwing or, worse, silently corrupting. –  Jan 30 '09 at 22:45
  • sort() is not defined for tuple, you'd have to convert to list first. sort() returns None, so you couldn't use reverse() on it. – SilentGhost Jan 31 '09 at 01:38
  • @ R. Pate: I removed the first version for that reason. Thanks. @ SilentGhost: Fixed it. – Nikhil Jan 31 '09 at 01:47
  • @Nikhil: no you didn't ;) args = list(args) args.sort() args.reverse() but better option would be: args = sorted(args, reverse=True) – SilentGhost Jan 31 '09 at 01:53
  • Oh, I see what you meant - you can't chain the methods, which seems silly, even for immutable objects. I don't like changing a variable's type as in "args = list(args)" . I know it's done, but it can be confusing. – Nikhil Jan 31 '09 at 02:28
  • The list argument overrides the builtin list used inside the method. – iny Jan 31 '09 at 13:50
  • Yeah, I'd give this solution a +1 if it actually worked as written (c.f. what @iny said). – Carl Meyer Jan 31 '09 at 15:59
  • why did you do `list(args)`, couldn't you use `args` as is? – João Portela Oct 11 '10 at 21:55
  • @JoãoPortela is correct. After the first edit, which put `list(args)` inside of `sorted()`, the `list()` wrapper is no longer necessary, as per SilentGhost's responding comment. Tested & verified with tuple input. Have submitted as an edit. – ToolmakerSteve Dec 14 '13 at 20:33
  • 5
    `n log n`? Really? I don't think `del list[index]` is O(1). – user202729 Jun 15 '18 at 02:56
  • @user202729 del` statement itself is actually O(1), and the cleanup of the old array is delayed. So, the algorithmic cost may still be O(n). https://stackoverflow.com/questions/44148504/python-big-o-of-del-my-list – ARAT Sep 03 '22 at 15:48
  • @ARAT That answer contradict what you says. It has "it could be optimized", but it's not actually optimized yet. Besides, that's a different case from this one, here only one element is deleted all the following elements must be shifted to the left. – user202729 Sep 04 '22 at 16:22
  • My mistake, sorry! I was just referring that `del` statement itself is actually O(1), Other than that, you are totally right. As you said, since an item is removed, all the following elements are shifted, that makes `del list[index]` take O(n) time. – ARAT Sep 04 '22 at 17:51
12

So, you essentially want to delete multiple elements in one pass? In that case, the position of the next element to delete will be offset by however many were deleted previously.

Our goal is to delete all the vowels, which are precomputed to be indices 1, 4, and 7. Note that it's important that the to_delete indices are in ascending order; otherwise it won't work.

to_delete = [1, 4, 7]
target = list("hello world")
for offset, index in enumerate(to_delete):
  index -= offset
  del target[index]

It'd be more complicated if you wanted to delete the elements in any order. IMO, sorting to_delete might be easier than figuring out when you should or shouldn't subtract from index.
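
One way to sketch the "sort first" variant is shown below. The helper name `delete_with_offsets` is made up for this illustration, and it assumes non-negative indices:

```python
def delete_with_offsets(target, to_delete):
    """Delete the given positions from `target` in place, front to back."""
    # sorted(set(...)) yields ascending, duplicate-free indices.
    for offset, index in enumerate(sorted(set(to_delete))):
        # Each earlier deletion has shifted this element left by one.
        del target[index - offset]


target = list("hello world")
delete_with_offsets(target, [7, 1, 4])   # same vowels, given out of order
print("".join(target))                   # hll wrld
```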

Richard Levasseur
  • 14,562
  • 6
  • 50
  • 63
9

I'm a total beginner in Python, and my programming at the moment is crude and dirty to say the least, but my solution was to use a combination of the basic commands I learnt in early tutorials:

some_list = [1,2,3,4,5,6,7,8,10]
rem = [0,5,7]

for i in rem:
    some_list[i] = '!' # mark for deletion

for i in range(0, some_list.count('!')):
    some_list.remove('!') # remove
print some_list

Obviously, because of having to choose a "mark-for-deletion" character, this has its limitations.

As for the performance as the size of the list scales, I'm sure that my solution is sub-optimal. However, it's straightforward, which I hope appeals to other beginners, and will work in simple cases where some_list is of a well-known format, e.g., always numeric...

Gulzar
  • 23,452
  • 27
  • 113
  • 201
Paul
  • 123
  • 1
  • 1
  • 4
    instead of using '!' as your special character, use None. This keeps every character valid and frees up your possiblities – benathon Nov 14 '15 at 00:22
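
A possible refinement of the marker idea, offered as a sketch rather than as part of the original answer: a fresh `object()` cannot collide with any real value, including `None` or `'!'`. The name `_DELETED` is hypothetical.

```python
some_list = [1, 2, None, 4, "!", 6, 7, 8, 10]
rem = [0, 5, 7]

_DELETED = object()          # hypothetical marker; identical only to itself

for i in rem:
    some_list[i] = _DELETED  # mark for deletion

# Keep everything that is not the marker (identity test, not equality).
some_list[:] = [item for item in some_list if item is not _DELETED]
print(some_list)             # [2, None, 4, '!', 7, 10]
```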
6

Here is an alternative, that does not use enumerate() to create tuples (as in SilentGhost's original answer).

This seems more readable to me. (Maybe I'd feel differently if I was in the habit of using enumerate.) CAVEAT: I have not tested performance of the two approaches.

# Returns a new list. "lst" is not modified.
def delete_by_indices(lst, indices):
    indices_as_set = set(indices)
    return [ lst[i] for i in xrange(len(lst)) if i not in indices_as_set ]

NOTE: Python 2.7 syntax. For Python 3, xrange => range.

Usage:

lst = [ 11*x for x in xrange(10) ]
somelist = delete_by_indices( lst, [0, 4, 5])

somelist:

[11, 22, 33, 66, 77, 88, 99]

--- BONUS ---

Delete multiple values from a list. That is, we have the values we want to delete:

# Returns a new list. "lst" is not modified.
def delete__by_values(lst, values):
    values_as_set = set(values)
    return [ x for x in lst if x not in values_as_set ]

Usage:

somelist = delete__by_values( lst, [0, 44, 55] )

somelist:

[11, 22, 33, 66, 77, 88, 99]

This is the same answer as before, but this time we supplied the VALUES to be deleted [0, 44, 55].

ToolmakerSteve
  • 18,547
  • 14
  • 94
  • 196
  • I've decided @SilentGhost's was only difficult to read, because of the non-descriptive variable names used for the result of the enumerate. Also, parens would have made it easier to read. So here is how I would word his solution (with "set" added, for performance): `[ value for (i, value) in enumerate(lst) if i not in set(indices) ]`. But I'll leave my answer here, because I also show how to delete by values. Which is an easier case, but might help someone. – ToolmakerSteve Dec 15 '13 at 00:05
  • @Veedrac- thank you; I've re-written to build the set first. What do you think - faster solution now than SilentGhost's? (I don't consider it important enough to actually time it, just asking your opinion.) Similarly, I would re-write SilentGhost's version as `indices_as_set = set(indices)`, `[ value for (i, value) in enumerate(lst) if i not in indices_as_set ]`, to speed it up. – ToolmakerSteve Nov 07 '14 at 01:42
  • 1
    Is there a style reason for the double underscore in `delete__by_values()`? – Tom May 25 '15 at 19:40
5

An alternative list comprehension method that uses list index values:

stuff = ['a', 'b', 'c', 'd', 'e', 'f', 'woof']
index = [0, 3, 6]
new = [i for i in stuff if stuff.index(i) not in index]

This returns:

['b', 'c', 'e', 'f']
Meow
  • 1,207
  • 15
  • 23
4

This has been mentioned, but somehow nobody managed to actually get it right.

An O(n) solution would be:

indices = {0, 2}
somelist = [i for j, i in enumerate(somelist) if j not in indices]

This is really close to SilentGhost's version, but adds two braces.

Community
  • 1
  • 1
Veedrac
  • 58,273
  • 15
  • 112
  • 169
4
l = ['a','b','a','c','a','d']
to_remove = [1, 3]
[l[i] for i in range(0, len(l)) if i not in to_remove]

It's basically the same as the top voted answer, just a different way of writing it. Note that using l.index() is not a good idea, because it can't handle duplicated elements in a list.
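
A small demonstration of the duplicate problem, using the same sample list; the variable names `by_value_lookup` and `by_position` are made up for the example:

```python
l = ['a', 'b', 'a', 'c', 'a', 'd']
to_remove = [0]              # intent: drop only the first 'a'

# index() always returns the first match, so every 'a' maps to position 0.
by_value_lookup = [x for x in l if l.index(x) not in to_remove]
print(by_value_lookup)       # ['b', 'c', 'd']

# Comparing positions directly keeps the later duplicates.
by_position = [x for i, x in enumerate(l) if i not in to_remove]
print(by_position)           # ['b', 'a', 'c', 'a', 'd']
```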

zinc
  • 139
  • 1
  • 3
4

Here is another method, which removes the elements in place. It is also faster if your list is really long.

>>> a = range(10)
>>> remove = [0,4,5]
>>> import timeit
>>> from collections import deque
>>> deque((list.pop(a, i) for i in sorted(remove, reverse=True)), maxlen=0)

>>> timeit.timeit('[i for j, i in enumerate(a) if j not in remove]', setup='import random;remove=[random.randrange(100000) for i in range(100)]; a = range(100000)', number=1)
0.1704120635986328

>>> timeit.timeit('deque((list.pop(a, i) for i in sorted(remove, reverse=True)), maxlen=0)', setup='from collections import deque;import random;remove=[random.randrange(100000) for i in range(100)]; a = range(100000)', number=1)
0.004853963851928711
user545424
  • 15,713
  • 11
  • 56
  • 70
  • +1: Interesting use of deque to perform a for action as part of an expression, rather than requiring a "for ..:" block. However, for this simple case, I find Nikhil's for block more readable. – ToolmakerSteve Dec 14 '13 at 20:43
2

The remove method causes a lot of shifting of list elements. I think it is better to make a copy:

...
new_list = []
for el in obj.my_list:
   if condition_is_true(el):
      new_list.append(el)
del obj.my_list
obj.my_list = new_list
...
luca
  • 21
  • 1
2

technically, the answer is NO it is not possible to delete two objects AT THE SAME TIME. However, it IS possible to delete two objects in one line of beautiful python.

del (foo['bar'],foo['baz'])

will recursively delete foo['bar'], then foo['baz']

  • This deletes from a dict object, not a list, but I'm still +1'ing it cause it's darn pretty! – Ulf Aslak Feb 12 '16 at 17:33
  • It applies to list as well, with appropriate syntax. However the claim is that it is not possible to delete two objects at the same time is false; see answer by @bobince – Pedro Gimeno Oct 25 '19 at 18:44
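
For lists, a sketch of the same one-line form with sample data chosen for the illustration. The targets are deleted from left to right, so the order in which they are written still matters:

```python
a = [0, 1, 2, 3, 4]
del a[2], a[0]     # index 2 is deleted first, then index 0
print(a)           # [1, 3, 4]

b = [0, 1, 2, 3, 4]
del b[0], b[2]     # after b[0] is gone, b[2] is the element that was at index 3
print(b)           # [1, 2, 4]
```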
2

We can do this with a for loop that iterates over the indexes after sorting the index list in descending order:

mylist=[66.25, 333, 1, 4, 6, 7, 8, 56, 8769, 65]
indexes = 4, 6
indexes = sorted(indexes, reverse=True)
for i in indexes:  # iterate over the sorted indexes (was the undefined name `index`)
    mylist.pop(i)
print(mylist)
Gourav Singla
  • 1,728
  • 1
  • 15
  • 22
2

For the indices 0 and 2 from listA:

for x in (2,0): listA.pop(x)

For some random indices to remove from listA:

indices=(5,3,2,7,0) 
for x in sorted(indices)[::-1]: listA.pop(x)
graviton
  • 85
  • 1
  • 3
2

I wanted a way to compare the different solutions that made it easy to turn the knobs.

First I generated my data:

import random

N = 16 * 1024
x = range(N)
random.shuffle(x)
y = random.sample(range(N), N / 10)

Then I defined my functions:

def list_set(value_list, index_list):
    index_list = set(index_list)
    result = [value for index, value in enumerate(value_list) if index not in index_list]
    return result

def list_del(value_list, index_list):
    for index in sorted(index_list, reverse=True):
        del(value_list[index])

def list_pop(value_list, index_list):
    for index in sorted(index_list, reverse=True):
        value_list.pop(index)

Then I used timeit to compare the solutions:

import timeit
from collections import OrderedDict

M = 1000
setup = 'from __main__ import x, y, list_set, list_del, list_pop'
statement_dict = OrderedDict([
    ('overhead',  'a = x[:]'),
    ('set', 'a = x[:]; list_set(a, y)'),
    ('del', 'a = x[:]; list_del(a, y)'),
    ('pop', 'a = x[:]; list_pop(a, y)'),
])

overhead = None
result_dict = OrderedDict()
for name, statement in statement_dict.iteritems():
    result = timeit.timeit(statement, number=M, setup=setup)
    if overhead is None:
        overhead = result
    else:
        result = result - overhead
        result_dict[name] = result

for name, result in result_dict.iteritems():
    print "%s = %7.3f" % (name, result)

Output

set =   1.711
del =   3.450
pop =   3.618

So the list comprehension with the indices in a set was the winner. And del is slightly faster than pop.

  • Thank you for this comparison, this led me to make my own tests (actually just borrowed your code) and for small number of items to remove, the overhead for creating a SET makes it the worst solution (use 10, 100, 500 for the length of 'y' and you'll see). As most of the times, this depends on the application. – tglaria Oct 10 '17 at 15:38
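
The benchmark above is written for Python 2. A rough Python 3 port is sketched below, under the assumption that comparing just the set-based and `del`-based functions with a smaller repeat count is enough to see the trend; the timings it prints depend on the machine and on the sizes chosen.

```python
import random
import timeit


def list_set(values, indices):
    index_set = set(indices)
    return [v for i, v in enumerate(values) if i not in index_set]


def list_del(values, indices):
    for i in sorted(indices, reverse=True):
        del values[i]
    return values


N = 16 * 1024
x = list(range(N))                     # range() is lazy in Python 3
random.shuffle(x)
y = random.sample(range(N), N // 10)   # integer division for the sample size

# Both approaches should agree on the result.
assert list_set(x, y) == list_del(x[:], y)

for name, fn in [("set", list_set), ("del", list_del)]:
    # x[:] gives each run a fresh copy, as in the original benchmark.
    seconds = timeit.timeit(lambda: fn(x[:], y), number=20)
    print(f"{name} = {seconds:7.3f}")
```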
2

You can use this logic:

my_list = ['word','yes','no','nice']

c = [b for i, b in enumerate(my_list) if i not in (0, 2, 3)]

print c
Neodan
  • 5,154
  • 2
  • 27
  • 38
raghu
  • 384
  • 7
  • 10
2

Another implementation of the idea of removing from the highest index.

for i in range(len(yourlist)-1, -1, -1):
    if i in indices:  # `indices` is an assumed name for the positions to remove
        del yourlist[i]
ipramusinto
  • 2,310
  • 2
  • 14
  • 24
2

You may want to simply use np.delete:

import numpy as np

list_indices = [0, 2]
original_list = [0, 1, 2, 3]
new_list = np.delete(original_list, list_indices)

Output

array([1, 3])

Here, the first argument is the original list, the second is the index or a list of indices you want to delete.

There is a third argument which you can use in the case of having ndarrays: axis (0 for rows and 1 for columns in case of ndarrays).

jvel07
  • 1,239
  • 11
  • 15
1

To generalize the comment from @sth: item deletion in any class that implements abc.MutableSequence, and in list in particular, is done via the __delitem__ magic method. This method works similarly to __getitem__, meaning it can accept either an integer or a slice. Here is an example:

class MyList(list):
    def __delitem__(self, item):
        if isinstance(item, slice):
            for i in range(*item.indices(len(self))):
                self[i] = 'null'
        else:
            self[item] = 'null'


l = MyList(range(10))
print(l)
del l[5:8]
print(l)

This will output

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 'null', 'null', 'null', 8, 9]
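
Building on the same hook, here is a sketch of a hypothetical subclass (the name `MultiDeleteList` is made up) that accepts a tuple of indices, so that `del l[0, 2]` removes both positions. It assumes non-negative indices in the tuple.

```python
class MultiDeleteList(list):
    """Hypothetical list subclass: `del lst[0, 2]` removes both positions."""

    def __delitem__(self, item):
        if isinstance(item, tuple):
            # Delete from the highest index down so earlier deletions
            # do not shift the positions still to be removed.
            for i in sorted(set(item), reverse=True):
                super().__delitem__(i)
        else:
            super().__delitem__(item)   # plain int or slice


l = MultiDeleteList(range(10))
del l[0, 2]
print(l)       # [1, 3, 4, 5, 6, 7, 8, 9]
del l[5:]
print(l)       # [1, 3, 4, 5, 6]
```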
Community
  • 1
  • 1
Alexander Zhukov
  • 4,357
  • 1
  • 20
  • 31
1

Importing it only for this reason might be overkill, but if you happen to be using pandas anyway, then the solution is simple and straightforward:

import pandas as pd
stuff = pd.Series(['a','b','a','c','a','d'])
less_stuff = stuff[stuff != 'a']  # define any condition here
# results ['b','c','d']
Lorinc Nyitrai
  • 968
  • 1
  • 10
  • 27
1

You can do that way on a dict, not on a list. In a list elements are in sequence. In a dict they depend only on the index.

Simple code just to explain it by doing:

>>> lst = ['a','b','c']
>>> dct = {0: 'a', 1: 'b', 2:'c'}
>>> lst[0]
'a'
>>> dct[0]
'a'
>>> del lst[0]
>>> del dct[0]
>>> lst[0]
'b'
>>> dct[0]
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    dct[0]
KeyError: 0
>>> dct[1]
'b'
>>> lst[1]
'c'

A way to "convert" a list into a dict is:

>>> dct = {}
>>> for i in xrange(0,len(lst)): dct[i] = lst[i]

The inverse is:

lst = [dct[i] for i in sorted(dct.keys())] 

Anyway I think it's better to start deleting from the higher index as you said.

Andrea Ambu
  • 38,188
  • 14
  • 54
  • 77
  • Does Python guarantee [dct[i] for i in dct] will always use increasing values of i? If so, list(dct.values()) is surely better. –  Jan 30 '09 at 22:41
  • I was not thinking about that. You're right. There is not guarantee as I read [here][1] that the items will be picked in order, or at least the expected order. I edited. [1]: http://docs.python.org/library/stdtypes.html#dict.items – Andrea Ambu Jan 31 '09 at 13:07
  • 2
    This answer talks about dictionaries in a fundamentally wrong way. A dictionary has KEYS (not INDICES). Yes, the key/value pairs are independent of each other. No, it doesn't matter what order you delete the entries in. Converting to a dictionary just to delete some elements from a list would be overkill. – ToolmakerSteve Dec 14 '13 at 20:15
1

I can actually think of two ways to do it:

  1. slice the list like (this deletes the 1st,3rd and 8th elements)

    somelist = somelist[1:2]+somelist[3:7]+somelist[8:]

  2. do that in place, but one at a time:

    somelist.pop(2)
    somelist.pop(0)

Bartosz Radaczyński
  • 18,396
  • 14
  • 54
  • 61
1
some_list.remove(some_list[max(i, j)])

Avoids sorting cost and having to explicitly copy list.

Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
Chester
  • 11
  • 1
1

None of the answers offered so far performs the deletion in place in O(n) on the length of the list for an arbitrary number of indices to delete, so here's my version:

def multi_delete(the_list, indices):
    assert type(indices) in {set, frozenset}, "indices must be a set or frozenset"
    offset = 0
    for i in range(len(the_list)):
        if i in indices:
            offset += 1
        elif offset:
            the_list[i - offset] = the_list[i]
    if offset:
        del the_list[-offset:]

# Example:
a = [0, 1, 2, 3, 4, 5, 6, 7]
multi_delete(a, {1, 2, 4, 6, 7})
print(a)  # prints [0, 3, 5]
Pedro Gimeno
  • 2,837
  • 1
  • 25
  • 33
1

I tested the suggested solutions with perfplot and found that NumPy's

np.delete(lst, remove_ids)

is the fastest solution if the list is longer than about 100 entries. Before that, all solutions are around 10^-5 seconds. The list comprehension seems simple enough then:

out = [item for i, item in enumerate(lst) if i not in remove_ids]

Code to reproduce the plot:

import perfplot
import random
import numpy as np
import copy


def setup(n):
    lst = list(range(n))
    random.shuffle(lst)
    # //10 = 10%
    remove_ids = random.sample(range(n), n // 10)
    return lst, remove_ids


def if_comprehension(lst, remove_ids):
    return [item for i, item in enumerate(lst) if i not in remove_ids]


def del_list_inplace(lst, remove_ids):
    out = copy.deepcopy(lst)
    for i in sorted(remove_ids, reverse=True):
        del out[i]
    return out


def del_list_numpy(lst, remove_ids):
    return np.delete(lst, remove_ids)


b = perfplot.bench(
    setup=setup,
    kernels=[if_comprehension, del_list_numpy, del_list_inplace],
    n_range=[2**k for k in range(20)],
)
b.save("out.png")
b.show()
Nico Schlömer
  • 53,797
  • 27
  • 201
  • 249
0

How about one of these (I'm very new to Python, but they seem ok):

ocean_basin = ['a', 'Atlantic', 'Pacific', 'Indian', 'a', 'a', 'a']
for i in range(1, (ocean_basin.count('a') + 1)):
    ocean_basin.remove('a')
print(ocean_basin)

['Atlantic', 'Pacific', 'Indian']

ob = ['a', 'b', 4, 5,'Atlantic', 'Pacific', 'Indian', 'a', 'a', 4, 'a']
remove = ('a', 'b', 4, 5)
ob = [i for i in ob if i not in (remove)]
print(ob)

['Atlantic', 'Pacific', 'Indian']

0

I put it all together into a list_diff function that simply takes two lists as inputs and returns their difference, while preserving the original order of the first list.

def list_diff(list_a, list_b, verbose=False):

    # returns a difference of list_a and list_b,
    # preserving the original order, unlike set-based solutions

    # get indices of elements to be excluded from list_a
    excl_ind = [i for i, x in enumerate(list_a) if x in list_b]
    if verbose:
        print(excl_ind)

    # filter out the excluded indices, producing a new list 
    new_list = [i for i in list_a if list_a.index(i) not in excl_ind]
    if verbose:
        print(new_list)

    return(new_list)

Sample usage:

my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'woof']
# index = [0, 3, 6]

# define excluded names list
excl_names_list = ['woof', 'c']

list_diff(my_list, excl_names_list)
>> ['a', 'b', 'd', 'e', 'f']
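
A shorter variant, sketched under the assumption that the items are hashable: a set is used only for the membership test, so the order of the first list is still preserved, and duplicates in it are handled by value rather than through `index()`. The name `list_diff_fast` is made up for the example.

```python
def list_diff_fast(list_a, list_b):
    """Return the items of list_a that are not in list_b, in original order."""
    excluded = set(list_b)          # assumes the items are hashable
    return [x for x in list_a if x not in excluded]


my_list = ['a', 'b', 'c', 'd', 'e', 'f', 'woof']
print(list_diff_fast(my_list, ['woof', 'c']))   # ['a', 'b', 'd', 'e', 'f']
```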
mirekphd
  • 4,799
  • 3
  • 38
  • 59
0

Use numpy.delete which is definitely faster (376 times, as shown later) than python lists.

First method (using numpy):

import numpy as np

arr = np.array([0,3,5,7])
# [0,3,5,7]
indexes = [0,3]
np.delete(arr, indexes)
# [3,5]

Second method (using a python list):

arr = [0,3,5,7]
# [0,3,5,7]
indexes = [0,3]
for index in sorted(indexes, reverse=True):
    del arr[index]
arr
# [3,5]

Code to benchmark the two methods on an array of 500000 elements, deleting half of the elements randomly:

import numpy as np
import random
import time

start = 0
stop = 500000
elements = np.arange(start,stop)
num_elements = len(elements)

temp = np.copy(elements)
temp2 = elements.tolist()

indexes = random.sample(range(0, num_elements), int(num_elements/2))

start_time = time.time()

temp = np.delete(temp, indexes)

end_time = time.time()
total_time = end_time - start_time
print("First method: ", total_time)

start_time = time.time()

for index in sorted(indexes, reverse=True):
    del temp2[index]

end_time = time.time()
total_time = end_time - start_time
print("Second method: ", total_time)

# First method:  0.04500985145568848
# Second method:  16.94180393218994

The first method is about 376 times faster than the second one.

Rosario Scavo
  • 466
  • 4
  • 7
-1

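You can use remove, too. Note that remove deletes the first matching value, so this only works reliably when the values are unique.

delete_from_somelist = []
for i in [0, 2]:
    delete_from_somelist.append(somelist[i])
for j in delete_from_somelist:
    somelist.remove(j)  # remove() works in place and returns None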