I have a main list such as:

mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

and I want to check each item in mainlst against several other search lists and remove it from the main list if it is present in any of them. For example:

searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']

The issue I'm having is that I can't work out how to make the loop go through each statement: if I use an if/elif statement, it exits as soon as it has found a match.

for item in mainlst:
    if item in searchlst1:
        mainlst.remove(item)
    elif item in searchlst2:
        mainlst.remove(item)
    elif item in searchlst3:
        mainlst.remove(item)

But obviously this exits the loop as soon as one condition is true. How do I make the loop go through all the conditions?

mdml
PaulBarr

3 Answers

2

set objects are great for stuff like this -- the in operator takes O(1) time on average, compared to O(N) time for a list -- and it's easy to create a set from a bunch of existing lists using set.union:

search_set = set().union(searchlst1, searchlst2, searchlst3)
mainlst = [x for x in mainlst if x not in search_set]

Example:

>>> search_set = set().union(searchlst1, searchlst2, searchlst3)
>>> search_set
set(['a', 'c', 'b', 'e', 'd', 'g', 'f'])
>>> mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> mainlst = [x for x in mainlst if x not in search_set]
>>> mainlst
['h']
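
The same idea extends to any number of search lists. Here is a minimal sketch, assuming a hypothetical helper named `remove_matches` (the name is made up for illustration) and assuming all items are hashable so they can go into a set:

```python
def remove_matches(main, *search_lists):
    """Return a new list without the items found in any of the search lists."""
    # set().union accepts any number of iterables, so one call merges them all.
    search_set = set().union(*search_lists)
    return [x for x in main if x not in search_set]


mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']

result = remove_matches(mainlst, searchlst1, searchlst2, searchlst3)
print(result)  # ['h']
```

The order of mainlst is preserved because the comprehension walks it from front to back; only the lookups go through the set.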
mgilson
  • Is there any advantage to using `set().union` as opposed to just constructing the set from the lists? – Tom Fenech May 01 '14 at 16:24
  • @TomFenech -- `lst1 + lst2 + lst3` involves first concatenating `lst1` and `lst2` and then concatenating that result with `lst3` -- It's a lot more copying of data from one temporary list to another. The way I did it only constructs a single (empty) intermediate set so there's a lot less waste. Practically, it really depends on the size of the lists, if this is in a tight loop, etc. – mgilson May 01 '14 at 19:41
1

How about using a list comprehension and a set:

search = set(searchlst1 + searchlst2 + searchlst3)  # build the set once, not once per element
[i for i in mainlst if i not in search]

returns ['h']

set() takes an iterable (in this case the concatenation of the three lists) and returns a set containing the unique values. Membership tests on a set take roughly constant time on average, whereas membership tests on a list scale linearly with the length of the list.

The list comprehension goes through each element of mainlst and constructs a new list whose members are not in the set:

>>> mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> search = set(searchlst1 + searchlst2 + searchlst3)
>>> search
set(['a', 'c', 'b', 'e', 'd', 'g', 'f'])
>>> [i for i in mainlst if i not in search]
['h']
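
Note that the comprehension builds a new list rather than changing mainlst itself. If other names refer to the same list and should see the change, one option is slice assignment. This is a rough sketch; the `alias` variable is only there to illustrate the point:

```python
mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']

alias = mainlst  # a second reference to the same list object

search = set(searchlst1 + searchlst2 + searchlst3)

# Slice assignment replaces the contents of the existing list in place,
# so every reference to it sees the filtered result.
mainlst[:] = [i for i in mainlst if i not in search]

print(mainlst)           # ['h']
print(alias is mainlst)  # True
```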
Tom Fenech
0

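Replacing the elif statements with separate if statements makes every condition run, but an item found in two search lists would then be removed twice, and the second remove raises ValueError. Removing from mainlst while looping over it also skips elements. Checking all three lists in one condition and iterating over a copy will fix your problem.

for item in mainlst[:]:  # iterate over a copy so removals don't skip items
    if item in searchlst1 or item in searchlst2 or item in searchlst3:
        mainlst.remove(item)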

The problem now is that you're doing up to three searches through the search lists for every item. This becomes more time-consuming as the main list or the search lists grow, and in your example some values appear in more than one search list.

Combining the search lists will reduce the number of comparisons.
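
To see why removing from a list while looping over it is fragile, the sketch below compares a loop that mutates the list it iterates over with one that iterates over a snapshot. The helper names `remove_while_iterating` and `remove_from_snapshot` are made up for illustration, and both work on a copy so the input is left alone:

```python
searchlst1 = ['a', 'b', 'c']
searchlst2 = ['a', 'd', 'f']
searchlst3 = ['e', 'f', 'g']
search_lists = [searchlst1, searchlst2, searchlst3]


def remove_while_iterating(items):
    """Hypothetical helper: removes from the same list it is looping over."""
    items = list(items)
    for item in items:
        if any(item in lst for lst in search_lists):
            # Removing shifts later elements left, so the loop's next
            # position jumps over the element that moved into this slot.
            items.remove(item)
    return items


def remove_from_snapshot(items):
    """Hypothetical helper: loops over a copy, removes from the original."""
    items = list(items)
    for item in items[:]:
        if any(item in lst for lst in search_lists):
            items.remove(item)
    return items


mainlst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
skipped = remove_while_iterating(mainlst)
fixed = remove_from_snapshot(mainlst)
print(skipped)  # ['b', 'd', 'f', 'h']
print(fixed)    # ['h']
```

Each remove call is itself a linear scan, so for large lists a set-based filter is likely to be faster than either loop.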

mdml
bob0the0mighty