1

I am looking to remove the contents of one array from another.

array_2 = ['one' , "two" , "one", "three", "four"]
array_1 = ['one', "two"]

My first thought was to use a list comprehension:

array_3 = [x for x in array_2 if x not in array_1]

However, this removes every occurrence of the matching items, so the result is ['three', 'four'].

I want to remove "one" only once, as I am looking for a list subtraction. The result should be ['one', 'three', 'four'].

What is a good Pythonic way to achieve this?

sentence
Steve
  • Possible duplicate of [Get difference between two lists](https://stackoverflow.com/questions/3462143/get-difference-between-two-lists) – RoadRunner May 01 '19 at 09:18
  • No, the answers provided in that thread would remove all instances of 'one'. I only want to delete one instance at a time, as noted above. – Steve May 01 '19 at 09:20
  • Yes, they do. One of the answers, using `collections.Counter`, does exactly that, and it has already been repeated multiple times in the answers below. The answer: [stackoverflow.com/questions/3462143/get-difference-between-two-lists/42081195#42081195](https://stackoverflow.com/questions/3462143/get-difference-between-two-lists/42081195#42081195). – RoadRunner May 01 '19 at 09:21
  • Apologies, that answer was buried deep amongst the red herrings. – Steve May 01 '19 at 09:24
  • Does this answer your question? [Python removing overlap of lists](https://stackoverflow.com/questions/51717984/python-removing-overlap-of-lists) – Georgy Jun 22 '20 at 13:45

5 Answers

4

Try Counter from collections:

from collections import Counter

array_2 = ['one' , "two" , "one", "three", "four"]
array_1 = ['one', "two"]

list((Counter(array_2) - Counter(array_1)).elements())

output

['one', 'three', 'four']
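
As a minimal sketch of why this works: subtracting two Counter objects subtracts the counts key by key and keeps only the keys whose count stays positive. The extra item 'five' below is a hypothetical addition, not part of the question, included to show that values missing from array_2 are simply ignored.

```python
from collections import Counter

array_2 = ['one', 'two', 'one', 'three', 'four']
array_1 = ['one', 'two', 'five']  # 'five' is a hypothetical extra item, absent from array_2

# Counts are subtracted per key; keys whose count falls to zero or below are dropped,
# so 'two' disappears and the unmatched 'five' never shows up as a negative count.
difference = Counter(array_2) - Counter(array_1)
print(difference)

# elements() repeats each key as many times as its remaining count.
result = list(difference.elements())
print(result)
```

Because no error is raised for 'five', this behaves like the try/except around remove rather than like a strict subtraction.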
sentence
  • This one probably works better than using `remove`, as it only needs to loop over both lists once. For big lists, that would make a big difference. However, it does not actually maintain the original order of the elements in `array_2`. – 1313e May 01 '19 at 09:47
1

You could use the remove method of list:

array_2 = ['one' , "two" , "one", "three", "four"]
array_1 = ['one', "two"]

# copy list
array_3 = array_2[:]

for element in array_1:
    try:
        array_3.remove(element)
    except ValueError:
        pass
print(array_3)
# ['one', 'three', 'four']
Sparky05
  • That's actually how I went about solving it. Why would there be a keyword error 28, do you know? – Steve May 01 '19 at 09:21
  • You may want to check for a `ValueError` instead of a `KeyError`, as that is what is raised if the value is not in the list. – 1313e May 01 '19 at 09:44
0

The Counter object feels perfect for this.

In [1]: from collections import Counter

In [2]: array_2 = ['one' , "two" , "one", "three", "four"]

In [3]: array_1 = ['one', "two"]

In [4]: a2 = Counter(array_2)

In [5]: a1 = Counter(array_1)

In [6]: a2 - a1
Out[6]: Counter({'one': 1, 'three': 1, 'four': 1})

If you want a list, you can flatten the Counter using:

In [7]: list((a2-a1).elements())
Out[7]: ['one', 'three', 'four']
ZaydH
0

Using the map function in combination with a lambda would solve your task. In Python 3, map is lazy, so the result has to be consumed (here with list) for the removals to actually happen:

list(map(lambda x: array_2.remove(x) if x in array_2 else None, array_1))

This changes array_2 in place, and the result is:

print(array_2)
# ['one', 'three', 'four']
Manu mathew
0

I am just going to collect the excellent solutions already given above.

If you care about keeping the elements of array_2 in their original order, then I think you have to use remove:

array_1 = ['one', 'two']
array_2 = ['one', 'two', 'one', 'three', 'four']
array_3 = list(array_2)
for x in array_1:
    try:
        array_3.remove(x)
    except ValueError:
        pass
print(array_3)

If the final order of the elements does not matter, then using Counter is much more efficient for large lists, as it loops over each list only once:

from collections import Counter

array_1 = ['one', 'two']
array_2 = ['one', 'two', 'one', 'three', 'four']
array_3 = list((Counter(array_2) - Counter(array_1)).elements())
print(array_3)
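
The two approaches above can also be combined. The following is only a sketch, not something taken from the answers: a hypothetical helper, here called subtract_once, counts what has to be removed and then walks array_2 once, so the original order is kept without calling remove repeatedly. Like remove, it drops the first matching occurrences.

```python
from collections import Counter

def subtract_once(items, to_remove):
    """Return items minus to_remove, dropping each value at most as often as it occurs in to_remove."""
    remaining = Counter(to_remove)  # copies of each value that still have to be dropped
    result = []
    for item in items:
        if remaining[item] > 0:
            remaining[item] -= 1  # skip this occurrence and note that it was removed
        else:
            result.append(item)   # nothing left to remove for this value, keep it
    return result

array_1 = ['one', 'two']
array_2 = ['one', 'two', 'one', 'three', 'four']
array_3 = subtract_once(array_2, array_1)
print(array_3)  # ['one', 'three', 'four']
```

Looking up a missing key in a Counter returns 0 without raising, which is why no try/except is needed here.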
1313e
  • For what it is worth, I found that the answer from @Sparky05 proved to be the faster of the two on my dataset. – Steve May 01 '19 at 10:02
  • @Steve Yes, for your example, using `remove` is faster, as initializing a `Counter` object can take a little while. However, if you were using two lists that have sizes over, let's say, 10k, then using `Counter` is much, much faster. – 1313e May 01 '19 at 10:04
  • @1313e Edit the code in the first block: `array_3 = list(array_2)` and `for x in array_1:` – sentence May 01 '19 at 10:18
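
To check the speed claims from the comments on your own data, a rough timing sketch along these lines may help. The list sizes, the random data, and the function names with_remove and with_counter are assumptions chosen for illustration; actual timings depend on the machine and the data, so none are quoted here.

```python
import random
import timeit
from collections import Counter

def with_remove(items, to_remove):
    # Copy the list, then remove one occurrence per item, ignoring missing values.
    result = list(items)
    for x in to_remove:
        try:
            result.remove(x)
        except ValueError:
            pass
    return result

def with_counter(items, to_remove):
    # Multiset difference via Counter subtraction.
    return list((Counter(items) - Counter(to_remove)).elements())

random.seed(0)  # fixed seed so repeated runs use the same data
big = [random.randrange(5_000) for _ in range(10_000)]  # assumed size: 10k items
small = random.sample(big, 2_000)                       # assumed size: 2k items to remove

for func in (with_remove, with_counter):
    seconds = timeit.timeit(lambda: func(big, small), number=3)
    print(f"{func.__name__}: {seconds:.4f} s for 3 runs")
```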