2

I'm trying to find the difference between two NumPy arrays:

import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

The result I want is:

difference = ['A4', 'A5', 'A6']

How can I do this? Thank you.

  • 3
    If you do not need to consider duplicates you can use `set`: `difference = set(arrayB) - set(arrayA)` – Jeremy Apr 25 '22 at 13:36
  • What about `difference = [x for x in arrayB if x not in arrayA]`? – Giovanni Tardini Apr 25 '22 at 13:36
  • 1
    Does this answer your question? [Python find elements in one list that are not in the other](https://stackoverflow.com/questions/41125909/python-find-elements-in-one-list-that-are-not-in-the-other) – Abhyuday Vaish Apr 25 '22 at 13:39

5 Answers

5

Use NumPy's setdiff1d. It returns the values of its first argument that are not in the second, so arrayB goes first:

np.setdiff1d(arrayB, arrayA)

Also, is there a particular reason this needs to be a NumPy array? You could simply use sets and the minus operator: set(arrayB) - set(arrayA)
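Since the argument order decides which side of the difference you get, here is a minimal sketch of both directions, assuming the arrays from the question:

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

# Elements of arrayB that are not in arrayA (what the question asks for).
b_minus_a = np.setdiff1d(arrayB, arrayA)

# Reversed arguments: every element of arrayA also occurs in arrayB,
# so nothing is left over.
a_minus_b = np.setdiff1d(arrayA, arrayB)

print(b_minus_a.tolist())  # ['A4', 'A5', 'A6']
print(a_minus_b.tolist())  # []
```

The same applies to the set version: set(arrayB) - set(arrayA) gives the extra elements, while the reverse gives an empty set.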

Gilad Green
  • 4
    Great answer! But do note that `np.setdiff1d(arrayA, arrayB)` returns an empty array. The arrays need to be swapped like so: `np.setdiff1d(arrayB, arrayA)`. – Red Apr 25 '22 at 13:54
1
[i for i in arrayB if i not in arrayA]
WSUN000
1

You can use Python's set operations for this:

import numpy as np
a = np.array(['A1', 'A2', 'A3'])
b = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])
print(set(b)-set(a))

Output:

{'A6', 'A5', 'A4'}

Or use a list comprehension, which keeps the original order:

import numpy as np
a = np.array(['A1', 'A2', 'A3'])
b = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])
print([i for i in b if i not in a])

Output:

['A4', 'A5', 'A6']
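The two approaches differ once the input has repeated values or a meaningful order. The following is only a sketch: the array b with a repeated 'A4' and the helper name exclude are made up for illustration and are not from the question.

```python
import numpy as np

a = np.array(['A1', 'A2', 'A3'])
# Hypothetical input: shuffled order and a repeated 'A4'.
b = np.array(['A1', 'A4', 'A2', 'A4', 'A6', 'A5'])

# Set difference: duplicates collapse and the order is not defined.
as_set = set(b.tolist()) - set(a.tolist())

# Comprehension with a set for the membership test: order and duplicates
# are kept, and each lookup is a hash lookup instead of a scan of the array.
exclude = set(a.tolist())
in_order = [x for x in b.tolist() if x not in exclude]

print(sorted(as_set))  # ['A4', 'A5', 'A6']
print(in_order)        # ['A4', 'A4', 'A6', 'A5']
```

Converting with .tolist() first also means the results hold plain Python strings rather than NumPy string scalars.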
Anshumaan Mishra
1

As pointed out in this great answer, you can use the np.setdiff1d() function:

import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

print(np.setdiff1d(arrayB, arrayA))

Output:

['A4' 'A5' 'A6']

Note that the original order of the elements is not preserved: np.setdiff1d() returns the unique values, sorted in ascending order. Observe:

import numpy as np


arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5']) # Swapped 5 and 6

print(np.setdiff1d(arrayB, arrayA))

Output:

['A4' 'A5' 'A6']

If you want to keep the original order, you can build a boolean mask with np.in1d() and index arrayB with its negation:

import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5']) # Swapped 5 and 6

print(arrayB[~np.in1d(arrayB, arrayA)])

Output:

['A4' 'A6' 'A5']
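On recent NumPy versions, np.in1d() is deprecated (as of NumPy 2.0) in favor of np.isin(), which also accepts invert=True so the mask does not need to be negated separately. A minimal sketch of the same idea; the repeated 'A4' at the end of arrayB is a made-up addition to show how duplicates are handled:

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
# Hypothetical input: 5 and 6 swapped, plus a repeated 'A4'.
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A6', 'A5', 'A4'])

# Boolean mask that is True where an element of arrayB does not occur in arrayA.
mask = np.isin(arrayB, arrayA, invert=True)
kept_order = arrayB[mask]

# For comparison: setdiff1d returns sorted, unique values.
sorted_unique = np.setdiff1d(arrayB, arrayA)

print(kept_order.tolist())     # ['A4', 'A6', 'A5', 'A4']
print(sorted_unique.tolist())  # ['A4', 'A5', 'A6']
```

The mask approach keeps both the order and any repeated values, so it is the closer match when arrayB is a sequence rather than a set.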
Red
0

You can use sets:

difference = list(set(arrayB) - set(arrayA))

Output (sets are unordered, so the order of the elements may vary):

['A4', 'A6', 'A5']
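If a reproducible order is needed, one option is to sort the set difference. This is a sketch assuming the arrays from the question; the sort is plain string ordering, so a label like 'A10' would sort before 'A2'.

```python
import numpy as np

arrayA = np.array(['A1', 'A2', 'A3'])
arrayB = np.array(['A1', 'A2', 'A3', 'A4', 'A5', 'A6'])

# .tolist() yields plain Python str objects rather than NumPy string scalars;
# sorted() turns the unordered set into a list with a deterministic order.
difference = sorted(set(arrayB.tolist()) - set(arrayA.tolist()))

print(difference)  # ['A4', 'A5', 'A6']
```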
Abhyuday Vaish