Numpy 2d set diff

Question

In NumPy, is it possible to compute the difference between these two arrays:

[[0 0 0 0 1 1 1 1 2 2 2 2]
 [0 1 2 3 0 1 2 3 0 1 2 3]]


[[0 0 0 0 1 1 1 2 2 2]
 [0 1 2 3 0 2 3 0 1 2]]

to get this result

[[1 2]
 [1 3]]

?

From what I can tell, the result is the columns that are present in the first array but missing from the second one, i.e. a column-wise equivalent of Python's set subtraction. – David L, Apr 27 '18 at 11:04
Yes, I thought it was obvious. Consider both arrays as associations between 2 lists: (0,0), (0,1) ... (2,3) for the first array, (0,0), (0,1) ... (2,2) for the second array. I want to find the difference between these associations, which is (1,1) and (2,3). – sergiuz, Apr 27 '18 at 11:05

score 2 · Answer 1 · answered Apr 27 '18 at 11:22

One way is to convert each column to a tuple and take a Python set difference. numpy.unique allows a similar solution (easier in v1.13+, which added the axis argument; see "Find unique rows in numpy.array"), but if performance is not an issue you can use set.

import numpy as np

A = np.array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
              [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]])

B = np.array([[0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
              [0, 1, 2, 3, 0, 2, 3, 0, 1, 2]])

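# Treat each column as a tuple, take the set difference, then convert back to a 2xN array
res = np.array(list(set(map(tuple, A.T)) - set(map(tuple, B.T)))).T

# res (column order is not guaranteed, since sets are unordered):
# array([[2, 1],
#        [3, 1]])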

edited Apr 27 '18 at 11:25

answered Apr 27 '18 at 11:22

jpp


Ah, I can't see how unique can help, since I have 2 arrays with different shapes. Or am I missing something? – sergiuz Apr 27 '18 at 11:29
@sergiuz, try using the `axis` argument or transposing the matrices via `A.T`, `B.T`. – jpp Apr 27 '18 at 11:31
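
The `axis` suggestion in the comment above is not spelled out in the thread. One possible sketch, assuming NumPy 1.13 or later: stack the unique columns of A once and those of B twice, then keep the columns that occur exactly once. The counting trick and the variable names are illustrative assumptions, not necessarily what the answerer had in mind.

```python
import numpy as np

A = np.array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
              [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]])

B = np.array([[0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
              [0, 1, 2, 3, 0, 2, 3, 0, 1, 2]])

# Work on the transposes so that each row is one column of the original array.
a_cols = np.unique(A.T, axis=0)
b_cols = np.unique(B.T, axis=0)

# Columns only in A occur once, columns only in B occur twice,
# and columns present in both occur three times.
stacked = np.concatenate([a_cols, b_cols, b_cols])
rows, counts = np.unique(stacked, axis=0, return_counts=True)

# Keep the columns that appear only in A and transpose back to 2xN.
res = rows[counts == 1].T
print(res)
```

Unlike the set-based version, the columns come back in sorted order because `numpy.unique` sorts its output.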

score 1 · Answer 2 · answered Jul 21 '22 at 09:35


We can treat the 2D array as a pair of 1D arrays and use numpy.setdiff1d to compare them.
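
The answer gives no code, so the following is only one possible reading of it. Applying numpy.setdiff1d to each row separately would not give the column-wise difference, so this sketch first encodes every column as a single integer key. It assumes all entries are non-negative integers, and `setdiff_columns` is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

A = np.array([[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
              [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]])

B = np.array([[0, 0, 0, 0, 1, 1, 1, 2, 2, 2],
              [0, 1, 2, 3, 0, 2, 3, 0, 1, 2]])


def setdiff_columns(a, b):
    """Columns of a that do not occur in b (assumes non-negative integers)."""
    # Per-row sizes large enough to hold every value found in a or b.
    dims = tuple(int(d) for d in np.maximum(a.max(axis=1), b.max(axis=1)) + 1)
    # Encode each column as one flat index, turning the problem into a 1D one.
    a_keys = np.ravel_multi_index(tuple(a), dims)
    b_keys = np.ravel_multi_index(tuple(b), dims)
    # 1D set difference on the keys; the result is sorted and unique.
    diff_keys = np.setdiff1d(a_keys, b_keys)
    # Decode the surviving keys back into a 2xN array of columns.
    return np.array(np.unravel_index(diff_keys, dims))


res = setdiff_columns(A, B)
print(res)
```

The integer encoding keeps everything inside NumPy, but it breaks down for negative or non-integer data; in those cases the tuple-and-set approach from the other answers is the simpler choice.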


Tan Phan


score 0 · Answer 3 · answered Apr 27 '18 at 11:18

What about:

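import numpy as np

a = [[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]]
b = [[0, 0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 3, 0, 2, 3, 0, 1, 2]]

# Transpose so that each row is one (first, second) pair
a = np.array(a).T
b = np.array(b).T

# Tuples are hashable, so the pairs can be collected into sets
A = [tuple(t) for t in a]
B = [tuple(t) for t in b]
set(A) - set(B)
Out: {(1, 1), (2, 3)}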
