6

The problem is simple: I have two 2D NumPy arrays, and I want a third array that contains only the rows of the first array that do not appear in the second.

For example:

X = np.array([[0,1],[1,2],[4,5],[5,6],[8,9],[9,10]])
Y = np.array([[5,6],[9,10]])

Z = function(X,Y)
Z = array([[0, 1],
          [1, 2],
          [4, 5],
          [8, 9]])

I tried np.delete(X,Y,axis=0) but it doesn't work...
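
For context, np.delete interprets its second argument as positions along the axis, so passing Y makes it treat the values 5, 6, 9 and 10 as row indices rather than as rows to look for. A minimal sketch of finding the matching row positions first and then deleting them (the name idx is only illustrative):

```python
import numpy as np

X = np.array([[0, 1], [1, 2], [4, 5], [5, 6], [8, 9], [9, 10]])
Y = np.array([[5, 6], [9, 10]])

# Positions of the rows of X that are equal, in every column, to some row of Y
idx = [i for i, row in enumerate(X) if (row == Y).all(axis=1).any()]

# np.delete expects positions, so pass idx rather than Y itself
Z = np.delete(X, idx, axis=0)
print(Z)
```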

rugrag

4 Answers

2
# `row not in Y` tests individual elements, not whole rows, so compare full rows explicitly
Z = X[[not (row == Y).all(axis=1).any() for row in X]]
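
The same row filter can probably be expressed without a Python-level loop by using broadcasting. This is only a sketch: it builds an intermediate boolean array of shape (len(X), len(Y), n_cols), so it assumes both arrays are small enough for that to fit in memory. The name matches is illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 2], [4, 5], [5, 6], [8, 9], [9, 10]])
Y = np.array([[5, 6], [9, 10]])

# matches[i, j] is True when row i of X equals row j of Y in every column
matches = (X[:, None, :] == Y[None, :, :]).all(axis=2)

# Keep the rows of X that match no row of Y; order and duplicates in X are preserved
Z = X[~matches.any(axis=1)]
print(Z)
```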
Luchko
1

The numpy_indexed package (disclaimer: I am its author) extends the standard numpy array set operations to multi-dimensional use cases such as these, with good efficiency:

import numpy_indexed as npi
Z = npi.difference(X, Y)
Eelco Hoogendoorn
0

Here's a views-based approach -

# Based on http://stackoverflow.com/a/41417343/3293881 by @Eric
def setdiff2d(a, b):
    # check that casting to void will create equal size elements
    assert a.shape[1:] == b.shape[1:]
    assert a.dtype == b.dtype

    # compute dtypes
    void_dt = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    orig_dt = np.dtype((a.dtype, a.shape[1:]))

    # convert to 1d void arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    a_void = a.reshape(a.shape[0], -1).view(void_dt)
    b_void = b.reshape(b.shape[0], -1).view(void_dt)

    # Keep the unique rows of a that do not occur in b
    return np.setdiff1d(a_void, b_void).view(orig_dt)

Sample run -

In [81]: X
Out[81]: 
array([[ 0,  1],
       [ 1,  2],
       [ 4,  5],
       [ 5,  6],
       [ 8,  9],
       [ 9, 10]])

In [82]: Y
Out[82]: 
array([[ 5,  6],
       [ 9, 10]])

In [83]: setdiff2d(X,Y)
Out[83]: 
array([[0, 1],
       [1, 2],
       [4, 5],
       [8, 9]])
Divakar
-1
# Set difference on rows converted to tuples (result is sorted and de-duplicated)
Z = np.array(sorted(set(map(tuple, X)) - set(map(tuple, Y))))
Reaper