
I am looking for a way in numpy to find the indices of specific rows along the last axis of a 3d array. One example would be to find all occurrences of a given set of colours in an RGB image, and fetch the pixel coordinates.

This question shows that the in operator can behave weirdly with arrays, and this one led closer but works only for 2D arrays.

Let's say we have the 3d array Z with dimensions (x,y,z), and [s0, s1] are the rows along the 3rd dimension that we want to match.

import numpy as np
Z = np.zeros((10,20,3), dtype=int)
s0 = np.array([1,2,3])
s1 = np.array([4,5,6])
Z[1,2] = s0
Z[4,5] = s1

I want all (x,y) where Z[x,y] is equal to either s0 or s1.

So far, argwhere returns every position where an element of Z equals the corresponding element of s0:

> np.argwhere(s0 == Z) 
array([[1, 2, 0],
       [1, 2, 1],
       [1, 2, 2]])

in1d returns a boolean 1D array with True wherever an element of Z matches any element of s0 or s1:

> np.in1d(Z, [s0,s1])

and if I try the raveled way:

> Zravel = np.ascontiguousarray(Z).view([('', Z.dtype)] * Z.shape[-1]).ravel()
> np.all(np.in1d(Zravel, [s0, s1]) == False)

every element is False.

Any ideas?

toine

1 Answer


np.in1d flattens its inputs. So you can feed it Z and a stacked version of s0 and s1, which gives a boolean array that can be reshaped to the shape of Z. Rows along the last axis that are all True then give the candidate indices. Note that this tests each element's membership independently, so a row that mixes values from s0 and s1, such as [1,5,3], would also be reported. The implementation would look like this -

S = np.row_stack((s0,s1))
out = np.where((np.in1d(Z,S).reshape(Z.shape)).all(2))
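Before relying on the membership test, it may be worth checking how it treats a row that mixes values from several targets, since each element is looked up on its own. A minimal sketch, assuming np.isin (the shape-preserving counterpart of np.in1d) and np.vstack for stacking, with a hypothetical extra pixel at Z[7,8] that is not part of the question:

```python
import numpy as np

Z = np.zeros((10, 20, 3), dtype=int)
s0 = np.array([1, 2, 3])
s1 = np.array([4, 5, 6])
Z[1, 2] = s0
Z[4, 5] = s1
Z[7, 8] = [1, 5, 3]  # hypothetical pixel mixing values from s0 and s1

S = np.vstack((s0, s1))  # shape (2, 3)

# np.isin keeps the shape of Z, so no reshape is needed.
# Each element is tested for membership in S independently.
elementwise = np.isin(Z, S).all(axis=2)
false_hits = np.argwhere(elementwise)
print(false_hits)  # includes [7, 8] even though it equals neither s0 nor s1
```

If such mixed rows can occur in your data, a whole-row comparison is the safer choice.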

You can also use broadcasting, which compares whole rows at once, like so -

out = np.where(((Z == S[:,None,None,:]).all(3)).any(0))
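The one-liner above can be unpacked step by step to see which shape each intermediate has. This is only an illustrative sketch of the same expression; the intermediate names (eq, row_match, mask) are made up here, and np.vstack stands in for np.row_stack:

```python
import numpy as np

Z = np.zeros((10, 20, 3), dtype=int)
s0 = np.array([1, 2, 3])
s1 = np.array([4, 5, 6])
Z[1, 2] = s0
Z[4, 5] = s1

S = np.vstack((s0, s1))          # shape (2, 3)

# S[:, None, None, :] has shape (2, 1, 1, 3); Z is broadcast as (1, 10, 20, 3)
eq = Z == S[:, None, None, :]    # shape (2, 10, 20, 3)
row_match = eq.all(axis=3)       # shape (2, 10, 20): pixel equals S[k] exactly
mask = row_match.any(axis=0)     # shape (10, 20): pixel equals any row of S
xs, ys = np.where(mask)
print(xs, ys)
```

The temporary array eq grows with the number of rows in S, which is the price paid for avoiding a Python loop.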

If you would like the output stacked in an array -

outarr = np.column_stack((out))
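As an alternative to stacking the np.where output, np.argwhere applied to the boolean mask should give the same (x, y) pairs directly. A small sketch under the same setup as the question, with mask and coords as illustrative names:

```python
import numpy as np

Z = np.zeros((10, 20, 3), dtype=int)
s0 = np.array([1, 2, 3])
s1 = np.array([4, 5, 6])
Z[1, 2] = s0
Z[4, 5] = s1
S = np.vstack((s0, s1))

mask = (Z == S[:, None, None, :]).all(3).any(0)

# One (x, y) pair per row, equivalent to column-stacking np.where(mask)
coords = np.argwhere(mask)
stacked = np.column_stack(np.where(mask))
print(coords)
```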

For creating S, you can replace np.row_stack with np.concatenate, which might be faster, like so -

S = np.concatenate((s0,s1)).reshape(-1,s0.size)

Sample run -

In [145]: Z = np.zeros((10,20,3), dtype=int)
     ...: s0 = np.array([1,2,3])
     ...: s1 = np.array([4,5,6])
     ...: Z[1,2] = s0
     ...: Z[4,5] = s1
     ...: S = np.row_stack((s0,s1))
     ...: 

In [146]: np.where(((Z == S[:,None,None,:]).all(3)).any(0))
Out[146]: (array([1, 4]), array([2, 5]))

In [147]: np.where((np.in1d(Z,S).reshape(Z.shape)).all(2))
Out[147]: (array([1, 4]), array([2, 5]))

In [148]: np.column_stack((np.where(((Z == S[:,None,None,:]).all(3)).any(0))))
Out[148]: 
array([[1, 2],
       [4, 5]])

In [149]: np.column_stack((np.where((np.in1d(Z,S).reshape(Z.shape)).all(2))))
Out[149]: 
array([[1, 2],
       [4, 5]])
Divakar
  • so the trick is to inflate the S object to dimensions that will match Z for broadcasting? – toine Oct 07 '15 at 18:20
  • @toine Yup! So how many such s0, s1.. would you have? If it's just two : s0 and s1, you won't need broadcasting at all! I just assumed a generic case to solve for i.e. S would store s0, s1, s2, s3 and so on. – Divakar Oct 07 '15 at 18:20
  • I'll have more as you guessed : ) so that solution looks good. – toine Oct 07 '15 at 18:22
  • @toine Awesome! So just add those in the first line. Also, you can replace `np.row_stack` with `np.concatenate` for squeezing out more performance. – Divakar Oct 07 '15 at 18:23
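For the case discussed in the comments, where S holds many rows, the broadcast comparison builds a temporary of shape (len(S), x, y, z). A possible lower-memory variant, not from the answer itself, is to loop over the rows of S and accumulate a single mask. The helper name match_rows and the third colour s2 are hypothetical:

```python
import numpy as np

def match_rows(Z, S):
    """Hypothetical helper: (x, y) pairs where Z[x, y] equals any row of S."""
    mask = np.zeros(Z.shape[:2], dtype=bool)
    for s in S:
        # (Z == s) broadcasts s over the last axis; all() demands a full-row match
        mask |= (Z == s).all(axis=-1)
    return np.argwhere(mask)

Z = np.zeros((10, 20, 3), dtype=int)
s0 = np.array([1, 2, 3])
s1 = np.array([4, 5, 6])
s2 = np.array([7, 8, 9])  # hypothetical third colour
Z[1, 2] = s0
Z[4, 5] = s1
Z[9, 0] = s2

S = np.concatenate((s0, s1, s2)).reshape(-1, s0.size)
coords = match_rows(Z, S)
print(coords)
```

Whether this is faster than the broadcast version depends on the image size and the number of rows in S, so it is worth timing both on real data.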