3

With a 2D masked array in Python, what would be the best way to get the index of the first and last rows and columns containing a non-masked value?

import numpy as np
a = np.reshape(range(30), (6,5))
amask = np.array([[True, True, False, True, True],
                  [True, False, False, True, True],
                  [True, True, True, False, True],
                  [True, False, False, False, True],
                  [True, True, True, False, True],
                  [True, True, True, True, True]])
a = np.ma.masked_array(a, amask)
print(a)
# [[-- -- 2 -- --]
#  [-- 6 7 -- --]
#  [-- -- -- 13 --]
#  [-- 16 17 18 --]
#  [-- -- -- 23 --]
#  [-- -- -- -- --]]

In this example, I would like to obtain:

  • (0, 4) for axis 0 (since the first row with unmasked value(s) is 0 and the last one is 4; the 6th row (row 5) only contains masked values)
  • (1, 3) for axis 1 (since the first column with unmasked value(s) is 1 and the last one is 3 (the 1st and 5th columns only contain masked values)).

[I thought about maybe combining numpy.ma.flatnotmasked_edges and numpy.apply_along_axis, without any success...]

ztl

2 Answers

1

IIUC you can do:

d = amask==False #First find which array values are NOT masked
rows,columns = np.where(d) #Get the row and column positions of the unmasked values

rows.sort() #sort the row values
columns.sort() #sort the column values

print('Row values :',(rows[0],rows[-1])) #print the first and last rows
print('Column values :',(columns[0],columns[-1])) #print the first and last columns

Row values : (0, 4)
Column values : (1, 3)

Or

rows, columns = np.nonzero(~a.mask)
print('Row values :',(rows.min(), rows.max())) #print the min and max rows
print('Column values :',(columns.min(), columns.max())) #print the min and max columns

Row values : (0, 4)
Column values : (1, 3)
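
The same idea can be wrapped in a small function. This is only a sketch: `unmasked_bounds` is a hypothetical name, a 2D input is assumed, and returning `None` for a fully masked array is my own choice rather than anything from the question. `np.ma.getmaskarray` is used so that an array without any mask still yields a full boolean array.

```python
import numpy as np

def unmasked_bounds(arr):
    """Hypothetical helper: ((first_row, last_row), (first_col, last_col))
    of the unmasked data in a 2D masked array, or None if all is masked."""
    valid = ~np.ma.getmaskarray(arr)          # True where a value is NOT masked
    rows = np.flatnonzero(valid.any(axis=1))  # rows holding at least one unmasked value
    cols = np.flatnonzero(valid.any(axis=0))  # columns holding at least one unmasked value
    if rows.size == 0:                        # nothing unmasked anywhere
        return None
    return (int(rows[0]), int(rows[-1])), (int(cols[0]), int(cols[-1]))

# Same data as in the question
amask = np.array([[True, True, False, True, True],
                  [True, False, False, True, True],
                  [True, True, True, False, True],
                  [True, False, False, False, True],
                  [True, True, True, False, True],
                  [True, True, True, True, True]])
a = np.ma.masked_array(np.arange(30).reshape(6, 5), mask=amask)

print(unmasked_bounds(a))  # ((0, 4), (1, 3))
```

Reducing with `any` along each axis first means only one short 1D array per axis has to be searched, instead of collecting the coordinates of every unmasked cell.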
Space Impact
  • Thanks! IIUC, it would seem perhaps clearer and more straightforward to me to do `rows, columns = np.nonzero(~a.mask)` and then `(rows.min(), rows.max())` and `(columns.min(), columns.max())`? I like the approach! – ztl Sep 27 '18 at 09:39
  • @ztl you can do that too. – Space Impact Sep 27 '18 at 09:40
1

Here's one based on argmax -

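# Flag the columns (m0) and rows (m1) that contain any unmasked data.
# Reduce the mask itself, not the values, so that unmasked zeros still count.
valid = ~np.ma.getmaskarray(a)
m0 = valid.any(axis=0)
m1 = valid.any(axis=1)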

# Use argmax to get first and last non-zero indices along axis=0,1 separately
axis0_out = m1.argmax(), a.shape[0] - m1[::-1].argmax() - 1
axis1_out = m0.argmax(), a.shape[1] - m0[::-1].argmax() - 1
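
One caveat with the argmax approach: `argmax` returns 0 when an array contains no `True` at all, so a fully masked input would silently report the full extent. Below is a minimal sketch of a guarded variant. `first_last_true` is a hypothetical helper name, and returning `None` in the empty case is an assumption on my part, not part of the answer above.

```python
import numpy as np

def first_last_true(flags):
    """Hypothetical helper: (first, last) index of True in a 1D boolean
    array, or None when there is no True (argmax alone would return 0)."""
    flags = np.asarray(flags, dtype=bool)
    if not flags.any():
        return None
    first = int(flags.argmax())                         # first True from the front
    last = int(flags.size - 1 - flags[::-1].argmax())   # first True from the back
    return first, last

# Same data as in the question
amask = np.array([[True, True, False, True, True],
                  [True, False, False, True, True],
                  [True, True, True, False, True],
                  [True, False, False, False, True],
                  [True, True, True, False, True],
                  [True, True, True, True, True]])
a = np.ma.masked_array(np.arange(30).reshape(6, 5), mask=amask)

valid = ~np.ma.getmaskarray(a)                   # True where a value is NOT masked
axis0_out = first_last_true(valid.any(axis=1))   # rows with unmasked data
axis1_out = first_last_true(valid.any(axis=0))   # columns with unmasked data
print(axis0_out, axis1_out)  # (0, 4) (1, 3)
```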
Divakar