Given a 2D NumPy array, how can I efficiently build a mask of 0s and 1s that marks where the array's values exceed a given threshold?

So far I have working code that does the job like this:

import numpy as np

def maskedarray(data, threshold):

    #creating an array of zeros:
    zeros = np.zeros((np.shape(data)[0], np.shape(data)[1]))

    #going over each index of the data
    for i in range(np.shape(data)[0]):
        for j in range(np.shape(data)[1]):
            if data[i][j] > threshold:
                zeros[i][j] = 1

    return(zeros)

#creating a test array
test = np.random.rand(5,5)

#using the function above defined
mask = maskedarray(test,0.5)

I refuse to believe there isn't a smarter way to do this that avoids two nested for loops.

Thanks

Chicrala

1 Answer

A simple and fast way is a vectorized comparison, which avoids the Python-level loops entirely:

def masked_array(data, threshold):
    return (data > threshold).astype(int)

Example:

data = np.random.random((5,5))
threshold = 0.5

>>> data
array([[0.42966975, 0.94785801, 0.31750045, 0.75944551, 0.05430315],
       [0.91475934, 0.65683185, 0.09019139, 0.85717157, 0.63074349],
       [0.33160746, 0.82455941, 0.50801804, 0.81087228, 0.01561161],
       [0.6932717 , 0.12741425, 0.17863726, 0.36682108, 0.95817187],
       [0.88320599, 0.51243802, 0.90219452, 0.78954102, 0.96708252]])    

>>> masked_array(data, threshold)
array([[0, 1, 0, 1, 0],
       [1, 1, 0, 1, 1],
       [0, 1, 1, 1, 0],
       [1, 0, 0, 0, 1],
       [1, 1, 1, 1, 1]])
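
As a quick sanity check, the vectorized version can be compared against the nested-loop approach from the question. This is only a sketch: the names `mask_loop` and `mask_vectorized` are hypothetical, and the seeded generator is just there to make the run repeatable.

```python
import numpy as np

def mask_loop(data, threshold):
    # Loop-based reference, equivalent to the function in the question
    out = np.zeros(data.shape)
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            if data[i, j] > threshold:
                out[i, j] = 1
    return out

def mask_vectorized(data, threshold):
    # The comparison yields a boolean array; astype(int) turns it into 0s and 1s
    return (data > threshold).astype(int)

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
data = rng.random((5, 5))

# True if both approaches mark exactly the same positions
same = np.array_equal(mask_loop(data, 0.5), mask_vectorized(data, 0.5))
```

If only the mask's truth values matter (for indexing, say), the boolean array `data > threshold` can be used directly without the `astype(int)` step.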
sacuL
  • Can this method also be used to filter in between values? Like if we wanted the values to be filtered between 0.3 and 0.5 in your example? – Chicrala Dec 07 '18 at 16:59
  • Yeah: `mask = (data > 0.3) & (data < 0.5)` creates a boolean mask; `mask.astype(int)` gives the 0/1 version, and `data[mask]` (using the boolean mask, not the integer one) filters it – sacuL Dec 07 '18 at 17:01
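
One caveat when filtering between two values: indexing with a 0/1 integer array selects rows 0 and 1 rather than filtering, so the boolean mask is the one to index with. A small sketch, where the sample values and the names `in_range`, `mask01`, and `selected` are made up for illustration:

```python
import numpy as np

# Small hand-picked array so the result is easy to follow
data = np.array([[0.10, 0.35, 0.50],
                 [0.45, 0.30, 0.90]])

# Boolean mask: True where 0.3 < value < 0.5 (both bounds exclusive)
in_range = (data > 0.3) & (data < 0.5)

# 0/1 version of the same mask
mask01 = in_range.astype(int)

# Boolean indexing returns the matching values as a 1-D array
selected = data[in_range]
```

Here `mask01` marks only the positions of 0.35 and 0.45, and `selected` holds those two values.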