134

I have a large numpy array that I need to manipulate so that each element is changed to either a 1 or 0 if a condition is met (will be used as a pixel mask later). There are about 8 million elements in the array and my current method takes too long for the reduction pipeline:

for (y,x), value in numpy.ndenumerate(mask_data): 

    if mask_data[y,x]<3: #Good Pixel
        mask_data[y,x]=1
    elif mask_data[y,x]>3: #Bad Pixel
        mask_data[y,x]=0

Is there a numpy function that would speed this up?

ChrisFro

6 Answers

179
>>> import numpy as np
>>> a = np.random.randint(0, 5, size=(5, 4))
>>> a
array([[4, 2, 1, 1],
       [3, 0, 1, 2],
       [2, 0, 1, 1],
       [4, 0, 2, 3],
       [0, 0, 0, 2]])
>>> b = a < 3
>>> b
array([[False,  True,  True,  True],
       [False,  True,  True,  True],
       [ True,  True,  True,  True],
       [False,  True,  True, False],
       [ True,  True,  True,  True]], dtype=bool)
>>> 
>>> c = b.astype(int)
>>> c
array([[0, 1, 1, 1],
       [0, 1, 1, 1],
       [1, 1, 1, 1],
       [0, 1, 1, 0],
       [1, 1, 1, 1]])

You can shorten this with:

>>> c = (a < 3).astype(int)
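
One difference from the loop in the question: `(a < 3).astype(int)` maps elements equal to 3 to 0, whereas the loop leaves them untouched. If that matters, a minimal sketch that reproduces the loop exactly could look like this (the sample values in `mask_data` are made up for illustration):

```python
import numpy as np

# Hypothetical sample values standing in for the real 8-million-element array.
mask_data = np.array([[0, 3, 5],
                      [2, 4, 3]])

# Build both masks from the original values before assigning anything.
good = mask_data < 3   # good pixels
bad = mask_data > 3    # bad pixels

mask_data[good] = 1
mask_data[bad] = 0     # elements equal to 3 match neither mask and stay as they are

print(mask_data)
```

Both masks are computed up front so that the second comparison is not affected by values written in the first assignment.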
Steve Barnes
  • 2
    How can this be applied to specific columns only, without slicing those columns out and assigning them back? For example, only elements in columns [2, 3] should change when the condition is met, while the other columns stay unchanged whether or not the condition is met. – kuixiong Jul 20 '19 at 09:42
  • True, but only for the case of zeros and ones. See more general answer below (at efficiency cost) – borgr May 17 '20 at 13:29
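
On restricting the change to certain columns: one possible approach, sketched here under the assumption of a 2-D integer array, is to combine the condition with a per-column boolean mask and assign through the combined mask. The names `col_mask` and the replacement value `-1` are illustrative only.

```python
import numpy as np

a = np.array([[4, 2, 1, 1],
              [3, 0, 1, 2]])

# One boolean per column: True where the column is allowed to change.
col_mask = np.zeros(a.shape[1], dtype=bool)
col_mask[[2, 3]] = True

# col_mask has shape (4,) and broadcasts across the rows of the (2, 4) condition.
a[(a < 2) & col_mask] = -1

print(a)
```

Elements in columns 0 and 1 are left alone even where `a < 2` holds, because the combined mask is False there.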
134
>>> a = np.random.randint(0, 5, size=(5, 4))
>>> a
array([[0, 3, 3, 2],
       [4, 1, 1, 2],
       [3, 4, 2, 4],
       [2, 4, 3, 0],
       [1, 2, 3, 4]])
>>> 
>>> a[a > 3] = -101
>>> a
array([[   0,    3,    3,    2],
       [-101,    1,    1,    2],
       [   3, -101,    2, -101],
       [   2, -101,    3,    0],
       [   1,    2,    3, -101]])
>>>

See, e.g., Indexing with boolean arrays.
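
The same boolean mask can be used on both sides of an assignment, so the new value can depend on the old one. A small sketch, with made-up sample values:

```python
import numpy as np

a = np.array([[0, 3, 4],
              [5, 1, 2]])

mask = a > 3          # compute the mask once and reuse it

# In-place update of only the selected elements: 4 -> -97, 5 -> -96.
a[mask] -= 101

# Equivalent explicit form, shown on a copy: read the old values, write the new ones.
b = np.array([[0, 3, 4],
              [5, 1, 2]])
b[mask] = b[mask] - 101

print(a)
```

Storing the mask in a variable avoids evaluating the comparison twice.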

Glorfindel
ev-br
  • 4
    great stuff, thanks! If you want to refer to the value you change you can use something like `a[a > 3] = -101+a[a > 3]`. – pexmar Jul 07 '17 at 17:12
  • 3
    @pexmar Though if you do `a[a > 3] = -101+a[a > 3]` instead of `a[a > 3] += -101` you will most likely face memory leakage. – Samuel Prevost Dec 14 '18 at 11:14
  • 2
    how do you refer to the value you change as pexmar asked?? – John Aug 16 '19 at 07:22
50

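A quick and flexible way is to use np.where, which picks, element by element, between two values (or arrays) according to a boolean mask (an array of True and False values). In the example below, elements smaller than 3 become 0 and all others become 1: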

import numpy as np

a = np.random.randint(0, 5, size=(5, 4))
# Where a < 3 take 0, elsewhere take 1; a itself is not modified
b = np.where(a < 3, 0, 1)
print('a:', a)
print()
print('b:', b)

which will produce:

a: [[1 4 0 1]
 [1 3 2 4]
 [1 0 2 1]
 [3 1 0 0]
 [1 4 0 1]]

b: [[0 1 0 0]
 [0 1 0 1]
 [0 0 0 0]
 [1 0 0 0]
 [0 1 0 0]]
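
`np.where` can also keep the original value wherever the condition is not met, by passing the array itself as the third argument. A minimal sketch with made-up sample values:

```python
import numpy as np

a = np.array([[1, 4, 0],
              [3, 2, 5]])

# Where a < 3 take 0; everywhere else take the existing value from a.
b = np.where(a < 3, 0, a)

print(b)
```

This returns a new array and leaves `a` unchanged, in contrast to assigning through a boolean index.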
Markus Dutschke
  • 4
    What is the best way if I don't want to replace anything when the condition is not met? I.e., only replace with the provided value when the condition is met; otherwise leave the original number as it is. – Abhishek Sengupta Jul 29 '20 at 10:56
  • 2
    To replace all values in a that are smaller than 3 and keep the rest as they are, use `a[a<3] = 0` – Markus Dutschke Jul 29 '20 at 12:52
3

You can create your mask array in one step like this:

mask_data = input_mask_data < 3

This creates a boolean array which can then be used as a pixel mask. Note that, unlike your code, this does not modify the input array; it creates a new array to hold the mask data, which is the approach I would recommend.

>>> input_mask_data = np.random.randint(0, 5, (3, 4))
>>> input_mask_data
array([[1, 3, 4, 0],
       [4, 1, 2, 2],
       [1, 2, 3, 0]])
>>> mask_data = input_mask_data < 3
>>> mask_data
array([[ True, False, False,  True],
       [False,  True,  True,  True],
       [ True,  True, False,  True]], dtype=bool)
>>> 
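
As a rough illustration of how such a boolean mask might then be applied, the sketch below uses a hypothetical `image` array of the same shape (not part of the question) and shows two common patterns: extracting the good pixels, and blanking out the bad ones.

```python
import numpy as np

input_mask_data = np.array([[1, 3, 4],
                            [0, 2, 5]])
# Hypothetical image data with the same shape as the mask.
image = np.array([[10.0, 20.0, 30.0],
                  [40.0, 50.0, 60.0]])

mask_data = input_mask_data < 3      # True marks a good pixel

# Boolean indexing returns the selected pixels as a 1-D array.
good_pixels = image[mask_data]

# Keep good pixels, replace bad ones with NaN, without touching image itself.
masked_image = np.where(mask_data, image, np.nan)

print(good_pixels)
print(masked_image)
```

Because the mask is boolean, it can be used directly as an index or as the condition of `np.where`; no conversion to 0s and 1s is needed for either.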
YXD
  • 1
    Yep. If the OP really wants 0s and 1s, he could use `.astype(int)` or `*1`, but an array of `True` and `False` is just as good as it is. – DSM Nov 04 '13 at 11:43
3

I was a noob with NumPy, and the answers above were not straight to the point about modifying my array in place, so I'm posting what I came up with:

import numpy as np

arr = np.array([[[10,20,30,255],[40,50,60,255]],
                [[70,80,90,255],[100,110,120,255]],
                [[170,180,190,255],[230,240,250,255]]])

# Change 1:
# Set every value to 0 if first element is smaller than 80 
arr[arr[:,:,0] < 80] = 0

print('Change 1:',arr,'\n')

# Change 2:
# Set every value to 1 if bigger than 180 and smaller than 240
# OR if equal to 170
arr[(arr > 180) & (arr < 240) | (arr == 170)] = 1

print('Change 2:',arr)

This produces:

Change 1: [[[  0   0   0   0]
  [  0   0   0   0]]

 [[  0   0   0   0]
  [100 110 120 255]]

 [[170 180 190 255]
  [230 240 250 255]]] 

Change 2: [[[  0   0   0   0]
  [  0   0   0   0]]

 [[  0   0   0   0]
  [100 110 120 255]]

 [[  1 180   1 255]
  [  1 240 250 255]]]

This way you can add tons of conditions like 'Change 2' and set values accordingly.
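
When several conditions each map to a different value, `np.select` is one alternative to chaining assignments. The sketch below is illustrative only (the sample values and the replacement values 1 and 2 are made up); note that it returns a new array rather than modifying `arr` in place.

```python
import numpy as np

arr = np.array([10, 170, 190, 240, 255])

# Conditions are checked in order; the first one that is True wins.
# Each compound condition is parenthesised because & and | bind more
# tightly than the comparison operators.
conditions = [
    arr == 170,
    (arr > 180) & (arr < 240),
]
choices = [1, 2]

# Elements matching no condition keep their original value via default=arr.
result = np.select(conditions, choices, default=arr)

print(result)
```

To get the in-place behaviour of the answer above, the result can be written back with `arr[...] = result`.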

Gab ПК
-5

I am not sure I understood your question, but if you write:

mask_data[:3, :3] = 1
mask_data[3:, 3:] = 0

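This sets every element whose row and column indices are both less than 3 to 1, and every element whose row and column indices are both 3 or greater to 0. Elements where only one of the two indices is below 3 are left unchanged. Note that this selects elements by position, not by value.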

mamalos