14

I have the following code which first selects elements of a NumPy array with a logical index mask:

import numpy as np

grid = np.random.rand(4,4) 
mask = grid > 0.5

I then want to apply a second boolean mask to the elements picked out by the first one:

masklength = len(grid[mask])
prob = 0.5
# generates a random array of bools
second_mask = np.random.rand(masklength) < prob 

# this fails to act on original object
grid[mask][second_mask] = 100

This is not quite the same problem as the one in this SO question: Numpy array, how to select indices satisfying multiple conditions? Because I am using random number generation, I don't want to generate a full-size mask, only one for the elements selected by the first mask.

Hemmer

5 Answers

10

Using flat indexing avoids much of the headache:

grid.flat[np.flatnonzero(mask)[second_mask]] = 100

Breaking it down:

ind = np.flatnonzero(mask)

generates a flat array of indices where mask is true, which is then decimated further by applying second_mask:

ind = ind[second_mask] 

We could go on:

ind = ind[third_mask]

Finally

grid.flat[ind] = 100

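indexes a flat version of grid with ind and assigns 100. grid.ravel()[ind] = 100 would also work here, since ravel() returns a view when the array is contiguous in memory, as grid is. For a non-contiguous array ravel() returns a copy, so grid.flat is the safer choice.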
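The steps above can be put together into a small runnable sketch. The seeded generator from np.random.default_rng and the `before`/`changed` names are my own additions for repeatability and checking; they are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability

grid = rng.random((4, 4))
mask = grid > 0.5

# One random draw per element selected by the first mask.
second_mask = rng.random(np.count_nonzero(mask)) < 0.5

# Flat positions of the True entries, then thinned by the second mask.
ind = np.flatnonzero(mask)[second_mask]

before = grid.copy()
grid.flat[ind] = 100

# Positions whose value changed; these should all lie inside mask.
changed = grid != before
print(changed.sum() == second_mask.sum())
```

Since every value in grid starts below 1, any element equal to 100 afterwards must have been written by the assignment.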

Stefan
7

I believe the following does what you're asking:

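grid[tuple(a[second_mask] for a in np.where(mask))] = 100

It works as follows:

  • np.where(mask) converts the boolean mask into a tuple of index arrays, one per dimension, giving the positions where mask is True;
  • a[second_mask] for a in ... subsets each index array to only the positions where second_mask is True;
  • tuple(...) passes the result as one index array per dimension; a plain list of arrays is no longer interpreted as a multidimensional index by recent NumPy versions.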

The reason your original version doesn't work is that grid[mask] involves fancy indexing. This creates a copy of the data, which in turn results in ...[second_mask] = 100 modifying that copy rather than the original array.
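A minimal sketch of that copy behaviour, using a small deterministic array instead of random data so it is easy to follow (the values and the `selected` name are illustrative assumptions):

```python
import numpy as np

grid = np.arange(16, dtype=float).reshape(4, 4)
mask = grid > 7            # selects the last two rows (8 elements)

selected = grid[mask]      # boolean indexing returns a new array
print(np.shares_memory(grid, selected))   # False

second_mask = np.zeros(selected.size, dtype=bool)
second_mask[::2] = True    # arbitrary illustrative choice

# The write lands in a temporary copy, so grid itself is unchanged.
grid[mask][second_mask] = 100
print((grid == 100).any())                # False
```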

NPE
2

I came across this old thread while trying to do something similar. There are a lot of interesting answers here, but I believe I have come up with a method that is simpler and more intuitive than anything provided here: use the first mask on itself to change the True values to True or False as needed.

Here it is, one simple line and then the mask can be used as desired:

mask[mask] = second_mask
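As a rough sketch of how this might be used end to end: note that the assignment overwrites the mask in place, so the example below works on a copy (the `combined` name and the deterministic test data are my own assumptions, not part of the answer).

```python
import numpy as np

grid = np.arange(16, dtype=float).reshape(4, 4)
mask = grid > 7                       # 8 True entries
second_mask = np.arange(8) % 2 == 0   # keep every other selected element

combined = mask.copy()                # copy so the original mask survives
combined[combined] = second_mask      # True entries become True or False

grid[combined] = 100                  # acts on the original array
print(np.flatnonzero(combined))
```

If the original mask is not needed afterwards, the copy can be skipped and mask modified directly, as in the one-liner above.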
Tyson
1

Another possible solution, which I came up with after thinking about this a bit more, is to have the second mask retain the size of the first (which may or may not be worth the memory hit) and selectively fill in the new elements:

#!/usr/bin/env python
import numpy as np

prob = 0.5    
grid = np.random.rand(4,4)

mask = grid > 0.5 
masklength = np.sum(mask)

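# initialise an all-False mask with the same shape as grid
# (np.bool was removed in NumPy 1.24; the builtin bool works everywhere)
second_mask = np.zeros(grid.shape, dtype=bool)
# then selectively fill in this mask using the second criterion
second_mask[mask] = np.random.rand(masklength) < prob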

# this now acts on the original object
grid[second_mask] = 100

Though this is a bit longer, it seems to read better (to my beginner eyes), and in speed tests it takes the same time.
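One way to check that the full-size mask selects the same elements as the flat-index approach is to feed both the same random draws. This is only a sketch; the seeded generator and the names `draws`, `full`, `a` and `b` are assumptions introduced for the comparison.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for repeatability
grid = rng.random((4, 4))
mask = grid > 0.5
draws = rng.random(np.count_nonzero(mask)) < 0.5

# Approach 1: full-size second mask, filled only where mask is True.
full = np.zeros_like(mask)
full[mask] = draws
a = grid.copy()
a[full] = 100

# Approach 2: flat indices thinned by the same draws.
b = grid.copy()
b.flat[np.flatnonzero(mask)[draws]] = 100

print(np.array_equal(a, b))
```

Both approaches visit the True entries of mask in the same row-major order, which is why the same draws end up on the same elements.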

Hemmer
-1
In [29]: ar = np.linspace(1,10,10)
In [30]: ar[(3<ar) & (ar<8)]
Out[30]: array([ 4.,  5.,  6.,  7.])
Adobe