Using multiple levels of boolean index mask in NumPy

Question

I have the following code which first selects elements of a NumPy array with a logical index mask:

import numpy as np

grid = np.random.rand(4,4) 
mask = grid > 0.5

I wish to apply a second boolean mask to the elements selected by the first one:

masklength = len(grid[mask])
prob = 0.5
# generates a random array of bools
second_mask = np.random.rand(masklength) < prob 

# this fails to act on original object
grid[mask][second_mask] = 100

This is not quite the same problem as the one in this SO question: Numpy array, how to select indices satisfying multiple conditions? Because I am using random number generation, I don't want to generate a full-size mask, only one for the elements selected by the first mask.

score 10 · Answer 1 · answered Jun 11 '13 at 17:55

Using flat indexing avoids much of the headache:

grid.flat[np.flatnonzero(mask)[second_mask]] = 100

Breaking it down:

ind = np.flatnonzero(mask)

generates a flat array of indices where mask is true, which is then decimated further by applying second_mask:

ind = ind[second_mask]

We could go on:

ind = ind[third_mask]

Finally

grid.flat[ind] = 100

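indexes a flat version of grid with ind and assigns 100. grid.ravel()[ind] = 100 would also work here, since ravel() returns a flat view into the original array where possible (for example, for a contiguous array like this one); when ravel() has to return a copy, the assignment would not reach grid.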
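The steps above can be put together in a short, self-contained sketch. The seeded generator and the `original` and `changed` names are assumptions added here only to make the result checkable; they are not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption, used only for repeatability
grid = rng.random((4, 4))
original = grid.copy()          # kept only to check which entries changed

mask = grid > 0.5
# One random bool per True entry of mask
second_mask = rng.random(int(mask.sum())) < 0.5

# Flat (row-major) positions of the True entries of mask
ind = np.flatnonzero(mask)
# Keep only the positions selected by the second mask
ind = ind[second_mask]

# Assign through the flat iterator, which writes into grid itself
grid.flat[ind] = 100

# Positions that differ from the original are exactly those in ind
changed = np.flatnonzero(grid != original)
print(np.array_equal(changed, ind))
```

Because `ind` is an ordinary integer array, further levels of masking are just further `ind = ind[next_mask]` steps before the final assignment.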

NPE · Accepted Answer · 2011-08-24T19:50:03.877

7

I believe the following does what you're asking:

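grid[tuple(a[second_mask] for a in np.where(mask))] = 100

It works as follows:

np.where(mask) converts the boolean mask into a tuple of index arrays, one per dimension, giving the positions where mask is True;
tuple(a[second_mask] for a in ...) subsets each index array to keep only the positions where second_mask is True. The result must be a tuple: recent NumPy versions no longer treat a list of arrays as a multidimensional index.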
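As a small illustration of the two steps above, here is a sketch on a deterministic grid. The fixed `grid` and `second_mask` values, and the `rows` and `cols` names, are assumptions chosen so the result is easy to follow; the index is passed as a tuple of arrays, which recent NumPy releases require.

```python
import numpy as np

# Deterministic stand-in for the random grid: values 0/16, 1/16, ..., 15/16
grid = np.arange(16, dtype=float).reshape(4, 4) / 16.0
mask = grid > 0.5  # True for the 7 largest values (9/16 .. 15/16)

# Hypothetical second-level mask, one entry per True in mask
second_mask = np.array([True, False, True, False, True, False, True])

# Step 1: row and column indices of the True entries of mask
rows, cols = np.where(mask)

# Step 2: keep only the indices selected by second_mask
rows, cols = rows[second_mask], cols[second_mask]

# Integer-array assignment writes into grid itself
grid[rows, cols] = 100
```

Here `grid[rows, cols]` is equivalent to indexing with the tuple `(rows, cols)`.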

The reason your original version doesn't work is that grid[mask] involves fancy indexing. This creates a copy of the data, which in turn results in ...[second_mask] = 100 modifying that copy rather than the original array.
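The copy behaviour can be checked directly. The sketch below uses a deterministic grid (an assumption, so the check is repeatable) and `np.shares_memory` to confirm that the result of boolean indexing does not alias the original array.

```python
import numpy as np

grid = np.arange(16, dtype=float).reshape(4, 4) / 16.0
mask = grid > 0.5

# Boolean (fancy) indexing returns a new array, not a view
selected = grid[mask]
shares_memory = np.shares_memory(selected, grid)

# Hypothetical second mask selecting just the first masked element
second_mask = np.zeros(selected.size, dtype=bool)
second_mask[0] = True

before = grid.copy()
# grid[mask] builds a temporary copy; the assignment lands in that copy
grid[mask][second_mask] = 100
unchanged = np.array_equal(grid, before)

print(shares_memory, unchanged)
```

A single indexed assignment such as `grid[mask] = 100` does modify `grid`; it is only the chained form, where the first indexing step produces a temporary, that loses the write.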

edited Aug 24 '11 at 19:50

answered Aug 24 '11 at 17:26


Perfect, just what I was looking for. – Hemmer Aug 24 '11 at 17:46
Also is there any copying of arrays involved in the snippet you posted? – Hemmer Aug 24 '11 at 17:49
@Hemmer: There are new arrays created by `np.where` and `a[second_mask]`. The size of those arrays depends on the number of True elements in `mask` and `second_mask` and is independent of the size of `grid`. – NPE Aug 24 '11 at 17:53
I have posted an alternative solution that I found in case you are interested. – Hemmer Aug 25 '11 at 14:28
just combine masks by `grid[mask & second_mask]`. – MasterControlProgram Nov 15 '16 at 19:48

Tyson · Answer 3 · 2022-07-22T07:03:25.820

2

I came across this old thread while trying to do something similar. There are a lot of interesting answers here, but I believe I have come up with a method that is simpler and more intuitive than anything provided here: use the first mask on itself to change the True values to True or False as needed.

Here it is, one simple line and then the mask can be used as desired:

mask[mask] = second_mask
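A minimal sketch of this idea follows. Note that the one-liner overwrites the first mask in place; the `combined` copy below is an assumption added here for cases where the first-level mask is still needed afterwards. The grid and `second_mask` values are likewise illustrative.

```python
import numpy as np

grid = np.arange(16, dtype=float).reshape(4, 4) / 16.0
mask = grid > 0.5  # 7 True entries

# Hypothetical second-level mask, one entry per True in mask
second_mask = np.array([True, False, True, False, True, False, True])

# Work on a copy so the first-level mask stays available
combined = mask.copy()
# Each True entry is replaced by the matching entry of second_mask
combined[combined] = second_mask

# combined is now a full-size mask usable directly on grid
grid[combined] = 100
```

After this, `combined` has the same shape as `grid` and is True only where both levels of selection agree.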

edited Jul 22 '22 at 07:03

answered Jul 22 '22 at 06:53


score 1 · Answer 4 · answered Aug 25 '11 at 14:28

Another possible solution, which I came up with after thinking about this a bit more, is to have the second mask retain the shape of the first (which may or may not be worth the memory hit) and selectively fill in the new elements:

#!/usr/bin/env python
import numpy as np

prob = 0.5    
grid = np.random.rand(4,4)

mask = grid > 0.5 
masklength = np.sum(mask)

# initialise an all-False mask with the same shape as grid
second_mask = np.zeros(grid.shape, dtype=bool)
# then selectively fill it in using the second criterion
second_mask[mask] = np.random.rand(masklength) < prob

# this now acts on the original object
grid[second_mask] = 100

Though this is a bit longer, it seems to read better (to my beginner eyes), and in my speed tests it ran in about the same time.
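For anyone wanting to repeat such a comparison, here is one possible timing sketch using `timeit`. The grid size, repeat count, seed, and the helper names `via_where` and `via_full_mask` are all assumptions made for illustration; results will vary by machine and NumPy version, so no timings are quoted.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)   # seed is an assumption, for repeatability
grid = rng.random((200, 200))
mask = grid > 0.5
sub = rng.random(int(mask.sum())) < 0.5  # second-level mask

def via_where():
    # Index-array approach: subset the indices returned by np.where
    g = grid.copy()
    g[tuple(a[sub] for a in np.where(mask))] = 100
    return g

def via_full_mask():
    # Full-size second mask approach
    g = grid.copy()
    second = np.zeros(g.shape, dtype=bool)
    second[mask] = sub
    g[second] = 100
    return g

t_where = timeit.timeit(via_where, number=50)
t_full = timeit.timeit(via_full_mask, number=50)
print(f"np.where: {t_where:.4f}s, full-size mask: {t_full:.4f}s")
```

Both helpers copy `grid` on each call so that every run starts from the same data; that copy is included in both timings equally.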

score -1 · Answer 5 · answered Mar 29 '14 at 11:52

In [28]: from numpy import linspace
In [29]: ar = linspace(1,10,10)
In [30]: ar[(3<ar)*(ar<8)]
Out[30]: array([ 4.,  5.,  6.,  7.])

answered Mar 29 '14 at 11:52

Adobe

