
Let's say I have a NumPy data array of length N and a bit mask array of length N.

data = [1,2,3,4,5,6,7,8,9,0]
mask = [0,1,0,1,0,1,0,1,0,1]

Is there a loopless NumPy way to create a new array from data that contains the entry data[i] if and only if mask[i] != 0? Like so:

func(data, mask) = [2,4,6,8,0]

Or equivalently in loop notation:

ans = []
for idx in range(len(mask)):
    if mask[idx]:
        ans.append(data[idx])
ans = numpy.array(ans)

Thanks!

seedship
  • Your question is even simpler than the linked duplicate, since you are only working in one dimension. The required code is... wait for it... `data[mask != 0]`. And yes, this kind of simplification is one of the core selling points of NumPy. Of course, you do need to start with NumPy arrays, rather than plain Python lists. – Karl Knechtel Mar 27 '21 at 23:06
  • You should check: [numpy doc with solution](https://numpy.org/doc/stable/reference/generated/numpy.ma.MaskedArray.tolist.html). In your case just do: `np.ma.array(data, mask=mask).data` (convert the lists to arrays first). – Memristor Mar 27 '21 at 23:10
  • @Memristor when I try `np.ma.array(data, mask=mask).data` I just get the original array. It's not really clear how the link you posted can be used to get the results the OP wants. – Mark Mar 27 '21 at 23:14
  • Yes, it seems `.data` gives you the original array. `.tolist()` gives you the masked list but doesn't *remove* the masked values. They're always filled with `fill_value` (either `None` or a custom value). – tdy Mar 27 '21 at 23:30
  • @MarkM sorry, you need to store the result of `x = np.ma.array(...)` and then use `x[~x.mask].data`, which drops the masked elements (note that this keeps the entries where the mask is 0). Another way is `where`: `np.where([1, 0, 1, 0, 1], [1, 2, 3, 4, 5], 0)`, which doesn't remove them but substitutes 0. – Memristor Mar 28 '21 at 19:49
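
For anyone following the masked-array suggestions in the comments, here is a minimal sketch of how that route could work. It assumes the `data` and `mask` values from the question; the names `masked` and `kept` are only illustrative. The main thing to watch is that `numpy.ma` uses the opposite convention from the question: a true mask entry hides the element.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
mask = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# In numpy.ma a True mask entry means "hide this element", so the
# question's mask (nonzero = keep) has to be inverted first.
masked = np.ma.array(data, mask=(mask == 0))

# .data still returns the full underlying array, masked entries included.
full = masked.data            # array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])

# .compressed() returns only the unmasked values as a plain 1-D ndarray.
kept = masked.compressed()    # array([2, 4, 6, 8, 0])
```

For this particular question, plain boolean indexing is shorter; the masked-array version is mostly useful if you want to keep the array's shape and hide values rather than drop them.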

1 Answer


You can filter NumPy arrays with an array of boolean values. Your mask is an array of integers, which you can't use directly, because NumPy would treat its values as positions to look up rather than as keep/drop flags. You can, however, convert the ones and zeros to booleans and then use the result as a mask:

import numpy as np

data = np.array([1,2,3,4,5,6,7,8,9,0])
mask = np.array([0,1,0,1,0,1,0,1,0,1])

data[mask.astype(bool)]
# array([2, 4, 6, 8, 0])
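
As a rough illustration of why the conversion matters, the sketch below contrasts indexing with the raw integer mask against two boolean-style alternatives. It reuses the same `data` and `mask`; the variable names are only for illustration.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
mask = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Integer indexing: every mask value is read as a position in data,
# so this picks data[0] and data[1] over and over.
as_positions = data[mask]                 # array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2])

# Comparing against zero yields a boolean array, which filters as intended.
by_comparison = data[mask != 0]           # array([2, 4, 6, 8, 0])

# Alternatively, turn the mask into the positions of its nonzero entries.
by_positions = data[np.flatnonzero(mask)] # array([2, 4, 6, 8, 0])
```

`mask != 0` and `mask.astype(bool)` give the same boolean array here, so which one to use is mostly a matter of taste.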
Mark