I have a number of numpy arrays a, b, c, ... that should all be trimmed according to a boolean mask array keep or rearranged according to an index array indices. Doing this for an individual array works fine via arr = arr[keep], but is tedious. I therefore want to do this for all arrays in a loop, but the following fails:

for arr in [a,b,c]:
    arr = arr[keep]
for arr in [a,b,c]:
    arr = arr[indices]

I noticed that indexing works if I do arr[:] = arr[indices], even if the shapes of arr and indices differ (as long as they agree along the first axis). But this does not work with masking, because the masked result generally has a different length than arr. So how can this be done generically (for either masking or indexing) with a minimum of copies?

For completeness, here is the test case

import numpy as np
a = np.random.random(5)
b = np.array([[1,-1],[2,-2],[3,-3],[4,-4],[4,-4]])

# first test with indexing (for sorting)
i = np.argsort(a)
B = b[i]  # for testing purposes
print(B)
for arr in [a,b]:
    arr = arr[i]
print(b)  # should match B

# second test with boolean (for masking)
k = a < 0.5
B = b[k]  # for testing purposes
print(B)
for arr in [a,b]:
    arr = arr[k]
print(b)  # should match B
Walter
  • That's a basic python iteration error. `for i in alist: i=3` does not change anything in the list. – hpaulj Nov 09 '22 at 21:25
  • create a *new list* and append the resulting new value. – juanpa.arrivillaga Nov 09 '22 at 22:37
  • @hpaulj Yes, but that is (was) not the issue (and I was obviously not fully aware of that). I have edited the question to avoid an explicit list in the example. My usage of the implicit list `[a,b,c]` means that this basic python iteration error will occur. So, such a usage must be avoided in any solution. – Walter Nov 10 '22 at 10:07

1 Answer

Based on this answer to a similar question, I have the following solution.

arrays = [a, b, c]         # in practice, this could be many more numpy arrays
for i, arr in enumerate(arrays):
    arrays[i] = arr[keep]  # rebind the list entry to the new array; the old array is left unmodified
a, b, c = arrays           # unpack the new arrays from the list

There are two crucial differences from my initial attempt. First, assigning to arrays[i] instead of to the loop variable arr actually changes the list entry (avoiding the common Python iteration error mentioned in the comments on the question). Second, declaring the list first, then altering it, and then unpacking it back into the original names finally makes the variables a, b, and c refer to the new arrays.
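
The same rebinding idea can be written more compactly with tuple unpacking. The following is only a minimal sketch: the helper name select_all and the sample data are made up for illustration and are not part of the original question.

```python
import numpy as np

def select_all(selector, *arrays):
    """Hypothetical helper: apply one mask or index array to every array.

    Returns new arrays; the inputs are left untouched.
    """
    return tuple(arr[selector] for arr in arrays)

# Illustrative data (fixed values instead of np.random for reproducibility)
a = np.array([0.9, 0.1, 0.7, 0.3, 0.5])
b = np.array([[1, -1], [2, -2], [3, -3], [4, -4], [5, -5]])

# Indexing: sort both arrays by the values in a
order = np.argsort(a)
a_sorted, b_sorted = select_all(order, a, b)

# Masking: keep only the rows where the sorted a is below 0.5
keep = a_sorted < 0.5
a_kept, b_kept = select_all(keep, a_sorted, b_sorted)
```

Because the results are unpacked into names on the left-hand side, the same call works for masks and index arrays alike; writing a, b = select_all(keep, a, b) would rebind the original names instead of introducing new ones.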

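Of course, for indexing, the solution

for arr in [a,b,c]:
    arr[:] = arr[indices]

is simpler, because it overwrites each array in place, so a, b, and c need not be rebound. Note that arr[indices] still creates a temporary copy before it is written back, and that this only works when the indexed result has the same shape as arr (as for a permutation).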
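
A small check of the in-place variant, again with made-up sample data: it keeps each name bound to the same array object, although the right-hand side arr[indices] is itself evaluated as a temporary copy first. The sketch also shows why the same trick does not carry over to masking when the mask changes the length.

```python
import numpy as np

a = np.array([0.9, 0.1, 0.7, 0.3, 0.5])
b = np.array([[1, -1], [2, -2], [3, -3], [4, -4], [5, -5]])
b_alias = b  # second name for the same array object

# Indexing in place: compute the order once, then overwrite each buffer
order = np.argsort(a)
for arr in (a, b):
    arr[:] = arr[order]  # temporary copy on the right, written back into arr

# Masking in place: the masked result has fewer rows than b,
# so the slice assignment cannot be broadcast and raises ValueError
keep = a < 0.5
try:
    b[:] = b[keep]
    mask_in_place_ok = True
except ValueError:
    mask_in_place_ok = False
```

After the loop, b_alias sees the reordered rows as well, since no name was rebound. One caveat: a mask that selects exactly one row would broadcast silently instead of raising, so the slice assignment should not be relied on to detect shape mismatches.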

Walter