I have the following np array structure:

[[1, 2, 3 ,4]
[5, 7, 8 ,6]
     .
     .
[7, 5, 1 ,0]]

What I want to do is remove a subarray if thresholds are not met. For example, I want to delete [5, 7, 8 ,6] if position 0 is not between 2 and 4. I want to do this for the whole numpy array, and I intend to have a threshold on all positions in the subarray.

My thought process is something that is shown below:

for arr in data:
    if arr[0] < 2 or arr[0] > 4:
        np.delete(data, arr)

However, printing data.shape before and after shows no difference. Can someone help me?

Thanks!

  • To understand why the `np.delete` approach doesn't work, first read the documentation (you will notice that it creates a *new* array, which you subsequently ignore). However, fixing that would still leave you with a problem because you would be trying to modify a sequence while iterating over it. – Karl Knechtel Apr 16 '21 at 10:25
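As a minimal sketch of the point in the comment above: `np.delete` returns a new array and expects row indices rather than the row itself, so one option is to collect the indices of the offending rows first and delete them in a single call, with no loop over the array being modified. The sample data and the names `bad_rows` and `result` are illustrative assumptions; the last row was changed from the question's so that one row survives the 2-to-4 bounds.

```python
import numpy as np

# Illustrative data (not the asker's full array)
data = np.array([
    [1, 2, 3, 4],
    [5, 7, 8, 6],
    [3, 5, 1, 0],
])

first = data[:, 0]

# Indices of rows whose first value falls outside [2, 4]
bad_rows = np.where((first < 2) | (first > 4))[0]

# np.delete does not modify `data`; it returns a new array,
# so the result has to be kept (or assigned back to `data`)
result = np.delete(data, bad_rows, axis=0)

print(data.shape)    # original is unchanged
print(result.shape)  # rows removed only in the returned copy
```

Assigning the result back (`data = np.delete(data, bad_rows, axis=0)`) gives the effect the question was after.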

1 Answer

Creating example data for testing:

>>> import numpy as np
>>> data = np.array([
... [1,2,3,4],
... [5,7,8,9],
... [7,5,1,0]
... ])

You can slice the array to get the first column:

>>> data[:, 0]
array([1, 5, 7])

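Figure out which of these first-column values are in range by broadcasting the comparison operators across them. The comparisons can't be chained (`4 <= first <= 6`), and they must be combined with the bitwise `&` rather than the logical `and`, because both chaining and `and` need a single truth value for the whole array, which NumPy refuses to give for an array with more than one element. The bounds 4 and 6 are used here so that one row of the example data passes; substitute your own thresholds.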

>>> first = data[:, 0]
>>> (4 <= first) & (first <= 6)
array([False,  True, False])
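To illustrate the restriction on chaining, here is a small sketch (the flag name `chained_ok` is just for demonstration). A chained comparison on an array raises a `ValueError`, while the parenthesised `&` form produces the element-wise mask:

```python
import numpy as np

first = np.array([1, 5, 7])

# A chained comparison is evaluated like `(4 <= first) and (first <= 6)`,
# which asks for the truth value of a whole array
try:
    4 <= first <= 6
    chained_ok = True
except ValueError as exc:
    chained_ok = False
    print(exc)

# The element-wise version: parenthesise each comparison and join with &
mask = (4 <= first) & (first <= 6)
print(mask)
```

The parentheses matter because `&` binds more tightly than the comparison operators.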

Finally, we can use that to mask the original array:

>>> data[(4 <= first) & (first <= 6)]
array([[5, 7, 8, 9]])
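The question also mentions wanting a threshold on every position of each subarray. One possible way to extend the same masking idea, as a sketch only: broadcast per-column bounds against the whole array and keep the rows where every column passes. The `lower` and `upper` arrays below are hypothetical values chosen for illustration, not thresholds from the question.

```python
import numpy as np

data = np.array([
    [1, 2, 3, 4],
    [5, 7, 8, 9],
    [7, 5, 1, 0],
])

# Hypothetical inclusive bounds, one entry per column
lower = np.array([4, 5, 0, 0])
upper = np.array([8, 9, 8, 9])

# Broadcasting compares each row against the bounds column by column,
# giving a boolean array with the same shape as `data`
in_range = (lower <= data) & (data <= upper)

# Keep a row only if all of its columns are within bounds
keep = in_range.all(axis=1)
filtered = data[keep]

print(filtered)
```

With these bounds the first row is dropped because its first value, 1, is below the lower bound of 4; the other two rows are kept.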
Karl Knechtel