Suppose the 2-dimensional array is as below:

In [1]: a = [[1, 2], [3, 4], [5, 6], [1, 2], [7, 8]]
        a = np.array(a)
        a, type(a)
Out [1]: (array([[1, 2],
                 [3, 4],
                 [5, 6],
                 [1, 2],
                 [7, 8]]),
         numpy.ndarray)

I want to remove every row equal to [1, 2]. I have tried the following:

In [2]: a = a[a != [1, 2]]
        a = np.reshape(a, (a.size // 2, 2))  # needed because the boolean indexing on the first line flattens the result to 1-D, [3, 4, 5, 6, 7, 8], while the initial array is 2-dimensional
        a
Out[2]: array([[3, 4],
               [5, 6],
               [7, 8]])

My question is, is there any function in NumPy that can directly do that?


Updated Question

Here's the semi-full source code that I've been working on:

import numpy as np
import pandas as pd
from sklearn import datasets
data = datasets.load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['Target'] = pd.DataFrame(data.target)

bucket = df[df['Target'] == 0]
bucket = bucket.iloc[:,[0,1]].values
lp, rp = leftestRightest(bucket)
bucket = np.array([x for x in bucket if list(x) != lp])
bucket = np.array([x for x in bucket if list(x) != rp])

Notes:

leftestRightest(arg) is a function that takes a 2-dimensional NumPy array and returns two one-dimensional NumPy arrays of size 2 (lp and rp). For instance, lp = [1, 3] and rp = [2, 4].

sempraEdic
  • What is `list(bucket) != lp`? If `lp` is an array, then this too is an array. You can't use that in the `if` clause. Why the `list(bucket)`? `bucket` is a `values`, an array. I assume `lp` is too. This may well be a case where applying `all` or `any` on an axis of `bucket != lp` works. But you need to pay attention to array shape and dtype. – hpaulj Feb 27 '22 at 17:55

3 Answers

1

There should be a more elegant approach, but here is what I have come up with:

np.array([x for x in a if list(x) != [1,2]])

Output

[[3, 4], [5, 6], [7, 8]]

Note that I wouldn't recommend using list comprehensions on large arrays, since they would be very time-consuming.
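
To check the performance concern, a rough timing sketch could compare the list comprehension with a boolean mask. The array size, the random data, and the helper names `with_comprehension` and `with_mask` are assumptions made for illustration, and actual timings depend on the machine.

```python
import timeit

import numpy as np

# Hypothetical test data: 20,000 rows of small integers, so [1, 2] occurs many times.
rng = np.random.default_rng(0)
big = rng.integers(0, 10, size=(20_000, 2))
target = [1, 2]

def with_comprehension(arr):
    # Rebuilds the whole array row by row in Python.
    return np.array([x for x in arr if list(x) != target])

def with_mask(arr):
    # Keeps a row when at least one of its elements differs from the target.
    return arr[(arr != target).any(axis=1)]

# Both approaches should select the same rows.
assert np.array_equal(with_comprehension(big), with_mask(big))

print("comprehension:", timeit.timeit(lambda: with_comprehension(big), number=3))
print("mask:         ", timeit.timeit(lambda: with_mask(big), number=3))
```

The mask version stays inside NumPy for both the comparison and the selection, which is the usual reason it tends to be faster on large inputs.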

TheFaultInOurStars
  • 1
    This is not a good approach for numpy. – mozway Feb 27 '22 at 16:50
  • @mozway Thank you for the note. Would you mind explaining more? Referring to this [link](https://stackoverflow.com/questions/35215161/most-efficient-way-to-map-function-over-numpy-array/46470401#46470401), this shouldn't be the worst. – TheFaultInOurStars Feb 27 '22 at 16:54
  • 1
    You are reconstructing the full array instead of slicing. Try to run some timing on a large array, this should be slower. – mozway Feb 27 '22 at 17:08
  • I have tried this in my case (the explanation has been updated in the question). But it returns an error message. How to solve that? – sempraEdic Feb 27 '22 at 17:27
  • @sempraEdic Which line does the error appear in? Can you point to a part of the line(in case the number of the line would not be of interest here)? – TheFaultInOurStars Feb 27 '22 at 17:31
  • I have marked it with an arrow and a comment – sempraEdic Feb 27 '22 at 17:33
  • 1
    @sempraEdic Not sure, but try editing your code from `np.array([x for x in bucket if list(bucket) != lp])` to `np.array([x for x in bucket if list(x) != lp])` – TheFaultInOurStars Feb 27 '22 at 17:37
  • It seems to return the same ValueError – sempraEdic Feb 27 '22 at 17:42
  • @sempraEdic It's unfortunate because I am not sure of the main reason for the error. You can check it by printing the values of `bucket`, `lp`, and `rp`. I have checked it by generating a dataframe and following your code and it works fine. Note that your two last lines in the updated code are the same. – TheFaultInOurStars Feb 27 '22 at 17:53
  • Oh, I found my mistake. My lp and rp are NumPy arrays, while the lp used in the comparison should be a list. Thanks a lot for your help – sempraEdic Feb 27 '22 at 18:03
  • @sempraEdic Glad to hear that and you are most welcome. – TheFaultInOurStars Feb 27 '22 at 18:06
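
The `ValueError` discussed in these comments can be reproduced with a small sketch. The values of `bucket` and `lp` below are made up for illustration; only their types matter. It shows why comparing a list with a NumPy array fails inside an `if`, and how converting `lp` with `tolist()` avoids it.

```python
import numpy as np

# Hypothetical stand-ins for the question's variables.
bucket = np.array([[5.1, 3.5], [4.3, 3.0], [4.6, 3.1]])
lp = np.array([4.3, 3.0])  # a NumPy array, like the values returned by leftestRightest

# list != ndarray is evaluated elementwise, so the result is an array, not a bool.
elementwise = list(bucket[0]) != lp
print(type(elementwise), elementwise)

# Using that array as an `if` condition raises ValueError (ambiguous truth value).
try:
    np.array([x for x in bucket if list(x) != lp])
    raised = False
except ValueError:
    raised = True
print("ValueError raised:", raised)

# Converting lp to a plain list restores an ordinary list-to-list comparison.
filtered = np.array([x for x in bucket if list(x) != lp.tolist()])
print(filtered)
```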
0

Your approach is correct, but the mask needs to be one-dimensional:

a[(a != [1, 2]).all(-1)]

Output:

array([[3, 4],
       [5, 6],
       [7, 8]])

Alternatively, you can collect the elements and infer the dimension with -1:

a[a != [1, 2]].reshape(-1, 2)
Kevin
  • I have tried this too, but it deletes a row that shouldn't be deleted. For instance, a = [[5.1, 3.5], [4.9, 3. ], [4.6, 3.1]]. And let variable lp = [4.3, 3.]. When I run a[(a != lp).all(-1)], it returns [[5.1, 3.5], [4.6, 3.1]] (with a[1] being deleted). – sempraEdic Feb 27 '22 at 17:51
  • And for the larger case, it seems any array in that 2-dimensional array which has element 4.3 or 3. is deleted. How to handle that? – sempraEdic Feb 27 '22 at 17:51
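
The behaviour described in these comments follows from `.all(-1)`, which keeps a row only when every element differs from the target, so a row that shares a single value with `lp` is dropped as well. A minimal sketch of one possible fix, reusing the values from the comment, is to keep a row when any element differs. The `np.isclose` variant is an added suggestion for floating-point data, not something taken from the answer.

```python
import numpy as np

a = np.array([[5.1, 3.5], [4.9, 3.0], [4.6, 3.1]])
lp = np.array([4.3, 3.0])

# .all(-1): row [4.9, 3.0] is dropped because its second element equals lp[1].
dropped_partial = a[(a != lp).all(-1)]
print(dropped_partial)

# .any(-1): a row is dropped only when both of its elements equal lp.
kept = a[(a != lp).any(-1)]
print(kept)

# Tolerance-based variant for floats: drop rows that are close to lp in every column.
kept_close = a[~np.isclose(a, lp).all(axis=1)]
print(kept_close)
```

With these values no row equals `lp` exactly, so the `.any(-1)` version returns all three rows.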
0

The boolean condition creates a 2D array of True/False values. You have to apply an AND operation across the columns to make sure the match is not a partial match. Consider a row [5, 2] in your array above: the script you wrote will add 5 and ignore 2 in the resulting 1D array. It can be done as follows:

a[~np.all(a == [1, 2], axis=1)]

Usman