
My overall objective is to check whether each row of a big array exists in a small array.

Testing membership with `in` on NumPy arrays sometimes gives false positives, whereas the same test on Python lists gives the correct result.

item = [1, 2]
small = [[0,2], [5, 0]]
item in small
# False

import numpy as np

item_array = np.array(item)
small_array = np.array(small)
item_array in small_array
# True

Why does `in` return a false positive when used with NumPy arrays?

For context, the following is my attempt to check membership of items from one array in another array:

big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]]) 
small_array = np.array([[0, 2], [5, 0]])

# false positive for last item
[row in small_array for row in big_array]
# [True, False, True, False, True]
outis
Zihao Wang
  • @Kulasangar It provides another solution. Many thanks! But I cannot understand why my method is wrong. Any ideas? – Zihao Wang Jan 02 '23 at 11:20
  • The implementation of `thing in arr` for NumPy arrays is basically `(thing == arr).any()`, which is broken nonsense for non-scalar `thing`. – user2357112 Jan 02 '23 at 11:24
  • See also "[How does `__contains__` work for ndarrays?](/q/18320624/90527)" – outis Jan 03 '23 at 01:09

1 Answer


Let's walk through the example: `np.array([1, 2]) in small_array`

NumPy compares the item with every row of the small array element by element. First it checks whether the 1 appears anywhere in the first column (index 0) of the small array. It does not. Then it checks whether the 2 appears anywhere in the second column (index 1). It does! Because at least one of these comparisons is True, the whole expression returns True.
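
The element-wise comparison can be made visible with a short sketch. It assumes, as the comment by user2357112 on the question says, that `in` on an array behaves roughly like `(thing == arr).any()`; the variable name `elementwise` is only illustrative.

```python
import numpy as np

small_array = np.array([[0, 2], [5, 0]])
item_array = np.array([1, 2])

# Broadcasting compares item_array with every row, element by element.
elementwise = item_array == small_array
print(elementwise)
# [[False  True]
#  [False False]]

# A single matching element anywhere is enough for .any() to be True,
# which is what makes `in` report a match here.
print(elementwise.any())          # True
print(item_array in small_array)  # True
```

Only the 2 in the row `[0, 2]` matches, yet that one hit is enough for the membership test to succeed.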

So `np.array([i, 2]) in small_array` returns True for any `i`, because the 2 in the second position always matches the row `[0, 2]`.
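
For the original goal of testing whole rows, one possible approach (a sketch, not the only way to do it) is to require that all elements of a row match before counting it as a member. The names `pairwise` and `row_in_small` are made up for this example; the arrays are the ones from the question.

```python
import numpy as np

big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]])
small_array = np.array([[0, 2], [5, 0]])

# Shape (5, 1, 2) compared with shape (2, 2) broadcasts to (5, 2, 2):
# every row of big_array is compared with every row of small_array.
pairwise = big_array[:, None, :] == small_array

# A row is a member only if all of its elements match some row of small_array.
row_in_small = pairwise.all(axis=-1).any(axis=-1)
print(row_in_small.tolist())  # [True, False, True, False, False]
```

The intermediate array has one entry per pair of rows per column, so this is best suited to cases where one of the two arrays is small, as it is here.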

T C Molenaar