Why does the `in` operator return a false positive when used on numpy arrays?

Question

My overall objective is to check whether each row of a big array exists in a small array.

Testing membership with `in` on numpy arrays sometimes gives false positives, whereas it returns the correct result for Python lists.

item = [1, 2]
small = [[0,2], [5, 0]]
item in small
# False

import numpy as np

item_array = np.array(item)
small_array = np.array(small)
item_array in small_array
# True

Why does `in` return a false positive when using numpy arrays?

For context, the following is my attempt to check membership of items from one array in another array:

big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]]) 
small_array = np.array([[0, 2], [5, 0]])

# false positive for last item
[row in small_array for row in big_array]
# [True, False, True, False, True]

@Kulasangar It provides another solution. Many thanks! But I cannot understand why my method is wrong. Any ideas? — Zihao Wang, Jan 02 '23 at 11:20
The implementation of `thing in arr` for NumPy arrays is basically `(thing == arr).any()`, which is broken nonsense for non-scalar `thing`. — user2357112, Jan 02 '23 at 11:24
See also "[How does `__contains__` work for ndarrays?](/q/18320624/90527)" — outis, Jan 03 '23 at 01:09

score 3 · Accepted Answer · answered Jan 02 '23 at 11:24

Let's walk through the example: np.array([1, 2]) in small_array

NumPy compares position by position. It checks whether 1 appears anywhere in the first column (index 0) of small_array. It does not. Then it checks whether 2 appears anywhere in the second column (index 1). It does! Because at least one of the two checks is True, the whole expression returns True.
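The walk-through above can be reproduced as a minimal sketch, assuming (as the comment under the question says) that `in` on an array behaves like `(thing == arr).any()`. The name `elementwise` is only illustrative.

```python
import numpy as np

item_array = np.array([1, 2])
small_array = np.array([[0, 2], [5, 0]])

# Broadcasting compares item_array against every row, position by position.
elementwise = item_array == small_array
print(elementwise)
# [[False  True]
#  [False False]]

# `in` is True as soon as any single position matches.
print(elementwise.any())          # True
print(item_array in small_array)  # True

# A whole-row match requires every position in some row to agree.
print(elementwise.all(axis=1).any())  # False
```

The single True in the second column of the first row is what makes `in` report a match, even though no complete row equals [1, 2].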

So np.array([i, 2]) in small_array will always return True for any i.
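For the original goal of checking each row of the big array against the small one, one possible approach (a sketch, not the only way) is to compare all pairs of rows with broadcasting and require a full-row match. The name `matches` is only illustrative.

```python
import numpy as np

big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]])
small_array = np.array([[0, 2], [5, 0]])

# Shapes (5, 1, 2) and (1, 2, 2) broadcast to (5, 2, 2):
# every row of big_array is compared with every row of small_array.
pairwise = big_array[:, None, :] == small_array[None, :, :]

# all(axis=-1): the two rows agree in every position.
# any(axis=-1): the row of big_array equals at least one row of small_array.
matches = pairwise.all(axis=-1).any(axis=-1)
print(matches.tolist())
# [True, False, True, False, False]
```

This builds an intermediate array of size len(big) x len(small) x row length, so it is best suited to cases where the small array really is small.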
