I have two arrays:

a = np.array([[1, 2], [3, 4], [5, 6]])

b = np.array([[1, 1, 1, 3, 3],
              [1, 2, 4, 5, 9],
              [1, 2, 3, 4, 5]])

The expected output would match the shape of array 'a' and would be:

array([[True, False], [False, True], [True, False]])

The first dimension size of arrays a and b always matches (in this case 3).

For each row index (0 to 2, since both arrays have 3 rows here), I want to check whether each number in that row of 'a' exists in the corresponding row of 'b'.

I can solve this with the loop below, but I would like to vectorise it for a speed boost. After several hours on it, I cannot figure out how:

output = np.full(a.shape, False)
assert len(a) == len(b)
for i in range(len(a)):
    output[i] = np.isin(a[i], b[i])

Thank you for any guidance! Anything would be very appreciated :)

Zachy

1 Answer

Properly reshape the arrays so they can broadcast correctly while comparing:

(a[...,None] == b[:,None]).any(2)

#[[ True False]
# [False  True]
# [ True False]]
  • a[...,None] adds a new trailing axis, giving shape (3, 2, 1);
  • b[:,None] inserts a new axis in second position, giving shape (3, 1, 5);
  • when the two arrays are compared, both are broadcast to (3, 2, 5), so each element in a row of a is compared with each element in the corresponding row of b;
  • finally, .any(2) reduces over the last axis, checking whether each element of a has at least one match.
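
To make the intermediate shapes visible, here is a minimal sketch that splits the one-liner into steps and checks it against the loop from the question. The names a3, b3, matches, result and expected are illustrative only and do not appear in the original code.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[1, 1, 1, 3, 3],
              [1, 2, 4, 5, 9],
              [1, 2, 3, 4, 5]])

# Add a trailing axis to a and a middle axis to b so the rows line up.
a3 = a[..., None]   # shape (3, 2, 1)
b3 = b[:, None]     # shape (3, 1, 5)

# Broadcasting expands both operands to (3, 2, 5):
# matches[i, j, k] is True when a[i, j] == b[i, k].
matches = a3 == b3

# Reduce over the last axis: does a[i, j] appear anywhere in row b[i]?
result = matches.any(axis=2)

# Reference result from the row-by-row loop in the question.
expected = np.stack([np.isin(a[i], b[i]) for i in range(len(a))])

print(a3.shape, b3.shape, matches.shape)
print(np.array_equal(result, expected))
```

Note that the comparison materialises a boolean array of shape (rows, a.shape[1], b.shape[1]), so for very wide rows the memory cost may matter more than the loop overhead it removes.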
Psidom
    This worked perfectly, and it's a beautiful explanation, thanks! I think this has really helped me grasp slicing/indexing with the ellipsis and colon operators :) It's just funny how my PyCharm linter complains that I can't call any() on a bool object. Haha. It doesn't realise that NumPy returns an array and not a bool. – Zachy Jul 24 '21 at 19:09