Python - Obtain indices of intersecting values in two arrays

Question

If I have two arrays:

A=[1,2,3,4,5,6,7]  
B=[2,4,7]

I would like to obtain an array C that contains the indices in A of the values of B that are also found in A:

C=[1,3,6]

I'm quite new to Python, and I'm frustrated that I can't find an elegant solution to such a simple task without resorting to a loop combined with numpy.where().

Thanks in advance!!

In your example, every value in B is also present in A. Will that always be the case? — kaya3, Mar 09 '20 at 12:01
What should the output be if a `B` element is not in `A`? e.g. `B = [2,44,7]`? Also, what if it's present in multiple locations in `A`? Or are `A` entries unique? — tzaman, Mar 09 '20 at 12:01

score 4 · Accepted Answer · answered Mar 09 '20 at 12:04

Here's a linear-time solution: to test membership in B efficiently, convert B to a set first.

B_set = set(B)
C = [i for i, x in enumerate(A) if x in B_set]

For large inputs, this is better than using .index in a loop, since that repeatedly searches the list and takes O(mn) time, where m and n are the sizes of A and B respectively. In comparison, the solution above takes O(m + n) time to build the set and then the result list.
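
As a minimal sketch of how the set-based approach behaves on less tidy inputs than the ones in the question, the example below uses made-up lists in which A contains a repeated value and B contains a value that is missing from A. The helper name `indices_in` is hypothetical and only wraps the two lines above.

```python
def indices_in(A, B):
    """Return the positions in A whose values also occur in B."""
    B_set = set(B)  # built once; membership tests are O(1) on average
    return [i for i, x in enumerate(A) if x in B_set]


# Illustrative data (not from the question): 2 repeats in A, 44 is absent from A.
A = [5, 2, 9, 2, 7]
B = [2, 44, 7]
C = indices_in(A, B)
print(C)  # [1, 3, 4]
```

Every position of a repeated value is reported, and values of B that never occur in A are silently ignored.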

kaya3

This is better than Guy's approach, as `.index` only returns the index of the first occurrence of the number. If you have duplicates and want linear time, this solution is better. +1 – Ch3steR Mar 09 '20 at 12:14

score 2 · Answer 2 · answered Mar 09 '20 at 12:06

You can use np.isin and np.nonzero.

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])
b = np.array([2, 4, 7])
c = np.nonzero(np.isin(a, b))[0]  # positions in a whose values appear in b
# array([1, 3, 6], dtype=int64)
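
To make the two steps visible separately, here is a small sketch on made-up arrays (not the ones from the question) where `a` has a repeated value and `b` has a value that is absent from `a`. It shows the boolean mask that `np.isin` produces and the positions `np.nonzero` extracts from it.

```python
import numpy as np

# Illustrative data: 2 repeats in a, 44 does not occur in a.
a = np.array([5, 2, 9, 2, 7])
b = np.array([2, 44, 7])

mask = np.isin(a, b)     # boolean array, one entry per element of a
c = np.nonzero(mask)[0]  # positions where the mask is True

print(mask.tolist())  # [False, True, False, True, True]
print(c.tolist())     # [1, 3, 4]
```

As with the set-based list comprehension, all positions of a repeated value are returned, in the order they appear in `a`.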

Ch3steR

Do you know the time complexity of `np.isin` for this? I don't see it in the documentation. – Kelly Bundy Mar 09 '20 at 12:31
@HeapOverflow I found this on Numpy documentation. *isin(a, b) is roughly equivalent to np.array([item in b for item in a])* It's equivalent to `O(m*n)`. – Ch3steR Mar 09 '20 at 12:39
I wouldn't assume from that part of the docs that it's O(mn) - often docs will give an "equivalent" expression to explain what the behaviour/output is, but that doesn't mean it's equivalent in implementation. – kaya3 Mar 09 '20 at 12:55
Well, it says "roughly", and "equivalent" might just refer to the result, not to the efficiency. – Kelly Bundy Mar 09 '20 at 12:55
From the looks of it, there is [an open pull request](https://github.com/numpy/numpy/pull/12065) to improve the performance of `np.isin` by using a bit array to test membership, though that doesn't necessarily imply the current performance is asymptotically worse. – kaya3 Mar 09 '20 at 13:03
@kaya3 Frustrating to see them have such a long discussion but never talk about complexity (as far as I saw). Anyway, did [a test](https://repl.it/repls/GracefulNoisyScreencast) that says it would be 160 billion comparisons per second, which can't be true. – Kelly Bundy Mar 09 '20 at 13:20
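
One way to probe the question raised in these comments empirically is to time `np.isin` at a few input sizes and see how the run time scales. The sketch below is only an illustration: the sizes, the random integer data and the helper name `time_isin` are all assumptions, and the measured numbers depend on the machine and the NumPy version, so no output is shown.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)


def time_isin(size, repeats=3):
    """Best-of-`repeats` wall time of np.isin on two integer arrays of `size` elements."""
    a = rng.integers(0, 10 * size, size)
    b = rng.integers(0, 10 * size, size)
    return min(timeit.repeat(lambda: np.isin(a, b), number=1, repeat=repeats))


# Hypothetical sizes, chosen only to keep the run short.
timings = {size: time_isin(size) for size in (1_000, 10_000, 100_000)}
for size, seconds in timings.items():
    print(f"m = n = {size:>7}: {seconds:.6f} s")
```

If the cost really grew like O(mn), each tenfold increase in both sizes would be expected to take roughly a hundred times longer; markedly slower growth would suggest the implementation is not a naive pairwise comparison.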

nulladdr · Answer 3 · 2021-11-16T17:11:29.197

2

NumPy has a dedicated function for this, intersect1d. Passing True for its return_indices argument also returns the indices of the common values in each input array.

import numpy as np
a = np.array([1,2,3,4,5,6,7])
b = np.array([2,4,7])
c = np.intersect1d(a, b, return_indices=True)[1]
# array([1, 3, 6], dtype=int64)
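
One thing worth knowing before relying on this: with `return_indices=True`, `np.intersect1d` returns the sorted common values together with the index of the first occurrence of each one in either input. The sketch below, on made-up arrays with a repeated value, contrasts that with the `np.isin` approach from the other answer.

```python
import numpy as np

# Illustrative data: 2 appears twice in a, and a is not sorted.
a = np.array([7, 2, 4, 2])
b = np.array([2, 7])

common, a_idx, b_idx = np.intersect1d(a, b, return_indices=True)
print(common.tolist())  # [2, 7]  (sorted common values)
print(a_idx.tolist())   # [1, 0]  (first occurrence of 2, then of 7, in a)

# np.isin reports every matching position, in the order of a.
print(np.nonzero(np.isin(a, b))[0].tolist())  # [0, 1, 3]
```

So the two approaches agree on the question's data, where A is sorted and has no duplicates, but can differ otherwise.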

edited Nov 16 '21 at 17:11

answered Mar 09 '20 at 12:10

nulladdr

I thought your answer was faster than mine. What do you think the reason is? – Ch3steR Mar 09 '20 at 12:47

Guy · Answer 4 · 2020-03-09T12:04:03.227

1

You can iterate over B and call index() on A for each value:

c = [A.index(i) for i in B]

As per @kaya3's comment, you can add a check that each value of B is present in A, in case B can contain values that are not in A:

c = [A.index(i) for i in B if i in A]
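
Note that this version returns one index per value of B, in the order of B, and that each `A.index(i)` and `i in A` call scans A from the start. As a hedged sketch, the same result can be obtained with a single pass over A by recording the first position of each value in a dictionary. The helper name `first_indices` and the sample data are made up for illustration.

```python
def first_indices(A, B):
    """For each value of B that occurs in A, return the index of its first occurrence in A."""
    first_pos = {}
    for i, x in enumerate(A):
        first_pos.setdefault(x, i)  # keep only the first position seen
    return [first_pos[x] for x in B if x in first_pos]


# Illustrative data: 2 repeats in A, 44 is absent from A, B is not in A's order.
A = [5, 2, 9, 2, 7]
B = [7, 44, 2]
print(first_indices(A, B))                # [4, 1]
print([A.index(i) for i in B if i in A])  # [4, 1]
```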

edited Mar 09 '20 at 12:04

answered Mar 09 '20 at 11:59

Guy

score 0 · Answer 5 · answered Mar 09 '20 at 12:03

You can use the index function.

A = [1, 2, 3, 4, 5, 6, 7]
B = [2, 4, 7]
# List comprehension: index in A of each value of B
C = [A.index(b) for b in B]
# Equivalent, using map
C = list(map(lambda b: A.index(b), B))

cmosig
