0

If I have two arrays:

A=[1,2,3,4,5,6,7]  
B=[2,4,7]  

I would like to obtain an array C that contains the indices in A of the values of B that are also found in A:

C=[1,3,6]   

I'm quite new to Python, and I'm frustrated at not being able to find an elegant solution to such a simple task without using a loop combined with numpy.where().

Thanks in advance!!

Ch3steR
X.Morales
  • 1
    In your example, every value in B is also present in A. Will that always be the case? – kaya3 Mar 09 '20 at 12:01
  • 1
    What should the output be if a `B` element is not in `A`? e.g. `B = [2,44,7]`? Also, what if it's present in multiple locations in `A`? Or are `A` entries unique? – tzaman Mar 09 '20 at 12:01

5 Answers

4

Here's a linear-time solution: to efficiently test whether an element of A is in B, convert B to a set first.

B_set = set(B)
C = [i for i, x in enumerate(A) if x in B_set]

For large inputs, this is better than using .index in a loop, since that requires repeatedly searching the list and takes O(mn) time, where m and n are the sizes of A and B. In comparison, the solution above takes O(m + n) time to convert B to a set and then build the result list.
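
As a rough sketch of how this behaves on the cases raised in the comments (duplicates in A, values of B that are missing from A), the same idea can be wrapped in a function. The name `indices_of_members` is only illustrative, and the approach assumes the elements are hashable.

```python
def indices_of_members(a, b):
    """Return every index i of `a` whose value a[i] also occurs in `b`."""
    b_set = set(b)  # built once; membership tests are O(1) on average
    return [i for i, x in enumerate(a) if x in b_set]


A = [1, 2, 3, 4, 5, 6, 7, 2]  # 2 appears twice
B = [2, 44, 7]                # 44 is not in A

C = indices_of_members(A, B)
print(C)  # [1, 6, 7]
```

Every occurrence of a matching value is reported, in the order of A, and values of B that are absent from A are simply ignored.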

kaya3
  • This is better than Guy's approach, as `.index` only returns the index of the first occurrence of the number. If you have duplicates and want linear time, this solution is better. +1 – Ch3steR Mar 09 '20 at 12:14
2

You can use np.isin and np.nonzero.

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])
b = np.array([2, 4, 7])
c = np.nonzero(np.isin(a, b))[0]  # indices of a whose values appear in b
# array([1, 3, 6], dtype=int64)
Ch3steR
  • Do you know the time complexity of `np.isin` for this? I don't see it in the documentation. – Kelly Bundy Mar 09 '20 at 12:31
  • @HeapOverflow I found this on Numpy documentation. *isin(a, b) is roughly equivalent to np.array([item in b for item in a])* It's equivalent to `O(m*n)`. – Ch3steR Mar 09 '20 at 12:39
  • 2
    I wouldn't assume from that part of the docs that it's O(mn) - often docs will give an "equivalent" expression to explain what the behaviour/output is, but that doesn't mean it's equivalent in implementation. – kaya3 Mar 09 '20 at 12:55
  • Well, it says "roughly", and "equivalent" might just refer to the result, not to the efficiency. – Kelly Bundy Mar 09 '20 at 12:55
  • From the looks of it, there is [an open pull request](https://github.com/numpy/numpy/pull/12065) to improve the performance of `np.isin` by using a bit array to test membership, though that doesn't necessarily imply the current performance is asymptotically worse. – kaya3 Mar 09 '20 at 13:03
  • 1
    @kaya3 Frustrating to see them have such a long discussion but never talk about complexity (as far as I saw). Anyway, did [a test](https://repl.it/repls/GracefulNoisyScreencast) that says it would be 160 billion comparisons per second, which can't be true. – Kelly Bundy Mar 09 '20 at 13:20
2

NumPy has a dedicated function for this, intersect1d. Passing return_indices=True makes it also return the indices of the intersection in each input array.

import numpy as np
a = np.array([1,2,3,4,5,6,7])
b = np.array([2,4,7])
c = np.intersect1d(a, b, return_indices=True)[1]
# array([1, 3, 6], dtype=int64)
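
One thing to be aware of: with return_indices=True, intersect1d reports the index of the first occurrence of each common value, ordered by the sorted common values rather than by position in a. The inputs below are made up to show the difference from the np.isin approach; this is a small sketch, not a benchmark.

```python
import numpy as np

a = np.array([7, 2, 4, 2])  # unsorted, and 2 appears twice
b = np.array([2, 7])

# common values (sorted), plus first-occurrence indices into a and into b
common, idx_a, idx_b = np.intersect1d(a, b, return_indices=True)
print(common)  # [2 7]
print(idx_a)   # [1 0]

# np.isin keeps every matching position, in the order of a
idx_all = np.nonzero(np.isin(a, b))[0]
print(idx_all)  # [0 1 3]
```

If A has unique, sorted values (as in the question), both approaches give the same indices.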
nulladdr
1

You can iterate over B and call index() on A for each value:

c = [A.index(i) for i in B]

As per @kaya3's comment, if B can contain values that are not in A, you can add a check that each value of B is present in A:

c = [A.index(i) for i in B if i in A]
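
Note that both `A.index(i)` and `i in A` scan the list, so the comprehension above does work proportional to len(A) * len(B). If that matters, one possible variant (a sketch; `first_indices` is a made-up name, and the values are assumed to be hashable) builds a dictionary of first positions once and then looks each value up:

```python
def first_indices(a, b):
    """For each value of `b` found in `a`, return the index of its first
    occurrence in `a`, in the order the values appear in `b`."""
    first_pos = {}
    for i, x in enumerate(a):
        first_pos.setdefault(x, i)  # keep only the first occurrence
    return [first_pos[x] for x in b if x in first_pos]


A = [1, 2, 3, 4, 5, 6, 7]
B = [7, 2, 44, 4]  # 44 is not in A

print(first_indices(A, B))  # [6, 1, 3]
```

The result matches the comprehension with the `if i in A` check: values missing from A are skipped, and the output follows the order of B.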
Guy
0

You can use the index function.

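A = [1,2,3,4,5,6,7]
B = [2,4,7]
# Index in A of each value of B (raises ValueError if a value is not in A)
C = [A.index(b) for b in B]
# Equivalent, using map:
C = list(map(lambda b: A.index(b), B))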
cmosig