How to find the index of items in list A that are also in list B

Question

I've got listA, which contains

[0, 20, 40, 60, 80, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340]

and listB, which contains

[87, 78343, 100, 38, 100, 20, 80]

I'd like to find the indexes of the numbers in listA that are also in listB.

For example, listA and listB share 100, 100, 20, and 80. The indexes of these integers in listA are

[6, 6, 1, 4, 5]

Is there a process that will find this for me so I don't have to do it by hand?

This is similar to this question. The difference is that I need every index, even when a value occurs multiple times in either list, while the answer at that link only returns the first occurrence: e.g. 80 in listB is at [4] and [5] in listA, but the method described there would only return [4].

What if a value shows up multiple times in both lists? For example, `[0, 1, 1, 2]` and `[1, 1, 3, 4]`? Should that be `[1, 1, 2, 2]`? — Rob Watts, Apr 24 '15 at 19:17

Padraic Cunningham · Answer 1 · 2015-04-24T21:25:14.423

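Create a dict to hold all indexes, including those of repeated elements, then use a list comprehension to collect the indexes of the common elements:

from collections import defaultdict

# A and B are listA and listB from the question
# map each value in A to the list of indexes where it occurs
d = defaultdict(list)

for i, ele in enumerate(A):
    d[ele].append(i)

# test membership before indexing so missing values don't add empty keys to d
print([idx for ele in B if ele in d for idx in d[ele]])
[6, 6, 1, 4, 5]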

If we add a few more 80s you can see it returns all appropriate indexes:

A = [0, 20, 40, 60, 80, 80, 100, 80, 120, 80, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 21]


B = [87, 78343, 100, 38, 100, 20, 80]

from collections import defaultdict

d = defaultdict(list)

for i, ele in enumerate(A):
    d[ele].append(i)

# test membership before indexing so missing values don't add empty keys to d
print([idx for ele in B if ele in d for idx in d[ele]])
[6, 6, 1, 4, 5, 7, 9]

For large lists this is fairly efficient: dict lookups are O(1) on average, so the cost of building the dict is quickly offset for any reasonably sized data, and the approach scales well.
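The dict-of-indexes approach above can be wrapped in a small function. This is only a sketch: the name `indexes_in_b_order` is made up for illustration, and it assumes the values are hashable and that results should follow the order of the second list.

```python
from collections import defaultdict


def indexes_in_b_order(a, b):
    """Return the indexes into `a` of every value that also occurs in `b`, in `b` order."""
    positions = defaultdict(list)
    for index, value in enumerate(a):
        positions[value].append(index)
    # One pass over `b`; the membership test keeps missing values out of the dict.
    return [index for value in b if value in positions for index in positions[value]]


listA = [0, 20, 40, 60, 80, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340]
listB = [87, 78343, 100, 38, 100, 20, 80]
print(indexes_in_b_order(listA, listB))  # [6, 6, 1, 4, 5]
```

Building the dict is one pass over `a` and the comprehension is one pass over `b`, so the work grows with the two lengths plus the number of indexes returned.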

One thing which is unclear is what should happen if you have duplicate entries in both lists, for instance:

A = [1, 2, 2, 3, 3, 4, 5] 
B = [3, 4, 3, 5]

becomes:

[3, 4, 5, 3, 4, 6]

where 3, 4 appears twice because 3 is repeated in both lists.

If that is the case, you could also keep a count of the elements in B:

from collections import defaultdict, Counter

d = defaultdict(list)
for i, ele in enumerate(A):
    d[ele].append(i)

cn = Counter(B)
l = []
for i in B:
    if i in d:
        val = d[i]
        l.extend(val.pop(0) if len(val) > 1 and cn[i] > 1 else ele for ele in val)
print(l)
[3, 5, 4, 6]

But then, if an item appears in B three times while it appears only twice in A, the last occurrence reuses the one index that is still left for it in A:

 A = [1, 2, 2, 3, 3, 4, 5]
 B = [3, 4, 3, 5, 3]
 [3, 5, 4, 6, 4]
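If the intent is to pair each occurrence in B with its own index in A, one possible alternative is to consume the indexes from a queue. This is a sketch of one interpretation, not the method above: `pair_occurrences` is a hypothetical name, and it assumes that extra occurrences in B are simply skipped once the indexes in A run out.

```python
from collections import defaultdict, deque


def pair_occurrences(a, b):
    """Match each occurrence in `b` with the next unused index of that value in `a`."""
    unused = defaultdict(deque)
    for index, value in enumerate(a):
        unused[value].append(index)
    result = []
    for value in b:
        # Assumption: once every index for a value is used, extra occurrences in `b` are skipped.
        if value in unused and unused[value]:
            result.append(unused[value].popleft())
    return result


A = [1, 2, 2, 3, 3, 4, 5]
print(pair_occurrences(A, [3, 4, 3, 5]))     # [3, 5, 4, 6]
print(pair_occurrences(A, [3, 4, 3, 5, 3]))  # [3, 5, 4, 6]
```

Unlike popping from a list while iterating over it, each index is handed out at most once, so the third 3 in B produces nothing instead of repeating index 4.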

score 1 · Accepted Answer · edited May 23 '17 at 12:28

This might be what you actually want:

wanted_indexes = [index for index, value in enumerate(listA) if value in listB]

For your example listA and listB, this will produce

[1, 4, 5, 6]

This gives you the index of all the items in listA that are also in listB. If you really do want duplicates then you could use this:

dups_included = [index for b_value in listB for index, a_value in enumerate(listA) if a_value == b_value]

This will produce the list that you gave as an example:

[6, 6, 1, 4, 5]

Boosting performance:

If you're worried about run time, there are some optimizations you can make to each of these. For the first one, create a set based on listB and use that:

setB = set(listB)
wanted_indexes = [index for index, value in enumerate(listA) if value in setB]

Look-ups are much faster in a set than they are in a list, so unless setB is quite small this should give you a performance boost.
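A quick way to check this on your own data is to time both versions. The sketch below uses randomly generated lists (an assumption, not the lists from the question), and the helper names `with_list` and `with_set` are made up for illustration. Timings depend on the machine, so none are shown.

```python
import random
import timeit

random.seed(0)
listA = [random.randrange(10_000) for _ in range(1_000)]
listB = [random.randrange(10_000) for _ in range(1_000)]


def with_list(a, b):
    # Each `in` test scans the list b from the start.
    return [index for index, value in enumerate(a) if value in b]


def with_set(a, b):
    set_b = set(b)  # built once, outside the comprehension
    return [index for index, value in enumerate(a) if value in set_b]


# Both versions must agree; only the lookup cost differs.
assert with_list(listA, listB) == with_set(listA, listB)

print("list:", timeit.timeit(lambda: with_list(listA, listB), number=5))
print("set: ", timeit.timeit(lambda: with_set(listA, listB), number=5))
```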

For the version with duplicates, you'd want to create a dictionary that maps each value in listA to a list of the indexes at which it appears. Then when you're iterating through listB you could use this lookup table instead of iterating through listA to get the indexes. This is exactly what Padraic did in his answer.

answered Apr 24 '15 at 19:12

Rob Watts


I like this solution, but it would be more efficient if you did `in set(listB)` instead of just `in listB`. – Shashank Apr 24 '15 at 20:05
@Shashank You can't do it quite that simply - using `in set(listB)` would create a new set each time. – Rob Watts Apr 24 '15 at 20:08
That is a valid point, but making the code 2 lines to make an O(n*m) computation O(n+m) is a worthwhile tradeoff in my opinion. So why not just store the set in a variable? – Shashank Apr 24 '15 at 20:10
@Shashank I was going for nice clean one-liners. I did just add some notes about boosting performance in case OP (or someone who sees this later) is interested. – Rob Watts Apr 24 '15 at 20:22
Thanks, I've upvoted your answer now. It's all about tradeoffs in the end. Obviously premature optimization is not desired, but when the optimization involves only one measly extra line of code, I think it should always be favored over a zippy one-liner. Just my opinion, however. – Shashank Apr 24 '15 at 20:27

Kousik · Answer 3 · 2015-04-24T19:56:47.180

Short answer:

>>> from functools import reduce  # reduce is not a builtin on Python 3
>>> reduce(lambda x, y: x + y, [[index for index, value in enumerate(listA) if item == value] for item in listB if item in listA], [])
[6, 6, 1, 4, 5]

Long answer:

>>> def get_common_items_index(listA, listB):
...     result = []
...     common_items = [item for item in listB if item in listA]
...     for each_item in common_items:
...         for index, value in enumerate(listA):
...             if value == each_item:
...                 result.append(index)
...     return result

>>> get_common_items_index(listA,listB)
[6, 6, 1, 4, 5]

score -1 · Answer 4 · edited May 23 '17 at 10:26

Almost the same as vguzmanp's answer:

r=[]
for i in range(len(listA)):
    for _ in range(listB.count(listA[i])):
        r.append(i)
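The loop above calls listB.count once per element of listA, which rescans listB every time. A possible variant, sketched here with collections.Counter rather than taken from the answer, counts listB once up front. Like the loop, it returns indexes in listA order, not listB order.

```python
from collections import Counter

listA = [0, 20, 40, 60, 80, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340]
listB = [87, 78343, 100, 38, 100, 20, 80]

counts = Counter(listB)  # built once; values missing from listB count as 0
# Repeat each index of listA as many times as its value occurs in listB.
r = [i for i, value in enumerate(listA) for _ in range(counts[value])]
print(r)  # [1, 4, 5, 6, 6]
```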

answered Apr 24 '15 at 19:43

Costas
