0

This answer works very well for finding the indices of items from one list in another list, but it reports each index only once. I would like my list of indices to have the same length as the list of values I am searching for. Here is an example:

thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]

With this solution, ilist = [1,2,4], but what I want is ilist = [1,2,1,4], so that len(ilist) == len(Mylist). The solution skips any index that has already been found, so when items repeat in Mylist, the duplicates are missing from the result.

durbachit

4 Answers

2
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]

print(ilist)  # [1, 2, 1, 4]

Basically, "for each element of Mylist, get its position in thelist."

This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
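If some elements of Mylist might be missing from thelist, list.index raises ValueError. A minimal sketch of one way to handle that, assuming None is an acceptable placeholder for "not found" (the helper name index_or_none and the value 'Z' are made up for illustration):

```python
thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['B', 'C', 'Z', 'E']  # 'Z' is a made-up value that is absent from thelist


def index_or_none(seq, value):
    """Return the first index of value in seq, or None if it is absent."""
    try:
        return seq.index(value)
    except ValueError:
        # list.index raises ValueError when the value is not in the list
        return None


ilist = [index_or_none(thelist, x) for x in Mylist]
print(ilist)  # [1, 2, None, 4]
```

The result still has the same length as Mylist, with None marking the values that were not found.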

UPDATE

For substrings:

thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]

print(ilist)  # [1, 2, 1, 4]

UPDATE 2

Here's a version that does substrings in the other direction using the example in the comments below:

thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']

ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]

print(ilist)  # [1, 2, 1, 4, 2, 0]
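
Note that next() raises StopIteration when nothing matches. A hedged variant, assuming None is an acceptable marker for "no match", passes a default as the second argument to next() (the value 'xyz' is made up for illustration):

```python
thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['Boo', 'Cup', 'xyz']  # 'xyz' contains none of the items in thelist

# The generator needs its own parentheses when next() is given a default.
ilist = [
    next((i for i, y in enumerate(thelist) if y in x), None)
    for x in Mylist
]

print(ilist)  # [1, 2, None]
```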
user94559
  • Oh I see, got the original question wrong here with the substrings, I am looking for the opposite - suppose `thelist = ['A','B','C','D','E']` and `Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']` and the desired output would be `[1,2,1,4,2,0]` – durbachit Jul 01 '17 at 02:57
  • Then just change `if x in y` to `if y in x`. – user94559 Jul 01 '17 at 20:53
1

The code below would work:

ilist = [thelist.index(i) for i in Mylist]
Jay Parikh
1

Make a reverse lookup from strings to indices:

string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]

This avoids the quadratic behaviour of repeated .index() lookups.
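
One caveat: if thelist contains duplicates, a dict comprehension keeps the last index of each value, whereas .index() returns the first. A small sketch, assuming first-occurrence behaviour is wanted (the duplicated 'B' in thelist is a hypothetical input, not from the question):

```python
thelist = ['A', 'B', 'C', 'B', 'E']  # hypothetical input: 'B' appears twice
Mylist = ['B', 'C', 'B', 'E']

# Later duplicates overwrite earlier ones, so this maps 'B' to 3.
last_index = {c: i for i, c in enumerate(thelist)}

# setdefault only stores a key the first time it is seen, so this maps 'B' to 1.
first_index = {}
for i, c in enumerate(thelist):
    first_index.setdefault(c, i)

ilist = [first_index[c] for c in Mylist]
print(ilist)  # [1, 2, 1, 4]
```

Either way the lookup table is built in one pass over thelist, and each lookup is a constant-time dict access on average.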

Ry-
0

If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized, N log N) manner.

import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)

npi.indices is essentially the array generalization of list.index. It also has a keyword argument that gives you control over how missing values are handled.

Eelco Hoogendoorn