I'm looking for an efficient way to find several strings in a list of lists of strings and return their indices. Here is an example:
inp = [ 'ans1', 'ans2', 'ans3' ]
output = [ [ 'aaa', 'ans1', 'bbb', 'ccc', 'ans2', 'ddd' ],
[ 'bbb', 'aaa', 'ans2', 'ddd', 'ans1', 'aaa' ],
[ 'ddd', 'ccc', 'ans2', 'ans1', 'aaa', 'bbb' ] ]
# expected result
# result = [ [ 1, 4, 3 ], [ 4, 2, 2 ], [ -1, -1, -1 ] ]
The values in result are the indices at which each string of inp appears in each sublist of output. For example, ans2 is at index 4 in the first sublist, index 2 in the second sublist, and index 2 in the third sublist. Similarly for ans1. ans3, however, does not appear in any sublist, so the returned index is -1.
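To pin down the expected behaviour, here is a minimal sketch of the plain nested-loop baseline. The helper name find_indices is hypothetical, and the sketch assumes that the index of the first occurrence is wanted when a string appears more than once in a sublist.

```python
inp = ['ans1', 'ans2', 'ans3']
output = [['aaa', 'ans1', 'bbb', 'ccc', 'ans2', 'ddd'],
          ['bbb', 'aaa', 'ans2', 'ddd', 'ans1', 'aaa'],
          ['ddd', 'ccc', 'ans2', 'ans1', 'aaa', 'bbb']]


def find_indices(queries, rows):
    """Baseline: one row of indices per query string (hypothetical helper)."""
    result = []
    for q in queries:
        per_row = []
        for row in rows:
            # list.index returns the first occurrence; -1 marks "not found".
            per_row.append(row.index(q) if q in row else -1)
        result.append(per_row)
    return result


result = find_indices(inp, output)
print(result)  # [[1, 4, 3], [4, 2, 2], [-1, -1, -1]]
```

Each lookup scans a whole sublist, so the cost grows with len(inp) times the total number of strings in output.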
What I'm looking for is an efficient way to do this computation (possibly in parallel?) that avoids the classic nested for loops with which this can clearly be done.
Some considerations:
output has shape equal to [ len( inp ), L ], where L is the size of the dictionary. In this case L = 5.
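One possible direction, sketched under the assumption that the sublists are plain Python lists: build a position lookup table once per sublist, so that each query becomes a constant-time lookup instead of a scan. The helper name first_positions is hypothetical, and this is only one option, not necessarily the fastest or a parallel one.

```python
inp = ['ans1', 'ans2', 'ans3']
output = [['aaa', 'ans1', 'bbb', 'ccc', 'ans2', 'ddd'],
          ['bbb', 'aaa', 'ans2', 'ddd', 'ans1', 'aaa'],
          ['ddd', 'ccc', 'ans2', 'ans1', 'aaa', 'bbb']]


def first_positions(row):
    """Map each string in row to the index of its first occurrence (hypothetical helper)."""
    positions = {}
    for i, s in enumerate(row):
        # setdefault keeps the earliest index when a string repeats.
        positions.setdefault(s, i)
    return positions


# One pass over every sublist to build the lookup tables.
lookups = [first_positions(row) for row in output]

# One constant-time lookup per (query, sublist) pair; -1 marks "not found".
result = [[lookup.get(q, -1) for lookup in lookups] for q in inp]
print(result)  # [[1, 4, 3], [4, 2, 2], [-1, -1, -1]]
```

The tables are independent of one another, so building them could be distributed across workers if the sublists are large. Whether that pays off depends on the data size.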