
Given the following list:

list_ex = ['s1', 's2', 's1', 's4', 's2', 's3', 's1']

How can the indices of each distinct element be found?

For example, for s1, this would be locations [0, 2, 6].

I think I can do this by looping over the distinct elements, list(set(list_ex)), and then using np.where on each one to find its locations?

Trenton McKinney
P Emt

3 Answers


You could loop through the elements, building up a dictionary mapping elements to the list of indices for that element. Using a defaultdict of type list is convenient for this because you automatically get an empty list when reading a new element for the first time.

from collections import defaultdict

list_ex = ['s1', 's2', 's1', 's4', 's2', 's3', 's1']

indices = defaultdict(list)

for i, v in enumerate(list_ex):
    indices[v].append(i)

print(indices)

This prints the following:

defaultdict(<class 'list'>, {'s1': [0, 2, 6], 's2': [1, 4], 's4': [3], 's3': [5]})
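
If the defaultdict wrapper in the printed output is unwanted, roughly the same mapping can be built with a plain dict and setdefault. This is only a sketch of an alternative; the function name index_map is made up for illustration.

```python
def index_map(items):
    """Map each distinct element to the list of positions where it occurs."""
    positions = {}
    for i, v in enumerate(items):
        # setdefault returns the existing list, or stores and returns a new one
        positions.setdefault(v, []).append(i)
    return positions


list_ex = ['s1', 's2', 's1', 's4', 's2', 's3', 's1']
result = index_map(list_ex)
print(result['s1'])  # [0, 2, 6]
```

Either way the list is scanned only once, however many distinct elements it contains.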
Matthew Strawbridge

I've found that pandas seems to be optimized for this kind of problem.

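import random
import pandas as pd

# 1000 distinct labels, sampled 2,000,000 times so that each group is large
x = [f's{i}' for i in range(1000)]
l = [random.choice(x) for _ in range(2000000)]

# .indices maps each group key to an array of row positions
output = pd.DataFrame(l).groupby([0]).indices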

It can be about 3 times faster than the enumerate loop when groups are large, and about 3 times slower when groups are small (1 to 2 items per group).
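
For reference, here is a small sketch of the same idea applied to the list from the question, assuming pandas is installed. It groups by the default column label 0, and converts the NumPy arrays returned by .indices into plain lists only to make the result easier to read.

```python
import pandas as pd

list_ex = ['s1', 's2', 's1', 's4', 's2', 's3', 's1']

# groupby(...).indices maps each group key to a NumPy array of row positions
raw = pd.DataFrame(list_ex).groupby(0).indices

# Convert the arrays to plain Python lists for printing and comparison
result = {key: positions.tolist() for key, positions in raw.items()}
print(result['s1'])  # [0, 2, 6]
```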

mathfux

Here is a short solution using a list comprehension:

locations = [i for i, v in enumerate(list_ex) if v == "s1"]

Explanation

enumerate yields (index, element) pairs. Wrapped in list(), they look like this:

[(0, 's1'), (1, 's2'), (2, 's1'), (3, 's4'), (4, 's2'), (5, 's3'), (6, 's1')]

The code below gets the same result, written as an explicit for loop:

target = 's1'
locations = []

for i, v in enumerate(list_ex):
    if v == target:
        locations.append(i)
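
To cover every distinct element rather than a single target, the same filter could be repeated for each value in set(list_ex). This is a sketch, and the name all_locations is only illustrative.

```python
list_ex = ['s1', 's2', 's1', 's4', 's2', 's3', 's1']

# One filtered scan of the list per distinct element
all_locations = {
    target: [i for i, v in enumerate(list_ex) if v == target]
    for target in set(list_ex)
}
print(all_locations['s2'])  # [1, 4]
```

Note that this rescans the whole list once per distinct element, so a single-pass approach is likely to scale better when there are many distinct values.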