5

For example, given:

['A', 'B', 'A', 'B']    

I want to have:

{'A': [0, 2], 'B': [1, 3]}

I tried a loop that goes like; add the index of where the character is found, then replace it with '' so the next time the loop goes through, it passes on to the next character.

However, that loops doesn't work for other reasons, and I'm stuck with no idea how to proceed.

Iron Fist
  • 10,739
  • 2
  • 18
  • 34
Long Vuong
  • 181
  • 1
  • 1
  • 9

3 Answers3

10

Use enumerate and setdefault:

example = ['a', 'b', 'a', 'b']
mydict = {}
for idx, item in enumerate(example):
     indexes = mydict.setdefault(item, [])
     indexes.append(idx)
willnx
  • 1,253
  • 1
  • 8
  • 14
  • 4
    ​​​​​​​​​​​​​​​You can also `from collections import defaultdict` and use `mydict = defaultdict(list)`, then you don't need run `mydict.setdefault(item, [])` yourself and I think it would be more Pythonic. – Remi Guan Apr 04 '16 at 05:41
  • But then you have to import it :P Good point - if you want to submit an edit with an example of using that or setdefault, I'll accept it. – willnx Apr 04 '16 at 05:42
  • ​​​​​​​​​​​​​​​Well, that's not [*the correct usage of edit*](https://stackoverflow.com/help/privileges/edit) anyway. So I will not submit an edit. But anyway, here's [the documentation of `collections.defaultdict`](http://docs.python.org/3/library/collections.html#collections.defaultdict). – Remi Guan Apr 04 '16 at 05:49
6

A simple dictionary comprehension should do the trick:

{key: [index for index, x in enumerate(my_list) if x == key] for key in my_list}

A simple trial:

>>>> my_list = ['A','B','A','B']
>>>> {key: [index for index, x in enumerate(my_list) if x == key] for key in my_list}
>>>> {'A': [0, 2], 'B': [1, 3]}

How It Works

List comprehensions are often used in Python as syntactic sugar for a for loop. Instead of writing

my_list = []
for item in range(10):
    my_list.append(item)

list comprehensions essentially let you condense this series of statements into a single line:

my_list = [item for item in range(10)]

Whenever you see a list comprehension, you should remember that it's just a condensed version of the original three line statement. They are effectively identical - the only benefit offered here is conciseness.

A similar, related species is the dictionary comprehension. It is similar to the list comprehension, except that it allows you to specify both the keys and values at the same time.

An example of a dictionary comprehension:

{k: None for k in ["Hello", "Adele"]}
>>>> {"Hello": None, "Adele": None}

In the answer I provide, I have simply used a dictionary comprehension that

  • Pulls out keys from my_list
  • Assigns a list of indices for each key from my_list as the corresponding value

Syntactically, it expands out into a fairly complicated program that reads like this:

my_dict = {}
for key in my_list:
    indices = []
    for index,value in enumerate(my_list):
         if value == key:
              indices.append(index)
    my_dict[key] = indices

Here, enumerate is a standard library function that returns a list of tuples. The first element of each tuple refers to an index of the list, and the second element refers to the value at that index in the list.

Observe:

 enumerate(['a','b','a','b'])
 >>>> [(0,'a'),(1,'b'),(2,'b'),(3,'b')]

That is the power of enumerate.

Efficiency

As always, premature optimisation is the root of all evil. It is indeed true that this implementation is inefficient: it duplicates work, and runs in quadratic time. The important thing, however, is to ask if it is okay for the specific task you have. For relatively small lists, this is sufficient.

You can look at certain optimisations. @wilinx's way works well. @Rob in the comments suggests iterating over set(my_list), which prevents duplicated work.

Community
  • 1
  • 1
Akshat Mahajan
  • 9,543
  • 4
  • 35
  • 44
  • That comprehension duplicates work. Maybe `... for key in set(my_list)` ? – Robᵩ Apr 04 '16 at 05:33
  • This is new to me, could you walk me through how it works? – Long Vuong Apr 04 '16 at 05:34
  • 1
    ​​​​​​​​​​​​​​​Hmm...note that since this way will loop over the list again and again, I think it would be slower than [willnx's way](http://stackoverflow.com/a/36395180/5299236). – Remi Guan Apr 04 '16 at 05:39
  • Optimization isn't a concern, plus this way seem relatively a lot simpler. – Long Vuong Apr 04 '16 at 05:40
  • @LtotheV: ​​​​​​​​​​​​​​​Actually this way is more complex. Because this way calls `enumerate(my_list)` "the length of `my_list` + 1" times. But willnx's way only has to call `enumerate(my_list)` one time. That's the different. But yeah, this way is a one-line version though. But I won't recommend use this way if your list is little big. – Remi Guan Apr 04 '16 at 05:46
  • @LtotheV Updated with a full explanation of what is going on. The other commenters are right: this is not the most efficient algorithm. But if your lists aren't massive, 1000 element long lists, this should work well. – Akshat Mahajan Apr 04 '16 at 05:52
  • Writing simpler code that runs in linear time rather than more complex, less readable code that runs in quadratic time is not premature optimisation. It's writing good code rather than bad code. – Sven Marnach Apr 06 '16 at 14:36
  • @SvenMarnach: Then I suspect we disagree on readability here. – Akshat Mahajan Apr 06 '16 at 14:43
  • @SvenMarnach Why do you think this code is more complex than what other people have put down? It's a plain dictionary comprehension that does exactly what it says on the tin and is eaaily understood in a single glance. It is a naive algorithm, but it's hardly what anyone would call less readable. – Akshat Mahajan Apr 06 '16 at 14:55
  • It's not a plain dictionary comprehension. It's a dictionary comprehension with a nested list comprehension. In plain English, the algorithm is "for each element in the list, find the indices of all occurrences of that element in the list, and add the list of indices to the dictionary under the correct key." The alternative algorithm in plain English is "For each element, append its index to the list of indices for that element." I personally find the latter version less complex, and the corresponding code far easier to read. – Sven Marnach Apr 06 '16 at 15:41
  • How can I add another list element to such a structure? @AkshatMahajan –  Nov 23 '20 at 16:27
  • If your dictionary is called `mydict`, the key is `x` and the element you want to add to that key's list is `foo`: `mydict[x].append(foo)`. – Akshat Mahajan Nov 23 '20 at 19:01
3

Why not use defaultdict from itertools instead:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> 
>>> for i,x in enumerate(l):
        d[x].append(i)


>>> d
defaultdict(<class 'list'>, {'A': [0, 2], 'B': [1, 3]})
Iron Fist
  • 10,739
  • 2
  • 18
  • 34