
I want to turn a list with repeated strings, like

["ask","a","public","question","ask","a","public","question"]

The output dictionary should have each element of the list as a key and the indexes of its occurrences as the value:

{"ask":[0,4],"a":[1,5],"public":[2,6],"question":[3,7]}

Any hints? I am actually computing the bigram perplexity of a corpus. I already have the total occurrences of each bigram, i.e., count(B|A), but now I need count(A), which should be the total number of occurrences of all two-word combinations that start with A. I took the keys of the bigram dictionary as a list and reduced it to a list of only the first words, for example from

[['You', 'will'], ['will', 'face'], ['face', 'many'], ['many', 'defeats']]

to

['You', 'will', 'face', 'many']

So I need to count the occurrences of each of those words in the bigram dictionary, one by one. I tried several data structures, such as list, dict, and defaultdict, but they were all too slow. I am looking for another data structure that can handle this quickly.

Qqqq

1 Answer


There are various ways to do this.

This one uses defaultdict.

from collections import defaultdict
result = defaultdict(list)

mylst = ["ask", "a", "public", "question", "ask", "a", "public", "question"]

for index, item in enumerate(mylst):
    result[item].append(index)

print(dict(result))

Another way is to use the dict setdefault method.

result = {}
mylst = ["ask", "a", "public", "question", "ask", "a", "public", "question"]

for index, item in enumerate(mylst):
    result.setdefault(item, []).append(index)

print(result)

Another one would be to use try-except with a dictionary.
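
The answer names this approach but shows no code for it, so here is a minimal sketch of what it might look like, reusing the same mylst and result names as the examples above. The idea is to try appending to the existing list and to create the list only when the key raises KeyError.

```python
mylst = ["ask", "a", "public", "question", "ask", "a", "public", "question"]
result = {}

for index, item in enumerate(mylst):
    try:
        # Key already seen: add this position to its list.
        result[item].append(index)
    except KeyError:
        # First occurrence of this key: start a new list.
        result[item] = [index]

print(result)
# {'ask': [0, 4], 'a': [1, 5], 'public': [2, 6], 'question': [3, 7]}
```

All three variants make a single pass over the list, so the choice between them is mostly a matter of style.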

Vishal Singh