Removing Duplicate Values From List (Python)

Question

I'm trying to make an inverted index for some NLP work, to see how many times a word appears in a document. I'm doing this with a dictionary, but my output looks like this (here the word man appears in documents 1 and 11):

{'man': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11],
 'upon': [1, 1, 1, 3, 3, 3, 1539, 1539, 1539]}

How do I get rid of these duplicate values so I just have

{'man': [1,11], 'upon': [1,3,1539]}

Does this answer your question: https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order? BTW perhaps the best approach is not to create these lists with duplicates in the first place. — Dani Mesejo, Oct 24 '21 at 23:13
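To illustrate the comment's suggestion of not creating duplicates in the first place, here is a minimal sketch that collects document IDs in a set while the index is being built. The `documents` corpus and the `build_inverted_index` helper are hypothetical names made up for this example, since the question does not show how the index is constructed.

```python
from collections import defaultdict

# Hypothetical corpus (not from the question): document ID -> tokenized text.
documents = {
    1: ["the", "man", "sat", "upon", "the", "man"],
    3: ["once", "upon", "a", "time"],
    11: ["a", "man", "walked"],
}


def build_inverted_index(docs):
    """Map each word to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, tokens in docs.items():
        for token in tokens:
            # Adding to a set ignores repeats, so no duplicates accumulate.
            postings[token].add(doc_id)
    # Convert each set to a sorted list for a stable, readable result.
    return {word: sorted(ids) for word, ids in postings.items()}


index = build_inverted_index(documents)
print(index["man"])   # [1, 11]
print(index["upon"])  # [1, 3]
```

If per-document word counts are also needed, they would have to be tracked separately, because a set only records whether a word occurs in a document.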

score 2 · Accepted Answer · answered Oct 24 '21 at 23:13


Just convert each list of values to a set and then back to a list:

my_dict = {k: list(set(v)) for k, v in my_dict.items()}
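One caveat: a `set` does not guarantee any particular order, so `list(set(v))` may not return the IDs in the order shown in the question. As a sketch of two order-safe variants, the first below keeps first-seen order with `dict.fromkeys` and the second sorts the IDs in ascending order. The sample data is shortened from the question, and the names `deduped` and `deduped_sorted` are only illustrative.

```python
# Sample data based on the question (shortened for readability).
my_dict = {
    "man": [1, 1, 1, 11, 11, 11],
    "upon": [1, 1, 1, 3, 3, 3, 1539, 1539, 1539],
}

# dict.fromkeys keeps the first occurrence of each key in insertion order
# (guaranteed since Python 3.7), so document IDs stay in first-seen order.
deduped = {word: list(dict.fromkeys(doc_ids)) for word, doc_ids in my_dict.items()}

# Alternative when ascending order is wanted regardless of input order.
deduped_sorted = {word: sorted(set(doc_ids)) for word, doc_ids in my_dict.items()}

print(deduped)  # {'man': [1, 11], 'upon': [1, 3, 1539]}
```

Both variants give the same result here because the input lists are already in ascending order. They differ only when the original lists are unsorted.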

answered Oct 24 '21 at 23:13

NotAName


Thanks for that, I should have known it was something to do with sets as they don't have doubles! – David R Oct 24 '21 at 23:15
Without comprehension: `dict(zip(my_dict.keys(), map(list, map(set, my_dict.values()))))` – flakes Oct 24 '21 at 23:21
