3

I have a Python list of dictionaries, for example:

l = [{'id': 'scissor'}, {'id': 'cloth'}, {'id': 'scissor'}]

Now, I was wondering if there is an efficient way to remove duplicates from this list. So the result should be something like:

r = [{'id': 'scissor'}, {'id': 'cloth'}]

I tried using frozenset, but dictionaries are not hashable. Is there an efficient way to do this with any data structure from the Python standard library?

EDIT: Items are considered duplicates only if the dicts are completely the same.

Luca
  • Under which condition do you want to remove the dicts? If they have the same keys, or only if they are completely the same? – Bernhard Aug 08 '18 at 09:31
  • @Bernhard: If they are completely the same as in the example. – Luca Aug 08 '18 at 09:32
  • https://stackoverflow.com/questions/9427163/remove-duplicate-dict-in-list-in-python – Joe Aug 08 '18 at 09:33
  • Can we create a `hashset` of all the elements of `l` with any value? The keys/elements of that `hashset` will give us `r`. – impossible Aug 08 '18 at 09:33

5 Answers

4
r = [x for i,x in enumerate(l) if x not in l[:i]]
Elisha
3

If you don't have to be efficient:

from functools import partial
import json

list(map(json.loads, set(map(partial(json.dumps, sort_keys=True), l))))
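One caveat with the JSON round trip, shown here as a small sketch with a made-up dict (`original` is only an illustration, not data from the question): JSON has no tuples and only string keys, so the dicts you get back are not always equal to the ones you put in.

```python
import json

# Hypothetical input with integer keys and a tuple value.
original = {1: 'one', 2: (1, 2)}

# Same serialize/deserialize steps as the one-liner above, applied to one dict.
round_tripped = json.loads(json.dumps(original, sort_keys=True))

print(round_tripped)              # {'1': 'one', '2': [1, 2]}
print(round_tripped == original)  # False: keys became str, the tuple became a list
```

For dicts like the ones in the question (string keys, string values) the round trip is lossless, so this only matters for other kinds of data.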

If you do have to be efficient:

serialized = map(tuple, map(sorted, map(dict.items, l)))
unique = set(serialized)
result = list(map(dict, unique))
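Note that going through a set does not keep the original order of the list. If order matters, the same hashable-key idea can be combined with a "seen" set. This is a minimal sketch, assuming all values are hashable and the keys of each dict can be sorted against each other; `dedupe_dicts` is a name made up for this example.

```python
def dedupe_dicts(dicts):
    """Return the unique dicts from `dicts`, keeping first occurrences in order."""
    seen = set()
    result = []
    for d in dicts:
        # Sorted (key, value) pairs give a hashable key that does not
        # depend on the insertion order of the dict.
        key = tuple(sorted(d.items()))
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


l = [{'id': 'scissor'}, {'id': 'cloth'}, {'id': 'scissor'}]
r = dedupe_dicts(l)
print(r)  # [{'id': 'scissor'}, {'id': 'cloth'}]
```

Each dict is visited once and set lookups are constant time on average, so this avoids the quadratic cost of repeated list membership tests. If the keys cannot be sorted, `frozenset(d.items())` should work as the key instead.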
Reut Sharabani
1

This should work:

l2 = []

for d in l:
    if d not in l2:
        l2.append(d)
Bernhard
1

I suggest the following simple approach:

l = [{'id': 'scissor'}, {'id': 'cloth'}, {'id': 'scissor'}]

r = []
for i in l:
    if i not in r:
        r.append(i)

print(r)   # [{'id': 'scissor'}, {'id': 'cloth'}]
Laurent H.
0

Set items have to be hashable, which dicts are not. You can use pickle to serialize all the dicts, then use set to obtain unique items, and finally deserialize them back to dicts:

import pickle
print(list(map(pickle.loads, set(map(pickle.dumps, l)))))

This outputs (the order may vary, since sets are unordered):

[{'id': 'cloth'}, {'id': 'scissor'}]
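One thing to watch with this approach, sketched below with two made-up dicts (`a` and `b` are not from the question): on Python 3.7+, where dicts keep insertion order, two equal dicts built in a different key order pickle to different bytes, so the set would treat them as distinct.

```python
import pickle

# Two equal dicts whose keys were inserted in a different order.
a = {'id': 'scissor', 'size': 1}
b = {'size': 1, 'id': 'scissor'}

print(a == b)                              # True
print(pickle.dumps(a) == pickle.dumps(b))  # False

# A key built from sorted items does not depend on insertion order.
print(tuple(sorted(a.items())) == tuple(sorted(b.items())))  # True
```

For single-key dicts like the ones in the question this does not come up, but for dicts with several keys a key built from sorted items is probably the safer choice.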
blhsing