elements = [
{'a' : 1, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 3, 'c': 3},
{'a' : 1, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 2},
{'a' : 1, 'b' : 2, 'c': 3, 'd' : 4},
{'v' : [1,2,3]}
]
Given the list of dicts above, how can I efficiently deduplicate it in Python to get the following collection (order doesn't matter)?
result = [
{'a' : 1, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 2, 'c': 3},
{'a' : 2, 'b' : 3, 'c': 3},
{'a' : 2, 'b' : 2},
{'a' : 1, 'b' : 2, 'c': 3, 'd' : 4},
{'v' : [1,2,3]}
]
The naive method is to use a set; however, dict in Python is unhashable. Right now, my solution is to serialize each dict to a string in a format like JSON (since a dict has no inherent order, two different strings can correspond to the same dict, so I have to impose some key order). However, the time complexity of this method is too high.
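For reference, the serialization approach described above can be sketched roughly as follows. This is only a minimal sketch: the helper name dedupe_dicts is made up for illustration, and it assumes every value is JSON-serializable. Passing sort_keys=True to json.dumps is what imposes the key order, so two equal dicts always produce the same string.

```python
import json

def dedupe_dicts(items):
    """Return the unique dicts from items, keeping the first occurrence of each.

    Hypothetical helper; assumes all keys and values are JSON-serializable.
    """
    seen = set()
    unique = []
    for d in items:
        # sort_keys=True makes the string independent of key insertion order
        key = json.dumps(d, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

elements = [
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 2, 'b': 2, 'c': 3},
    {'a': 2, 'b': 3, 'c': 3},
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 2, 'b': 2, 'c': 3},
    {'a': 2, 'b': 2},
    {'a': 1, 'b': 2, 'c': 3, 'd': 4},
    {'v': [1, 2, 3]},
]

result = dedupe_dicts(elements)
```

Each dict is serialized exactly once and looked up in a set, so the cost is dominated by the serialization itself (including sorting each dict's keys), not by pairwise comparisons.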
My questions:
How can I efficiently deduplicate dictionaries in Python?
More generally, is there a way to override a class's hashCode, as in Java, so that its instances can be used with set or dict?
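To make the second question concrete: Python's counterparts to Java's hashCode and equals are the special methods __hash__ and __eq__. The following is a rough sketch, not a definitive implementation. The names freeze and FrozenKey are hypothetical, and the sketch assumes the dicts only contain nested dicts, lists, tuples, and hashable scalars (lists and tuples are treated as equal here).

```python
def freeze(value):
    """Recursively convert a value into a hashable equivalent (hypothetical helper)."""
    if isinstance(value, dict):
        # frozenset ignores key order, matching dict equality
        return frozenset((k, freeze(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    return value  # assumed to be hashable already

class FrozenKey:
    """Hypothetical wrapper that makes a dict usable in a set or as a dict key."""

    def __init__(self, data):
        self.data = data
        self._frozen = freeze(data)  # computed once, reused for hash and equality

    def __eq__(self, other):
        return isinstance(other, FrozenKey) and self._frozen == other._frozen

    def __hash__(self):
        return hash(self._frozen)

elements = [
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 2, 'b': 2, 'c': 3},
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 2, 'b': 2},
    {'v': [1, 2, 3]},
]

# Order is not preserved because a set is used
unique = [k.data for k in {FrozenKey(d) for d in elements}]
```

The wrapped dict should not be mutated after the wrapper is created, since the hash is computed from a snapshot taken in __init__.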