In general if you want to find duplicates in a list of dictionaries you should categorize your dictionaries in a way that duplicate ones stay in same groups. For that purpose you need to categorize based on dict
items. Now, since for dictionaries Order is not an important factor you need to use a container that is both hashable and doesn't keep the order of its container. A frozenset()
is the best choice for this task.
Example:
In [87]: lst = [{2: 4, 6: 0},{20: 41, 60: 88},{5: 10, 2: 4, 6: 0},{20: 41, 60: 88},{2: 4, 6: 0}]
In [88]: result = defaultdict(list)
In [89]: for i, d in enumerate(lst):
...: result[frozenset(d.items())].append(i)
...:
In [91]: result
Out[91]:
defaultdict(list,
{frozenset({(2, 4), (6, 0)}): [0, 4],
frozenset({(20, 41), (60, 88)}): [1, 3],
frozenset({(2, 4), (5, 10), (6, 0)}): [2]})
And in this case, you can categorize your dictionaries based on 'un'
key then choose the expected items based on id
:
>>> from collections import defaultdict
>>>
>>> d = defaultdict(list)
>>>
>>> for i in a:
... d[i['un']].append(i)
...
>>> d
defaultdict(<type 'list'>, {'a': [{'un': 'a', 'id': 'cd'}, {'un': 'a', 'id': 'cm'}], 'c': [{'un': 'c', 'id': 'vd'}, {'un': 'c', 'id': 'a'}, {'un': 'c', 'id': 'vd'}], 'b': [{'un': 'b', 'id': 'cd'}, {'un': 'b', 'id': 'cd'}]})
>>>
>>> keeps = {'a': 'cm', 'b':'cd', 'c':'vd'} # the key is 'un' and the value is 'id' should be keep for that 'un'
>>>
>>> [i for key, val in d.items() for i in val if i['id']==keeps[key]]
[{'un': 'a', 'id': 'cm'}, {'un': 'c', 'id': 'vd'}, {'un': 'c', 'id': 'vd'}, {'un': 'b', 'id': 'cd'}, {'un': 'b', 'id': 'cd'}]
>>>
In the last line (the nested list comprehension) we loop over the aggregated dict's items then over the values and keep those items within the values that follows or condition which is i['id']==keeps[key]
that means we will keep the items that has an id
with specified values in keeps
dictionary.
You can beak the list comprehension to something like this:
final_list = []
for key, val in d.items():
for i in val:
if i['id']==keeps[key]:
final_list.append(i)
Note that since the iteration of list comprehensions has performed in C it's very faster than regular python loops and in the pythonic way to go. But if the performance is not important for you you can use the regular approach.