Since you used set
and frozenset
I assume you care about efficiency and don't want an O(n**2)
algorithm.
In that case you don't just need to convert the dictionaries to frozensets, but also convert the lists to an hashable class. You could use tuple
s.
Assuming your inputs:
first_dict = [{"a": "abcd","b":"defg", "c":["fng","xvg"]}, {"a": "stg","b":"klsm", "c":["xzy"]}]
second_dict = [{"a": "abcd","b":"defg", "c":["fng","xvg"]}]
You could use:
def convert(dictionary):
return frozenset((key, tuple(value) if isinstance(value, list) else value) for key, value in dictionary.items())
def convert_back(frozset):
return dict((key, list(value) if isinstance(value, tuple) else value) for key, value in frozset)
i_set = { convert(row) for row in first_dict }
a_set = { convert(row) for row in second_dict }
result = [convert_back(i) for i in i_set - a_set]
Then the result would be:
[{'a': 'stg', 'c': ['xzy'], 'b': 'klsm'}]
However this won't discriminate between tuples that have been tuples in the original and tuples that have been lists in the original.
You could solve that (if you're sure that the dictionary and its contents are not mutated during the operations) by wrapping the dictionaries in your own hashable class. This has the advantage that you don't need to convert them back afterwards, you can simply "unwrap" the value.
class HashableDictionaryWithListValues:
def __init__(self, dictionary):
converted = frozenset((key, tuple(value) if isinstance(value, list) else value) for key, value in dictionary.items())
self._hash = hash(converted)
self._converted = converted
self.dictionary = dictionary
def __hash__(self):
return self._hash
def __eq__(self, other):
return self._converted == other._converted
i_set = { HashableDictionaryWithListValues(row) for row in first_dict }
a_set = { HashableDictionaryWithListValues(row) for row in second_dict }
result = [i.dictionary for i in i_set - a_set]
The result will be:
[{'a': 'stg', 'b': 'klsm', 'c': ['xzy']}]
Although you could also avoid the whole conversion/wrapping and do an O(n**2)
approach:
def difference(first, second):
for item in first:
if item not in second:
yield item
list(difference(first_dict, second_dict))
Which also gives the expected result:
[{'a': 'stg', 'b': 'klsm', 'c': ['xzy']}]