When I have a list of immutable objects, lst, and want to get rid of duplicates, I can just use set(lst):
lst = [0, 4, 2, 6, 3, 6, 4, 9, 2, 2]  # integers are immutable in Python
print(set(lst))  # {0, 2, 3, 4, 6, 9}
However, suppose I have a list of mutable objects, lst, and want to get rid of duplicates. set(lst) won't work because built-in mutable types such as list and dict are not hashable; we'd get a TypeError: unhashable type: '<type>'. What should we do in this case?
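To make the failure concrete, here is a minimal sketch. The helper name try_set is made up for illustration, and the exact wording of the error message may vary between Python versions:

```python
def try_set(items):
    """Return set(items), or the TypeError message if an element is unhashable."""
    try:
        return set(items)
    except TypeError as exc:
        return str(exc)

hashable_result = try_set([1, 2, 2, 3])        # ints are hashable: duplicates removed
unhashable_result = try_set([[1, 2], [1, 2]])  # lists are not: we get the error text

print(hashable_result)
print(unhashable_result)  # message mentions: unhashable type: 'list'
```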
For example, suppose we have lst, a list of dicts (dicts are mutable and thus not hashable), and some dicts occur multiple times in lst:
d0 = {0:'a', 1:'b', 9:'j'}
d1 = {'jan':1, 'jul':7, 'dec':12}
d2 = {'hello':'hola', 'goodbye':'adios', 'happy':'feliz', 'sad':'triste'}
lst = [d0, d1, d1, d0, d2, d1, d0]
We want to iterate through lst, but only consider each dict once. If we do set(lst), we'd get a TypeError: unhashable type: 'dict'. Instead we have to do something like:
def dedup(lst):
    seen_ids = set()
    for elem in lst:
        id_ = id(elem)
        if id_ not in seen_ids:
            seen_ids.add(id_)
            yield elem
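One thing worth spelling out about this approach: it deduplicates by identity (the same object appearing more than once), not by equality. The sketch below repeats dedup so it runs on its own; d0_copy is a name I've added purely to illustrate that an equal but distinct dict is still kept:

```python
def dedup(lst):
    """Yield each object in lst once, comparing by identity (id), not equality."""
    seen_ids = set()
    for elem in lst:
        id_ = id(elem)
        if id_ not in seen_ids:
            seen_ids.add(id_)
            yield elem

d0 = {0: 'a', 1: 'b', 9: 'j'}
d1 = {'jan': 1, 'jul': 7, 'dec': 12}
d0_copy = dict(d0)  # equal to d0, but a different object (illustrative only)

result = list(dedup([d0, d1, d1, d0, d0_copy]))
print(len(result))  # d0, d1 and d0_copy each appear once
```

Because lst keeps a reference to every element for the whole loop, the ids stay unique while dedup runs; id values are only guaranteed unique among objects that are alive at the same time.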
Is there a better way to do this?