The usual way to do this is to leverage a set
, which is like a dictionary that has only keys and no values. Dictionaries (and sets) rely on their keys to be "hashable," which means that you can feed the key through some hash function and get the same result every time. In Python you can call this hash function with hash(some_object)
, which internally invokes some_object.__hash__()
.
The problem with this approach is that lists are not hashable. No mutable objects (things you can change with methods like list.append
or set.add
or dict.union
or etc) are. This means you must either check equality by hand, or mutate it into some form that is hashable, use the set, and then mutate it back. I think the latter is probably your best bet.
To that end, let's use a tuple. Tuples are just like lists except they are not idiomatically homogenous (so mixing types is common, not just technically allowed) and their order has semantic meaning. Consider an ordered pair on a plane -- it would matter deeply if the order flipped: (1, 4)
is not the same point as (4, 1)
. They are, however, immutable and hashable.
d = {1: [[1,2],[3,4],[5,6],[7,8],[1,2]],
2: [[5,6],[7,8],[1,2]],
3: [[3,4],[5,6],[3,4]]}
# we'll use a set comprehension here because it's concise
uniques = {tuple(sublst) for lst in d.values() for sublst in lst}
result = [list(tup) for tup in uniques] # then just change them back to lists
Note that the conversion to set and back does lose all ordering. If ordering is important then you'll have to do something like iterate through every sub list, convert it to tuple, check to see if it's already been seen, and if not add it to the seen
set and append it to your final list.
d = {1: [[1,2],[3,4],[5,6],[7,8],[1,2]],
2: [[5,6],[7,8],[1,2]],
3: [[3,4],[5,6],[3,4]]}
seen = set()
result = []
for lst in d.values():
for sublst in lst:
tup = tuple(sublst)
if tup not in seen:
seen.add(tup)
result.append(sublst)