An OrderedDict keeps the original order and gives you unique elements once the sublists are mapped to tuples to make them hashable. Assigning to t[:] mutates the original list object instead of rebinding the name.
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
from collections import OrderedDict
t[:] = map(list, OrderedDict.fromkeys(map(tuple, t)))
print(t)
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
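To illustrate why the slice assignment matters, here is a small sketch comparing t[:] = ... with a plain t = .... The helper name dedupe and the variables alias, u and alias_u are made up for this illustration and are not part of the answer above.

```python
from collections import OrderedDict

def dedupe(rows):
    # Hypothetical helper wrapping the OrderedDict one-liner from above.
    return list(map(list, OrderedDict.fromkeys(map(tuple, rows))))

# Slice assignment replaces the contents of the existing list object,
# so every other reference to that list sees the change.
t = [["a", "b"], ["c", "d"], ["a", "b"]]
alias = t
t[:] = dedupe(t)
print(alias is t, alias)

# Plain assignment only rebinds the name u to a new list;
# alias_u still points at the old list with its duplicates.
u = [["a", "b"], ["a", "b"]]
alias_u = u
u = dedupe(u)
print(alias_u is u, alias_u)
```

If nothing else holds a reference to the list, the two forms are interchangeable in effect.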
For Python 2, you can use itertools.imap if you want to avoid creating intermediate lists:
from collections import OrderedDict
from itertools import imap
t[:] = imap(list, OrderedDict.fromkeys(imap(tuple, t)))
print(t)
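On Python 3.7 and later, regular dicts are guaranteed to preserve insertion order, so the same idea should work without importing OrderedDict. This is a sketch under that version assumption, not something the timings in this answer were measured with.

```python
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]

# dict.fromkeys keeps the first occurrence of each key in insertion order
# (guaranteed from Python 3.7 onwards).
t[:] = map(list, dict.fromkeys(map(tuple, t)))
print(t)
```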
You can also rely on set.add returning None, so that st.add(...) or sub records the tuple as seen and evaluates to sub:
st = set()
t[:] = (st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st)
print(t)
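If the one-liner is hard to read, the following sketch spells out what it does. The names seen, result and key are chosen for this illustration only.

```python
# set.add always returns None, which is falsy, so
# "st.add(x) or sub" evaluates to sub after adding x as a side effect.
st = set()
sub = ["a", "b"]
value = st.add(tuple(sub)) or sub
print(value is sub, st)

# The same filtering written as an explicit loop.
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"]]
seen = set()
result = []
for sub in t:
    key = tuple(sub)
    if key not in seen:      # first time this sublist's contents appear
        seen.add(key)
        result.append(sub)   # keep the original list object
t[:] = result
print(t)
```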
Of these two, the set-based approach is the faster:
In [9]: t = [[randint(1,1000),randint(1,1000)] for _ in range(10000)]
In [10]: %%timeit
st = set()
[st.add(tuple(sub)) or sub for sub in t if tuple(sub) not in st]
....:
100 loops, best of 3: 5.8 ms per loop
In [11]: timeit list(map(list, OrderedDict.fromkeys(map(tuple, t))))
10 loops, best of 3: 24.1 ms per loop
Also, if ["a", "e"] is considered the same as ["e", "a"], you can use a frozenset:
t = [["a", "b"], ["c", "d"], ["a", "e"], ["f", "g"], ["c", "d"], ["e","a"]]
st = set()
t[:] = (st.add(frozenset(sub)) or sub for sub in t if frozenset(sub) not in st)
print(t)
Output:
[['a', 'b'], ['c', 'd'], ['a', 'e'], ['f', 'g']]
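One thing to keep in mind is that a frozenset also ignores repeated elements, so ["a", "a", "e"] would count as a duplicate of ["a", "e"]. If that is not what you want, a sorted tuple is a possible alternative key, assuming the elements can be ordered. The helper dedupe_by below is hypothetical and only exists to compare the two keys.

```python
def dedupe_by(rows, key):
    # Hypothetical helper: keep the first row for each distinct key(row).
    seen = set()
    out = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            out.append(row)
    return out

t = [["a", "e"], ["e", "a"], ["a", "a", "e"]]

# frozenset ignores both order and how often an element occurs.
by_frozenset = dedupe_by(t, frozenset)

# A sorted tuple ignores order but still counts repeated elements.
by_sorted = dedupe_by(t, lambda row: tuple(sorted(row)))

print(by_frozenset)
print(by_sorted)
```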
To avoid calling tuple twice per sublist, you can write a generator function:
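def unique(l):
    # st holds the tuples seen so far; it walks l in step with the map below
    st, it = set(), iter(l)
    for tup in map(tuple, l):
        if tup not in st:
            # first occurrence: yield the original sublist
            yield next(it)
        else:
            # duplicate: advance the iterator without yielding
            next(it)
        st.add(tup)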
This runs a little faster:
In [21]: timeit list(unique(t))
100 loops, best of 3: 5.06 ms per loop