1

I want to filter a list of lists for duplicates. I consider two lists to be duplicates of each other when they contain the same elements, but not necessarily in the same order. So for example:

[['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

should become

[['A', 'B', 'C'], ['D', 'B', 'A']]

since ['C', 'B', 'A'] is a duplicate of ['A', 'B', 'C']. It does not matter which one of the duplicates gets removed, as long as the final list of lists no longer contains any duplicates. All lists also need to keep the order of their elements, so using set() may not be an option.

I found these related questions: "Determine if 2 lists have the same elements, regardless of order?" and "How to efficiently compare two unordered lists (not sets)?". But they only cover how to compare two lists, not how to efficiently remove duplicates. I'm using Python.

Akut Luna

5 Answers

3

Using a dictionary comprehension (for each group of duplicates, the last list seen is the one that is kept):

>>> data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> result = {tuple(sorted(i)): i for i in data}.values()
>>> result 
dict_values([['C', 'B', 'A'], ['D', 'B', 'A']])
>>> list( result )
[['C', 'B', 'A'], ['D', 'B', 'A']]
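
If the first list of each group should be kept instead of the last, one possible variation is sketched below. It is not part of the original answer; it assumes Python 3.7+ (where dicts keep insertion order) and that the elements can be sorted. The name `unique` is only illustrative.

```python
data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

unique = {}
for item in data:
    # setdefault stores item only if this key has not been seen yet,
    # so the first list of each duplicate group wins
    unique.setdefault(tuple(sorted(item)), item)

result = list(unique.values())
print(result)  # [['A', 'B', 'C'], ['D', 'B', 'A']]
```

Each list is stored unchanged, so the order of its elements is preserved.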
sahasrara62
2

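You can use frozenset. Note that the round trip through sets does not preserve the order of elements within each list, and the order of the output may vary between runs: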

>>> x = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]
>>> [list(s) for s in set([frozenset(item) for item in x])]
[['A', 'B', 'D'], ['A', 'B', 'C']]

Or, with map:

>>> [list(s) for s in set(map(frozenset, x))]
[['A', 'B', 'D'], ['A', 'B', 'C']]
The Thonnu
1

If you want to keep the order of elements:

data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

seen = set()
result = []
for obj in data:
    key = frozenset(obj)  # order-independent, hashable key
    if key not in seen:
        seen.add(key)
        result.append(obj)

Output:

[['A', 'B', 'C'], ['D', 'B', 'A']]
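
One caveat: a frozenset ignores how often an element occurs, so ['A', 'A', 'B'] and ['A', 'B', 'B'] would be treated as duplicates. If repeated elements matter, a key built from `collections.Counter` is one way to handle it. The following is a minimal sketch under that assumption; the function name `dedupe_multiset` is hypothetical, and the elements only need to be hashable.

```python
from collections import Counter


def dedupe_multiset(lists):
    """Keep the first list of each group that has the same elements
    with the same counts, regardless of order."""
    seen = set()
    result = []
    for obj in lists:
        # (element, count) pairs give a hashable, order-independent key
        key = frozenset(Counter(obj).items())
        if key not in seen:
            seen.add(key)
            result.append(obj)
    return result


print(dedupe_multiset([['A', 'A', 'B'], ['A', 'B', 'B'], ['B', 'A', 'A']]))
# [['A', 'A', 'B'], ['A', 'B', 'B']]
```

For lists without repeated elements this gives the same result as the frozenset version above.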
funnydman
0

Do you want to keep the order of elements?

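from itertools import groupby

data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

# groupby only merges consecutive items, so sort by the same key first
data = sorted(data, key=sorted)
# the group keys are the sorted lists, so element order is not preserved
print([k for k, _ in groupby(data, key=sorted)])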

Output:

[['A', 'B', 'C'], ['A', 'B', 'D']]
funnydman
  • Yes, the order of elements should stay the same. `[['A', 'B', 'C'], ['D', 'B', 'A']]` is a valid output but `[['A', 'B', 'C'], ['A', 'B', 'D']]` is not, since `['D', 'B', 'A']` changed to `['A', 'B', 'D']`. – Akut Luna Aug 07 '22 at 21:12
  • @AkutLuna see this answer https://stackoverflow.com/questions/73268678/remove-lists-with-same-elements-but-in-different-order-from-a-list-of-lists/73271182#73271182 – funnydman Aug 07 '22 at 21:28
0

In Python, it is safer to build a new list and append to it than to remove items from the list you are iterating over.

The simplest way is as follows:

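data = [['A', 'B', 'C'], ['C', 'B', 'A'], ['D', 'B', 'A']]

seen = []  # sorted copies of the lists kept so far
temp = []  # the lists we keep, in their original element order
for i in data:
    key = sorted(i)
    if key not in seen:
        seen.append(key)
        temp.append(i)
print(temp)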

cheers, athrv