Delete duplicate values and keep the order of elements

Question

How can we remove from list1 the values that also appear in list2, while preserving the order of list1?

Example of my code:

list1=[(1,1),(1,2),(1,3),(3,4),(5,4),(4,7)]
list2=[(1,1),(1,2)]

list1 = list(set(list1) - set(list2))
print (list1)

output

[(5, 4), (1, 3), (3, 4), (4, 7)]

expected output

[(1,3),(3,4),(5,4),(4,7)]

Possible duplicate of https://stackoverflow.com/questions/10005367/retaining-order-while-using-pythons-set-difference — Abdul Niyas P M, Dec 06 '17 at 17:37
list1 = [x for x in list1 if x not in list2] No need for any of this set stuff... It doesn't help in a situation like this where you want to maintain the order. — GaryMBloom, Dec 06 '17 at 17:44
Possible duplicate of [How do you remove duplicates from a list in whilst preserving order?](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-whilst-preserving-order) — jpmc26, Dec 06 '17 at 20:27

kindall · Answer 1 · 2017-12-06T18:03:28.410

Use a list comprehension:

list1 = [item for item in list1 if item not in list2]

If list2 could be larger than a few items, then convert it to a set to make checking it faster:

set2 = set(list2)
list1 = [item for item in list1 if item not in set2]

Note that either of the above snippets will keep any duplicate items in list1 (i.e., if two items in list1 are the same, but are not in list2). If you want to eliminate internal duplicates in list1, which is a behavior of your original solution, this will retain only the first occurrence of each:

set1 = set(list1)
set2 = set(list2)
list1 = [set1.remove(item) or item for item in list1
         if item in set1 and item not in set2]
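
The comprehension above relies on `set1.remove(item)` returning `None`, which is compact but easy to misread. Here is a minimal sketch of the same behavior written as an explicit loop, assuming every item is hashable. The function name `ordered_difference` is made up for illustration and is not part of any library:

```python
def ordered_difference(items, exclude):
    """Return the items not found in exclude, in their original order,
    keeping only the first occurrence of each value."""
    excluded = set(exclude)   # O(1) membership tests
    seen = set()              # values already emitted
    result = []
    for item in items:
        if item in excluded or item in seen:
            continue
        seen.add(item)
        result.append(item)
    return result


list1 = [(1, 1), (1, 2), (1, 3), (3, 4), (1, 3), (5, 4), (4, 7)]
list2 = [(1, 1), (1, 2)]
print(ordered_difference(list1, list2))
# [(1, 3), (3, 4), (5, 4), (4, 7)]
```

The second `(1, 3)` is dropped because it was already emitted. If you want to keep internal duplicates, the plain list comprehension with `set2` is enough.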

Anshul Goyal · Accepted Answer · 2017-12-06T17:46:51.443


A set is an unordered, hash-based collection, so it does not preserve the order of your list. If the elements within the list are guaranteed to be unique, i.e., each appears only once, you can use the set difference purely as a filter and iterate over the original list to keep its order:

In [26]: list1 = [(1,1),(1,2),(1,3),(3,4),(5,4),(4,7)]
    ...: list2 = [(1,1),(1,2)]
    ...: unique = set(list1) - set(list2)
    ...: list1 = [x for x in list1 if x in unique]
    ...: print (list1)
    ...: 
[(1, 3), (3, 4), (5, 4), (4, 7)]

If the same element can appear multiple times within the lists, and each occurrence in list2 should cancel only one occurrence in list1, you will need to maintain a count as well. The logic would then look something like:

In [29]: list1=[(1,1),(1,1),(1,2),(1,3),(3,4),(5,4),(4,7)]
    ...: list2=[(1,1),(1,2)]
    ...: 
    ...: from collections import Counter
    ...: 
    ...: count = Counter(list1)
    ...: for element in list2:
    ...:     if element in count:
    ...:         count[element] -= 1
    ...: 
    ...: result = []
    ...: for element in list1:
    ...:     if count.get(element, 0) > 0:
    ...:         count[element] -= 1
    ...:         result.append(element)
    ...: 
    ...: print (result)
    ...: 
[(1, 1), (1, 3), (3, 4), (5, 4), (4, 7)]
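
The counting logic above can also be wrapped in a reusable function. This is a sketch under the same assumptions as the session above (hashable elements, and the earliest surviving occurrences are the ones kept). The name `subtract_occurrences` is hypothetical:

```python
from collections import Counter


def subtract_occurrences(items, to_remove):
    """Remove one occurrence from items for every occurrence in to_remove,
    keeping the order of items."""
    # remaining[x] is how many copies of x should survive (n - k).
    remaining = Counter(items)
    # subtract() may leave zero or negative counts; both mean "keep none".
    remaining.subtract(to_remove)

    result = []
    for item in items:
        if remaining[item] > 0:
            remaining[item] -= 1
            result.append(item)
    return result


list1 = [(1, 1), (1, 1), (1, 2), (1, 3), (3, 4), (5, 4), (4, 7)]
list2 = [(1, 1), (1, 2)]
print(subtract_occurrences(list1, list2))
# [(1, 1), (1, 3), (3, 4), (5, 4), (4, 7)]
```

Note the design choice: when an element appears `n` times in `items` and `k` times in `to_remove`, the first `n - k` occurrences are kept and the later ones are dropped. If you would rather drop the earliest occurrences, the loop needs to count removals instead of survivors.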

edited Dec 06 '17 at 17:46

answered Dec 06 '17 at 17:26

Anshul Goyal


It seems really overcomplicated to use a `Counter` for this. – kindall Dec 06 '17 at 17:41
I think since (1,1) is duplicated so it should not be a part of output. – Manjunath Dec 06 '17 at 17:42
@kindall Well a dict would be better suited, agreed; Counter just made the creation of the dict easier. It's possible that list1 has the same element `n` times, while list2 has it only `k` times, so it would be needed to check that exactly `n-k` elements are output in the result. – Anshul Goyal Dec 06 '17 at 17:43
@Manjunath There are 2 outputs, look carefully :) . If the intention is to remove all occurrences, then the first solution obviously works. If the intention is to subtract the number of times the element occurs in the second list, you would need to maintain a count. Any other solution would be suboptimal for time complexity, and since OP is already using a set, I can't give him a solution which performs worse than his. – Anshul Goyal Dec 06 '17 at 17:45
@mu無 Cool, that makes sense, sorry. – Manjunath Dec 06 '17 at 17:49
