Removing duplicates between two lists

Question

I have a list loaded from the hard drive and another list pulled from the internet; both contain town names. I want to combine the two lists and remove any duplicate names so that each name appears only once in the list from the hard drive. I saw an easy way of doing this with a for loop and an `if not in` check, and it worked in a similar situation. Right now, though, it is not removing the duplicates: when the result is written back out to the hard drive, I get list one and list two, unedited.

The original concept I used was

for x in townname:
    if x not in towns:
        towns.append(x)
        print(x)

This just copies list two to list one and removes nothing. When I switch townname and towns around, it does the exact opposite.

How do I get it to remove the duplicates while copying the rest from townname to towns?

If you get this list from the internet it might be in `bytes` so the membership check might fail. Either way, it is worth it to use `print` to see if there's such a discrepancy. — Dimitris Fasarakis Hilliard, Jan 13 '17 at 19:22
I do convert the original list over to str so I can analyze the html code and pull the list from the html code. — confused, Jan 13 '17 at 19:27
What Jim said and convert both to sets. Makes operations like these very easy. — Ma0, Jan 13 '17 at 19:28
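
One way to check the discrepancy suggested in the comments is sketched below. The sample data, the `normalize` helper, and the UTF-8 encoding are assumptions for illustration, not taken from the question. The idea is that the `not in` test only matches names that are exactly equal, so a `bytes` value or a stray newline makes every name look new.

```python
def normalize(name):
    """Return a town name as a stripped str, decoding bytes if needed."""
    if isinstance(name, bytes):
        name = name.decode("utf-8")  # assumed encoding
    return name.strip()

# Hypothetical data: lines read from a file often keep their newline,
# and downloaded data may arrive as bytes or with stray spaces.
towns = ["Springfield\n", "Shelbyville\n"]
townname = [b"Springfield", "Shelbyville ", "Ogdenville"]

print(b"Springfield" in towns)  # False: bytes never compare equal to str
print("Springfield" in towns)   # False: the stored entry ends with "\n"

# Normalize both sides before running the original loop.
towns = [normalize(t) for t in towns]
for x in townname:
    x = normalize(x)
    if x not in towns:
        towns.append(x)

print(towns)  # ['Springfield', 'Shelbyville', 'Ogdenville']
```

Printing `repr(x)` for a few entries from each list is a quick way to see whether this kind of mismatch is present in the real data.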

score 0 · Answer 1 · answered Jan 13 '17 at 19:29

In cases where unique elements matter, you can use the built-in set() container type. Sets do not allow duplicate elements, and thus are useful for removing duplicates from another container type.

In your case, you can simply combine your two lists, convert the combined list to a set(), and convert the set back to a list. Note that a set does not preserve the original order of the elements, so the resulting list may come back in a different order. Here is a minimal example using two lists:

>>> l = [1, 2, 3, 4, 5]
>>> l2 = [4, 5, 6, 7, 8]
>>> list(set(l + l2))
[1, 2, 3, 4, 5, 6, 7, 8]
>>>
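
If the order of the town names matters, a variation on the set approach is to keep a set of names already seen while building the merged list. This is a minimal sketch; the `merge_unique` helper and the sample town names are hypothetical and not part of the original question.

```python
def merge_unique(first, second):
    """Combine two lists, keeping the first occurrence of each item in order."""
    seen = set()
    merged = []
    for name in first + second:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged

# Hypothetical sample data standing in for the two town lists.
towns = ["Springfield", "Shelbyville", "Capital City"]
townname = ["Shelbyville", "Ogdenville", "Springfield"]

print(merge_unique(towns, townname))
# ['Springfield', 'Shelbyville', 'Capital City', 'Ogdenville']
```

On Python 3.7 and later, `list(dict.fromkeys(towns + townname))` gives the same result, because dictionaries keep insertion order.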
