Python - Differences between two lists

Question

For practice, I tried to implement a function that receives two lists as parameters and returns their difference, i.e. the elements that the lists do not have in common.

I coded the following functions:

list1 = [4,2,5,3,9,11]
list2 = [7,9,2,3,5,1]

def difference(list1,list2):
    return (list(set(list1) - set(list2)))

difference(list1,list2)

AND

def difference_extra_credit(list1,list2):
    return [value for value in list1 if value not in list2]

difference_extra_credit(list1,list2)

Basically, both versions seem to work, but I'm currently facing the problem that the lists need to have the same length for the functions to work. If the lengths differ, for instance after adding the integer 100 to list 1, the new value is not shown as a difference between the lists when you print the result of the functions.

I didn't manage to find a way to modify the code so that the length of the lists doesn't matter. Does anyone have an idea?

Thanks!

I'm not seeing the length problems you describe. [A `100` element added to `list1` shows up fine in the difference output.](https://ideone.com/J8Rit7) — user2357112, Oct 24 '18 at 19:36
Not to mention that repeated elements in a single list will be lost by converting to a `set`. — roganjosh, Oct 24 '18 at 19:36
The set approach is better. It runs in linear time vs quadratic time. — Håken Lid, Oct 24 '18 at 19:37
There may be other problems - it's not clear whether the operation you have in mind is actually set difference - but they're not related to length. — user2357112, Oct 24 '18 at 19:37
@user2357112 it would be as soon as a duplicate value is encountered in one of the lists being converted to sets? — roganjosh, Oct 24 '18 at 19:39
Sorry my bad, if you add the values to list 2, then they are not shown. It works fine if you add values to list 1 — Lopoo, Oct 24 '18 at 19:39
Possible duplicate of [Get difference between two lists](https://stackoverflow.com/questions/3462143/get-difference-between-two-lists) — Chris Fowl, Oct 24 '18 at 19:41

Håken Lid · Answer 1 · 2019-08-24T10:15:08.663

If you want the symmetric difference, use the ^ operator instead of -:

def difference(list1, list2):
    return list(set(list1) ^ set(list2))

Here are the four set operators that combine two sets into one set.

| union : elements in one or both of the sets

& intersection : only elements common to both sets

- difference : elements in the left hand set that are not in the right hand set

^ symmetric difference : elements in either set but not in both.
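As a quick illustration, the four operators can be applied to the two lists from the question. The names `a` and `b` are placeholders chosen for this sketch, and the results are sorted only to make the printed order predictable.

```python
# Lists from the question, converted to sets (duplicates and order are lost).
a = set([4, 2, 5, 3, 9, 11])
b = set([7, 9, 2, 3, 5, 1])

union = sorted(a | b)          # elements in one or both sets
intersection = sorted(a & b)   # elements common to both sets
left_only = sorted(a - b)      # elements in a that are not in b
right_only = sorted(b - a)     # elements in b that are not in a
symmetric = sorted(a ^ b)      # elements in exactly one of the sets

print(union)         # [1, 2, 3, 4, 5, 7, 9, 11]
print(intersection)  # [2, 3, 5, 9]
print(left_only)     # [4, 11]
print(right_only)    # [1, 7]
print(symmetric)     # [1, 4, 7, 11]
```

Note that `a - b` and `b - a` give different results, while `a ^ b` contains both of them.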

I think this is a more readable way of writing the function

def symmetric_difference(a, b):
    return {*a} ^ {*b}

(* unpacking in set literals requires Python 3.5 or later)

Returning a set instead of a list makes it a bit more clear what the function does. The input arguments can be any iterable types, and since set is an unordered data type, returning a set makes it obvious that any ordering in the input data was not preserved.

>>> symmetric_difference(range(3, 8), [1,2,3,4])
{1, 2, 5, 6, 7}
>>> symmetric_difference('hello', 'world')
{'d', 'e', 'h', 'r', 'w'}

score 1 · Answer 2 · answered Oct 24 '18 at 19:39

Neither of your versions is symmetrical: if you exchange list1 and list2, the result won't be the same.

If you add a number to list2 (not to list1 as your question states), it's not seen as a difference, even though it is one.

You want to perform a symmetric difference, so that no matter how the data is split between the two lists (swapped or not), the result remains the same:

def difference(list1,list2):
    return list(set(list1).symmetric_difference(list2))

with your data:

[1, 4, 7, 11]
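To make the asymmetry concrete, here is a small sketch that compares the plain set difference with the symmetric one on the question's data. The function names `one_sided` and `two_sided` are made up for this example, and the results are sorted so the output order is predictable.

```python
list1 = [4, 2, 5, 3, 9, 11]
list2 = [7, 9, 2, 3, 5, 1]

def one_sided(first, second):
    # Plain set difference: only values of `first` that are missing from `second`.
    return sorted(set(first) - set(second))

def two_sided(first, second):
    # Symmetric difference: values that appear in exactly one of the inputs.
    return sorted(set(first).symmetric_difference(second))

print(one_sided(list1, list2))          # [4, 11]
print(one_sided(list2, list1))          # [1, 7]
print(two_sided(list1, list2))          # [1, 4, 7, 11]
print(two_sided(list2, list1))          # [1, 4, 7, 11]
print(two_sided(list1, list2 + [100]))  # [1, 4, 7, 11, 100]
```

Swapping the arguments changes the one-sided result but not the symmetric one, and a value added to either list shows up in the symmetric result.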

score -2 · Accepted Answer · answered Oct 24 '18 at 19:48

When I tried your code, it seemed to work fine for me regardless of the lengths of the lists: when I added 100 to list1, it showed up for both difference functions.

However, there appear to be a few issues with your code that could be causing the problems. Firstly, both functions take parameters named list1 and list2, the same names as your global list variables. This does not cause a problem here, but inside the functions the parameters shadow the globals, so the global lists are not reachable by those names. It is generally better practice to avoid confusion by using different names for global variables and function parameters.
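A minimal sketch of what the name clash means in practice. The names `values`, `first_item`, and `first_global_item` are invented for this illustration and are not from the question.

```python
values = [1, 2, 3]  # module-level (global) list

def first_item(values):
    # Inside this function, `values` refers to the argument, not the global.
    return values[0]

def first_global_item(items):
    # With a differently named parameter, the global `values` is still visible.
    return values[0], items[0]

print(first_item([10, 20]))         # 10
print(first_global_item([10, 20]))  # (1, 10)
print(values)                       # [1, 2, 3], the global itself is unchanged
```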

Additionally, your function does not take the symmetric difference: it only loops over the values in the first list, so values unique to the second list will not be counted. To fix this easily, you could combine your lists into a single list, then loop over that entire list and check whether each value is in only one of the lists. This uses ^ to xor the two membership tests, so the condition is true only if the value is in exactly one of the lists. This could be done like so:

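def difference_extra_credit(l1,l2):
    # Use a name other than `list` so the built-in type is not shadowed.
    combined = l1 + l2
    # Keep each value that is in exactly one of the two lists (xor of the membership tests).
    return [value for value in combined if (value in l1) ^ (value in l2)]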

Testing this function myself gave the list [4, 11, 7, 1], and [4, 11, 100, 7, 1] if 100 is added to list1 or list2.
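The list comprehension above scans both lists for every value, and a value that is repeated within one list appears repeatedly in the result. If that matters, one possible variant (a sketch, not the code from the question) uses a set for the membership check and reports each value once, in first-seen order. The name `ordered_symmetric_difference` is hypothetical, and the sketch assumes all values are hashable.

```python
def ordered_symmetric_difference(l1, l2):
    # Values present in exactly one of the two lists (assumes hashable values).
    only_one = set(l1) ^ set(l2)
    seen = set()
    result = []
    for value in l1 + l2:  # walk the values in their original order
        if value in only_one and value not in seen:
            seen.add(value)
            result.append(value)
    return result

print(ordered_symmetric_difference([4, 2, 5, 3, 9, 11], [7, 9, 2, 3, 5, 1]))
# [4, 11, 7, 1]
print(ordered_symmetric_difference([4, 4, 2], [2]))
# [4]
```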
