Say, I wish to compute the difference of two lists C = A - B
:
A = [1,2,3,4,5,6,7,8,9]
B = [1,3,5,8,9]
C = [2,4,6,7] #Result
A
and B
are both sorted with unique integers (not sure if there is a way to tell Python about this property of the list). I need to preserve the order of the elements. AFAIK there are two possible ways of doing it
Method 1: Convert B into a set and use list comprehension to generate C:
s = set(B)
C = [x for x in A if x not in s]
Method 2: Directly use list comprehension:
C = [x for x in A if x not in B]
Why is #1
more efficient than #2
? Isn't there an overhead to convert to a set? What am I missing here?
Some performance benchmarks are given in this answer.
UPDATE: I'm aware that a set's average O(1)
lookup time beats that of a list's O(n)
but if the original list A
contains about a million or so integers, wouldn't the set creation actually take longer?