Why Does Using Hybrid List/Set Comprehension Lower Time Complexity to O(n) Compared to List Comprehension for List Intersection?

Question

The following two functions both evaluate two arrays to output the intersections of integers in the arrays passed to them. However, the second list comprehension which makes use of a set within the comprehension is said to be O(n) while purely list based comprehension is greater than O(n). As a novice to O notation that hasn't been able to find a good answer, it would great if someone could explain why using a set in the comprehension causes this.

Purely List Comprehension:

def intersection(lst1, lst2):
    lst3 = [value for value in lst1 if value in lst2]
    return lst3

List Comprehension Using a Set

def intersection(lst1, lst2):
    temp = set(lst2)
    lst3 = [value for value in lst1 if value in temp]
    return lst3

because membership testing in a `set` is **constant time**, whereas membership testing in a `list` is **linear time**. This has nothing to do with the list comprehension per se — juanpa.arrivillaga, Dec 06 '22 at 20:38
@juanpa.arrivillaga thank you for answering and explaining it isn't to do with the comprehension. What properties of a set allow for their execution with constant time regardless of input size? — SCQs, Dec 06 '22 at 20:54
Their implementation does, they are implemented as hash sets (although performance can degrade to linear, theoretically). There is tons of resources on the web for reading about data structures and algorithms. — juanpa.arrivillaga, Dec 06 '22 at 20:55
@SCQs: See: [Time complexity of python set operations?](https://stackoverflow.com/q/7351459/364696) (which both gives the complexity, and explains in brief how hash tables support this). It's not *guaranteed* `O(1)` membership testing (if literally every value in the `set` hashes to the same "bucket" for a given number of buckets, it's `O(n)`), but if the hash code scheme is designed well and isn't attacked with pathological values, it's typically `O(1)`. — ShadowRanger, Dec 06 '22 at 21:02
The O(1) guarantee can't be made because `set` itself doesn't define the hash function used by any particular type of value stored *in* the set. You have to try to get poor performance, though. — chepner, Dec 06 '22 at 21:24

score 0 · Answer 1 · answered Dec 06 '22 at 20:59

0

if value in lst2 is O(n) but if value in temp is O(1).

answered Dec 06 '22 at 20:59

Mark Ransom

299,747
42
398
622

Why Does Using Hybrid List/Set Comprehension Lower Time Complexity to O(n) Compared to List Comprehension for List Intersection?

1 Answers1