0

The following two functions both evaluate two arrays to output the intersections of integers in the arrays passed to them. However, the second list comprehension which makes use of a set within the comprehension is said to be O(n) while purely list based comprehension is greater than O(n). As a novice to O notation that hasn't been able to find a good answer, it would great if someone could explain why using a set in the comprehension causes this.

Purely List Comprehension:

def intersection(lst1, lst2):
    lst3 = [value for value in lst1 if value in lst2]
    return lst3

List Comprehension Using a Set

def intersection(lst1, lst2):
    temp = set(lst2)
    lst3 = [value for value in lst1 if value in temp]
    return lst3
juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172
SCQs
  • 43
  • 6
  • because membership testing in a `set` is **constant time**, whereas membership testing in a `list` is **linear time**. This has nothing to do with the list comprehension per se – juanpa.arrivillaga Dec 06 '22 at 20:38
  • @juanpa.arrivillaga thank you for answering and explaining it isn't to do with the comprehension. What properties of a set allow for their execution with constant time regardless of input size? – SCQs Dec 06 '22 at 20:54
  • Their implementation does, they are implemented as hash sets (although performance can degrade to linear, theoretically). There is tons of resources on the web for reading about data structures and algorithms. – juanpa.arrivillaga Dec 06 '22 at 20:55
  • @SCQs: See: [Time complexity of python set operations?](https://stackoverflow.com/q/7351459/364696) (which both gives the complexity, and explains in brief how hash tables support this). It's not *guaranteed* `O(1)` membership testing (if literally every value in the `set` hashes to the same "bucket" for a given number of buckets, it's `O(n)`), but if the hash code scheme is designed well and isn't attacked with pathological values, it's typically `O(1)`. – ShadowRanger Dec 06 '22 at 21:02
  • The O(1) guarantee can't be made because `set` itself doesn't define the hash function used by any particular type of value stored *in* the set. You have to try to get poor performance, though. – chepner Dec 06 '22 at 21:24

1 Answers1

0

if value in lst2 is O(n) but if value in temp is O(1).

Mark Ransom
  • 299,747
  • 42
  • 398
  • 622