2

I've the following code and I'm trying to get the time complexity.

seen = set()
a=[4,4,4,3,3,2,1,1,1,5,5]
result = []
for item in a:
    if item not in seen:
        seen.add(item)
        result.append(item)
print (result)

As far as my understanding goes as I'm accessing the list the time complexity for that operation would be O(n). As with the if block each time I've a lookup to the set and that would cost another O(n). So is the overall time complexity O(n^2)? Does the set.add() also add to the complexity?

Also, with the space complexity is it O(n)? Because the size of the set increases each time it encounters a new element?

Any input or links to get a proper insight into time and space complexities is appreciated.

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
sharpie
  • 23
  • 1
  • 4

1 Answers1

4

Sets in Python are implemented as hash tables (similar to dictionaries), so both in and set.add() are O(1). list.append() is also O(1) amortized.

Altogether, that means the time complexity is O(n), due to the iteration over a.

Space complexity is O(n) as well, because the maximum space required is proportional to the size of the input.

A useful reference for the time complexity of various operations on Python collections can by found at https://wiki.python.org/moin/TimeComplexity … and the PyCon talk The Mighty Dictionary provides an interesting delve into how Python achieves O(1) complexity for various set and dict operations.

Community
  • 1
  • 1
Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
  • Thank you so much, a quick question: What if I use a `list` instead of a `set` will that change anything? Because `lists` in Python are implemented using `C arrays`(correct me if I'm wrong), and the access time is `O(n)` and assume I'm using `insert` instead of `append` so, insert cost is `O(n)`. Will the over all time still remain the same now? – sharpie Mar 10 '17 at 05:05
  • As noted in the wiki page linked above, `in` for lists is O(N) because the list has to be iterated over until a match is found (and yes, they're C arrays). Since you only care about membership, the order of `seen` isn't important, so you might as well append items to it – which is O(1) amortized because of the way "spare" space is managed – rather than insert them. Either way, the upshot would be an O(n²) algorithm. – Zero Piraeus Mar 10 '17 at 05:19
  • ahh, I get it now. Thank you so much! – sharpie Mar 10 '17 at 05:24
  • You're welcome :-) Aside: I'm guessing this is a homework-related question. That's fine … they're only frowned on around here when people beg for an answer without showing any evidence of having put any thought in for themselves, which is clearly not so in your case. That said, I would highly recommend the PyCon video linked in the answer, which is both fascinating and educational. – Zero Piraeus Mar 10 '17 at 05:25
  • No this ain't homework, I'm preparing for an interview. So, apart from programming I need to understand the run time complexities too. Yea, I'm going through the video now. Thanks again. :) – sharpie Mar 10 '17 at 05:28