0

If you go to leetcode.com/problems/find-the-duplicate-number/solution/ (problem 287) there is the following solution given:

def findDuplicate(self, nums):
    seen = set()
    for num in nums:
        if num in seen:
            return num
        seen.add(num)

The time complexity of this solution is stated as O(n)

I'm trying to figure out why that is the case. The way I'm thinking of it is if you're given the following list:

x= [1,2,3,4,5,6,7,8,9,10,11,....98,99,100,100] i.e. numbers constantly increasing (or not being the same) up to the duplicated number at the end of the array

Then as you iterate through each element, you have an ever increasing size of the set to look through for the duplicate value. Wouldn't this be longer than O(N) time?

Specifically, if you try to look for 98 in the set, you have to look through 97 (N-4) values to see it's not in the set, then add it in. For 99, you look through 98 values, etc. It seems like an O(N^2) solution in the worst case?

  • 2
    Set-lookup is constant O(1) - thats the beauty of sets. No matter how many items in it, time to check if someting is inside is constant. Your algo is O(N) because - worst case - you need to touch each element of your input list exactly once. – Patrick Artner Nov 06 '19 at 22:47
  • Interesting! So if I go to https://wiki.python.org/moin/TimeComplexity and lookup the set function, I see it says for "x in set" the average time complexity is O(1) but the worst case is O(n). Aren't complexities usually written in worst case format? – Lexington John Nov 06 '19 at 22:53
  • @LexingtonJohn see this regarding worst case for `x in set`: https://stackoverflow.com/questions/29004694/sets-python-worst-case-complexity It's very unlikely that you will make a set where everything hashes to the same key. – Mark Nov 06 '19 at 23:04

1 Answers1

0

In a way, your view is right. Taking an example, let the number to be found be 100. SO, looping around with 'for' loop would be O(n) and then finding the value in the set would be O(n)/O(1).

Average time complexity for finding values in set is O(1), while worst case complexity is O(n). In this case, we have the worst case complexity, therefore, complexity os O(n^2).

But if you read the solution given on leetcode, they mention 'amortized constant time complexities', which means that there might be a worst case once in a while. So they basically ignore it and look at the average time complexity which comes out to be O(1).

Therefore the time complexity is O(1).

  • That makes sense! Thank you. I have a coding interview in a few weeks and just wanted to make sure I wasn't going crazy! I'll look into amortized constant time complexities" some more. – Lexington John Nov 06 '19 at 23:01