0

I have been following Neetcode and working on several Leetcode problems. Here on 347 he advises that his solution is O(n), but I am having a hard time really breaking out the solution to determine why that is. I feel like it is because the nested for loop only runs until len(answers) == k.

I started off by getting the time complexity of each individual line and the first several are O(n) or O(1), which makes sense. Once I got to the nested for loop I was thrown off because it would make sense that the inner loop would run for each outer loop iteration and result in O(n*m), or something of that nature. This is where I think the limiting factor of the return condition comes in to act as the ceiling for the loop iterations since we will always return once the answers array length equals k (also, in the problem we are guaranteed a unique answer). What is throwing me off the most is that I default to thinking that any nested for loop is going to be O(n^2), which doesn't seem to be uncommon, but seems to not be the case every time.

Could someone please advise if I am on the right track with my suggestion? How would you break this down?

class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
        
        countDict = {}
        
        frequency = [[] for i in range(len(nums)+1)]
        
        for j in nums:
            countDict[j] = 1 + countDict.get(j, 0)  
            
        for c, v in countDict.items():
            frequency[v].append(c)
            
        answer = []
        
        for n in range(len(frequency)-1, 0, -1):
            for q in frequency[n]:
                print(frequency[n])
                answer.append(q)
                if len(answer) == k:
                    return answer
dSisk
  • 25
  • 6

1 Answers1

1

frequency is mapping between frequency-of-elements and the values in the original list. The number of total elements inside of frequency is always equal or less than the number of original items in in nums (because it is mapping to the unique values in nums).

Even though there is a nested loop, it is still only ever iterating at some O(C*n) total values, where C <= 1 which is equal to O(n).


Note, you could clean up this answer fairly easily with some basic helpers. Counter can get you the original mapping for countDict. You can use a defaultdict to construct frequency. Then you can flatten the frequency dict values and slice the final result.

from collections import Counter, defaultdict


class Solution:
    def top_k_frequent(self, nums: list[int], k: int) -> list[int]:
        counter = Counter(nums)
        frequency = defaultdict(list)
        for num, freq in counter.items():
            frequency[freq].append(num)

        sorted_nums = []
        for freq in sorted(frequency, reverse=True):
            sorted_nums += frequency[freq]

        return sorted_nums[:k]

A fun way to do this in a one liner!

class Solution:
    def top_k_frequent(self, nums: list[int], k: int) -> list[int]:
        return sorted(set(nums), key=Counter(nums).get, reverse=True)[:k]
flakes
  • 21,558
  • 8
  • 41
  • 88
  • When you reference O(C*n), and that C <= 1, what does C represent? Thank you for your answer! – dSisk Oct 02 '22 at 20:01
  • 1
    @dSisk `C` in this case is some constant scalar. Say you loop through a list 2 times, you could say its `O(2*n)` or `O(C*n)` where `C = 2`. By definition this constant doesn't matter because big O describes the growth rate (rather than absolute values). i.e. `O(2n) == O(Cn) == O(n)`. In this case, C would be some value less than or equal to 1, because the number of values iterated is less than the original amount in the `nums` list, however, that value is a constant which is dropped in big O. See also https://stackoverflow.com/questions/29954109/why-do-we-ignore-co-efficients-in-big-o-notation – flakes Oct 02 '22 at 20:11
  • that makes sense. So even though it is a set of nested for loops, at worst we still only iterate for `len(count)` because frequency will only have up to the amount of elements in count. Right? I think I have the idea. Not sure if I explained my thought process well. – dSisk Oct 02 '22 at 22:31
  • 1
    Sounds like you got it. e.g. `nums [1,1,2,3] -> freq {1: [2, 3], 2: [1]}` which has 3 total values, down from 4 in the original list – flakes Oct 02 '22 at 22:40