0

"Find the K-th largest number in the array" problem:

inputs: [3,2,1,5,6,4], k = 2
outputs: 5

inputs: [3,2,3,1,2,4,5,5,6], k = 4
outputs: 4

I know that this problem can be soleved with quick select and min heap algorithms. However, what I'm focusing here is using max heap. The steps is the following:

1. Build max heap form the whole given array, inplace.
2. Iterate k times. In each iteration, Take and remove the heap top.
3. The last heap top taken at the k-th iteration is the result.

Below is the code in cpp:

void swap(vector<int>& nums, int i, int j) {
    int tmp = nums[i];
    nums[i] = nums[j];
    nums[j] = tmp;
}

void heapify_down(vector<int>& nums, int parent_idx, int end_idx) {
    int left_idx = parent_idx * 2 + 1, right_idx = left_idx + 1;
    while (left_idx <= end_idx) {
        int largeset_idx = parent_idx;
        if (nums[left_idx] > nums[largeset_idx]) largeset_idx = left_idx;
        if (right_idx <= end_idx && nums[right_idx] > nums[largeset_idx]) largeset_idx = right_idx;
        if (largeset_idx != parent_idx) {
            swap(nums, parent_idx, largeset_idx);
            parent_idx = largeset_idx;
            left_idx = parent_idx * 2 + 1;
            right_idx = left_idx + 1;
        }
        else {
            return ;
        }
    }
}

void build_heap(vector<int>& nums) {
    for (int i = nums.size() - 1; i >= 0; i--) heapify_down(nums, i, nums.size() - 1);
}

int findKthLargest(vector<int>& nums, int k) {
    build_heap(nums);
    int cnt = 0, res, cur_end = nums.size() - 1;
    while (cnt != k) {
        res = nums[0];
        cnt += 1;
        swap(nums, 0, cur_end);
        cur_end -= 1;
        heapify_down(nums, 0, cur_end);
    }

    return res;
}

What is the time complextity of this method? I'm using bottom-up approach in the first step, this step should be O(n). However, the while loop makes me confused. The loop execute k times, each loop will call heapify_down which is O(log(n)) comlexity. So is the overall complexity O(n + k * log(n)) = O(max(n, k * log(n)))? Correct me if I'm wrong.

YQ.Wang
  • 1,090
  • 1
  • 17
  • 43
  • `build_heap` alone is `O(n*log n)` already (assuming you are correct in assessing that `heapify_down` is `O(log n)`; `build_heap` calls it `n` times). – Igor Tandetnik Jul 04 '22 at 03:56
  • 4
    @IgorTandetnik tighter bound for `build_heap` will be O(n) https://stackoverflow.com/questions/9755721/how-can-building-a-heap-be-on-time-complexity – Manav Chhibber Jul 04 '22 at 04:09
  • https://en.wikipedia.org/wiki/Selection_algorithm for people skimming the first line. Changing from min to max doesn't really matter (since you're looking for kth largest you'd just use max, but it's the same otherwise) – Kenny Ostrom Jul 04 '22 at 13:58

1 Answers1

0

Building a heap, if you do it right, is O(n) and removing the top is O(log n) and with k being unbound it could be n.

So overall it is O(n + k * log n) == O(n + n * log n) == O(n log n).

Don't you have some more information, like k < log n or something?


When building your heap don't use heapify_down, because that results in O(n * log n). Instead first ignore the last element if the heap has odd size. Then start at i = size / 2 - 1 and swap i, 2 * i and 2 * i + 1 around so the largest is at i. Repeat till i-- = 0. If the size was odd add the last element to the heap.

Goswin von Brederlow
  • 11,875
  • 2
  • 24
  • 42
  • The only constraint upon k is that it is less than n. So the upper bound of the time complexity will no more than `O(n * log n)` and this should be the time complexity for this solution. Am I right? In terms of `heapify_down`, I think I implement the same idea as you described. The only difference is that, I start at the end of the array, and those leaves are called but not go into the while loop. I think the time complexity of my `heapify_down` is still `O(n)`. – YQ.Wang Jul 05 '22 at 15:17
  • If it is then you did it wrong. It should be `O(log n)` otherwise the total becomes `O(n^2)`. – Goswin von Brederlow Jul 05 '22 at 15:37