0

I have implemented an algorithm that solves the problem of finding the kth smallest element in an unsorted array. I have used the heap structure, and optimized the code by relying on this formula,

      k1 = n - k + 1

k1 being the k1th largest element, so I go for the smaller of k and k1.

Still, I couldn't pass the time limit error on an online judge. I don't know if there will be any further more better complexity having in mind that I have to create an array no more than the size of k; maybe less than k possible? Or there is another way to solve this problem other than using the heap structure.

1 <= k <= n <= 105

The code:

#define _CRT_SECURE_NO_WARNINGS
#include <iostream>

using namespace std;

void minHeapify(int arr[], int n, int i)
{
    int largest = i; // Initialize largest as root 
    int l = 2 * i + 1; // left = 2*i + 1 
    int r = 2 * i + 2; // right = 2*i + 2 

    if (l < n && arr[l] < arr[largest])
        largest = l;

    if (r < n && arr[r] < arr[largest])
        largest = r;

    if (largest != i) {
        swap(arr[i], arr[largest]);

        minHeapify(arr, n, largest);
    }
}

void maxHeapify(int arr[], int n, int i)
{
    int smallest = i; // Initialize largest as root 
    int l = 2 * i + 1; // left = 2*i + 1 
    int r = 2 * i + 2; // right = 2*i + 2 

    if (l < n && arr[l] > arr[smallest])
        smallest = l;

    if (r < n && arr[r] > arr[smallest])
        smallest = r;

    if (smallest != i) {
        swap(arr[i], arr[smallest]);

        maxHeapify(arr, n, smallest);
    }
}

void buildMinHeap(int a[], int n) {
    for (int i = n / 2; i >= 0; i--)
        minHeapify(a, n, i);
}

void buildMaxHeap(int a[], int n) {
    for (int i = n / 2; i >= 0; i--)
        maxHeapify(a, n, i);
}

int kthsmallest(int minHeap[], int k, int n) {
    int i, temp;
    for (i = 0; i < k; i++)
        cin >> minHeap[i];

    buildMaxHeap(minHeap, k);

    for (i = k; i < n; i++)
    {
        cin >> temp;
        if (temp < minHeap[0])
        {
            minHeap[0] = temp;
            maxHeapify(minHeap, k, 0);
        }
    }
    return minHeap[0];
}

int kthlargest(int minHeap[], int k, int n) {    
    int i, temp;
    for (i = 0; i < k; i++)
        cin >> minHeap[i];

    buildMinHeap(minHeap, k);

    for (i = k; i < n; i++)
    {
        cin >> temp;
        if (temp > minHeap[0])
        {
            minHeap[0] = temp;
            minHeapify(minHeap, k, 0);
        }
    }
    return minHeap[0];    
}

int main() {//kth smallest element    
    freopen("input.txt", "r", stdin);
    freopen("output.txt", "w", stdout);
    int n, k, k1;
    cin >> n >> k;
    k1 = n - k + 1;//kth smallest element is the same as k1th largest element

    if (k < k1) {
        int *minHeap = new int[k];
        cout << kthsmallest(minHeap, k, n);
    }    
    else {
        int *minHeap = new int[k1];
        cout << kthlargest(minHeap, k1, n);
    }
    return 0;
}

Please if you could help finding a better time complexity?

Problem:

Find the kth largest element of an array

Memory limit: 256 MBs
Time limit: 1 s
Input: input.txt
Output: output.txt


Task:

You are given an array of n integers and a natural k.
You have to find the kth largest element of the array.
You can't create array consisting of more than k elements.

Input:

The first line contains a natural n (1 ≤ n≤105) – the quantity of elements of the array, and the natural k.
The second line contains n numbers – the elements of the array.

Output:

The kth largest element of the array.

Example:

Input        | Output
-------------+-----------
6 2          | 7
7 4 6 3 9 1  | 
trincot
  • 317,000
  • 35
  • 244
  • 286
v_head
  • 759
  • 4
  • 13
  • I see a couple wrong bugs that can give the wrong answer, and the obvious memory leak, but nothing that takes too long. Are you sure you're handling the input correctly? Also, I'm pretty sure that freopen(...stdin) thing is undefined behaviour -- maybe it doesn't work in the online judge environment. – Matt Timmermans Dec 05 '19 at 13:16
  • @trincot exactly – v_head Dec 05 '19 at 17:03
  • @trincot Yeah, I did misread the input. Comment removed. – Jim Mischel Dec 05 '19 at 22:27
  • @trincot, Hi :) Could you help with this problem please https://stackoverflow.com/questions/59476398/maximum-matrix-cost-path-only-two-moves-allowed-two-to-right-one-down-or-two?noredirect=1#comment105130632_59476398 – v_head Dec 25 '19 at 11:29
  • I'll have a look, but could you also review the answers here, and mark one as accepted? It is the fuel that makes this site going. Did you ever find out whether the tests in the challenge were corrupted somehow, or (otherwise) find someone who did pass the tests? – trincot Dec 25 '19 at 11:34
  • @yes I have passed the test. – v_head Dec 25 '19 at 12:11

2 Answers2

1

The time complexity is optimal, but you can make your code a tiny bit more efficient:

  • Don't use recursion, but an iterative solution
  • Don't use swap, but keep the original value in memory while copying child values to their parents and only store the initial value once you have reached the appropriate slot.
  • Don't perform twice 2 * i: the other child node is just the next one.
  • Let the heapify functions take an extra argument, which can be either the current value at index i, or the replacement value for it. This saves one assignment.

Here is how that would look for two heapify functions:

void minHeapify(int arr[], int n, int i, int key) { // add key as parameter
    while (true) { // iterative
        int child = 2 * i + 1; // do this only for left child, and limit number of variables
        if (child+1 < n && arr[child] > arr[child+1]) // get child with least value
            child++; // the right child is just one index further
        if (child >= n || key <= arr[child]) break;
        arr[i] = arr[child]; // don't swap, just copy child value to parent
        i = child; // move down
    }
    arr[i] = key; // finally put the original value in the correct place
}

void maxHeapify(int arr[], int n, int i, int key) { // add key as parameter
    while (true) { // iterative
        int child = 2 * i + 1; // do this only for left child, and limit number of variables
        if (child+1 < n && arr[child] < arr[child+1]) // get child with greatest value
            child++; // the right child is just one index further
        if (child >= n || key >= arr[child]) break;
        arr[i] = arr[child]; // don't swap, just copy child value to parent
        i = child; // move down
    }
    arr[i] = key; // finally put the original value in the correct place
}

void buildMinHeap(int a[], int n) {
    for (int i = n / 2; i >= 0; i--)
        minHeapify(a, n, i, a[i]); // pass a[i] also
}

void buildMaxHeap(int a[], int n) {
    for (int i = n / 2; i >= 0; i--)
        maxHeapify(a, n, i, a[i]); // pass a[i] also
}

int kthsmallest(int heap[], int k, int n) {
    int i, temp;
    for (i = 0; i < k; i++)
        cin >> heap[i];

    buildMaxHeap(heap, k);

    for (i = k; i < n; i++) {
        cin >> temp;
        if (temp < heap[0])
            maxHeapify(heap, k, 0, temp); // pass temp
    }
    return heap[0];
}

int kthlargest(int heap[], int k, int n) {
    int i, temp;
    for (i = 0; i < k; i++)
        cin >> heap[i];

    buildMinHeap(heap, k);

    for (i = k; i < n; i++) {
        cin >> temp;
        if (temp > heap[0])
            minHeapify(heap, k, 0, temp); // pass temp
    }
    return heap[0];
}

In main function you could make a special case for when k == 1 or k == n, so no heap is needed, just min() or max().

One strange thing is that the challenge you link to speaks of "kth largest" while you speak of "kth smallest". Maybe you mixed up.

So here is the code when the job is to return the kth smallest. But please check the challenge whether you should not have done it for kth largest:

int main() {//kth smallest element    
    freopen("input.txt", "r", stdin);
    freopen("output.txt", "w", stdout);
    int n, k, k1;
    cin >> n >> k;
    k1 = n - k + 1;//kth smallest element is the same as k1th largest element

    if (k == 1) {
        int curr, next;
        cin >> curr;
        for (int i = 1; i < n; i++) {
            cin >> next;
            curr = min(curr, next);
        }
        cout << curr;
    } else if (k1 == 1) {
        int curr, next;
        cin >> curr;
        for (int i = 1; i < n; i++) {
            cin >> next;
            curr = max(curr, next);
        }
        cout << curr;
    } else if (k < k1) {
        int *heap = new int[k];
        cout << kthsmallest(heap, k, n);
    } else {
        int *heap = new int[k1];
        cout << kthlargest(heap, k1, n);
    }
    return 0;
}
trincot
  • 317,000
  • 35
  • 244
  • 286
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/203729/discussion-on-answer-by-trincot-kth-smallest-element-cant-create-array-more-th). – Samuel Liew Dec 06 '19 at 07:47
  • 1
    Yeah, I just mix them sometimes, sorry, because I have two challenges, one for smallest and one for largest – v_head Dec 06 '19 at 07:58
  • OK for the flattened `if ... else if`, but I put back in the case for where k1 == 1 and a `max` needs to be done. I am not in favor of capitalising the initial letter of `heap` as it is not a class, but an instance. So I reverted that also. – trincot Dec 06 '19 at 10:04
  • It still hasn't passed that. So in principal, I tested that code for that specific case to get passed. What can be the problem? Maybe the online judge couldn't go over 10^7 in one second? This will be very weird! – v_head Dec 06 '19 at 10:44
  • Maybe there is an unspecified constraint that input values are strictly positive and then you can break out as soon as you read a 1 (when k is 1) – trincot Dec 06 '19 at 11:34
  • I tried this and it gave me wrong answer 21. So, it means it should be as it is, but still with time limit. – v_head Dec 06 '19 at 12:35
  • I think I am getting confused by the fact you're doing two different challenges: one for k-min, one for k-max. It would really be helpful if your question would only be about one. If this is about finding the kth ***least*** value with k=1, and the series start with 1 2 2 2 ... then the answer can never be 21. It should at most be 1. If it turns out that the input values are always strictly positive, then for the kth ***least*** value with k=1 you can stop reading input when you find a 1, as that is the minimum. – trincot Dec 06 '19 at 14:06
  • series are not 1 2 2 2, but n = 10000000, k = 1, series start 2 2 2 2 2 2 ... . I meant by the wrong answer, wrong answer number 21, test case 21 the problem failed. I am just focusing on the kth smallest element, because the largest is just some twists and it is implemented. So, even doing this linear search for k = 1 or k = n, and just for n = 10^7, it didn't pass that particular test case. What else can be thought of? – v_head Dec 06 '19 at 16:53
  • Find someone who was able to pass the test. If for k=1 the thing still times out with the most simple loop that rreads and takes minimum, then I don't know what you could do better. – trincot Dec 06 '19 at 17:12
  • 1
    @Exactly, I really got confused about the problem. I ll see thank you. – v_head Dec 06 '19 at 17:16
0

You're making the assumption that using a smaller heap is always the best choice. You might want to re-think that.

For example, imagine you want to select the 96th smallest number from a list of 100. If you use a heap of size 96, then you'll do:

  1. Build a heap with 96 items. buildHeap is O(n), and in this case n is 96.
  2. Do up to 4 insertions into a heap of 96 items. That'll be 4*log(96).

If you use a heap of size 4, then you'll do:

  1. Build a heap with 4 items.
  2. Do up to 96 insertions into a heap of 4 items. That'll be 96*log(4).

The first option is 96 + 4*log(96). The base-2 log of 96 is about 6.58. So the insertions will cost 26.32, for a total of 122.32.

The second option, with the smaller heap, is 4 + 96*log(4). log(4) is 2, so you end up with 4 + 196, or a total of 196.

The smaller heap is a big loser here.

In general, you want to use the larger heap when (k + (n-k)*log(k)) < ((n-k) + k*log(n-k)).

Also:

The real-world running time of the heap selection algorithm is kind of sensitive to the order in which items are presented. For example, if you're looking for 1000th smallest number in an array of 100,000, it's going to run much faster if the array is in ascending order than if it's in descending order. The reason?

Because in the ascending case, you build your initial heap with the first 1,000 items and then you never have to modify the heap again because there is none of the following items are smaller than the largest item on the heap.

But if the array is in descending order, then every item you look at will be smaller than the largest item on the heap, which means you'd be doing a heap insertion for all 99,000 remaining items.

Imagine how your code would perform if one of the test cases is a large array in descending order.

Unless you've already proven that your way of selecting which heap size to use is clearly better, you might want to consider just going with "select kth smallest," using a maxheap of size k, regardless.

Jim Mischel
  • 131,090
  • 20
  • 188
  • 351
  • Interesting, so you 're suggesting for finding the kth smallest element, I would go with a maxheap? what about finding the largest one, What can you suggest? – v_head Dec 05 '19 at 15:16
  • Yes. To find the kth smallest, use a max-heap. To find the kth largest, use a min-heap. – Jim Mischel Dec 05 '19 at 22:29