1

There are N distinct numbers which are given not in sorted order. How much time it will take to select a number say which is neither k-th minimum nor k-th maximum?

I tried like this =>

Take initial k + 1 numbers and sort them in O(k log k). Then pick up kth number in that sorted list, that will be neither the kth minimum nor kth maximum .

Hence, time complexity = O(K log k)

Example =>

Select a number which is neither the 2nd minimum nor 2nd maximum.

array[] = {3,9,1,2,6,5,7,8,4}

Take initial 3 numbers or subarray = 3,9,1 and sorted subarray will be = 1,3,9

Now pick up 2nd element 3. Now, 3 is not the 2nd minimum nor 2nd maximum .

Now, time complexity = O(k lg k) = O(2 lg 2) = O(1).

Garrick
  • 677
  • 4
  • 15
  • 34
  • 1
    Your algorithm is wrong -- for example [3, 1, 2] and k=1. Your algorithm selects "3", but it's the maximum. You're almost there though -- I provide a correct version in my answer, which also reduces the runtime to linear. – Paul Hankin Sep 05 '16 at 12:40
  • Thanks, i got my big mistake – Garrick Sep 05 '16 at 12:42
  • Possible duplicate of [How to find the kth largest element in an unsorted array of length n in O(n)?](http://stackoverflow.com/questions/251781/how-to-find-the-kth-largest-element-in-an-unsorted-array-of-length-n-in-on) – Saeid Sep 05 '16 at 12:55
  • 2
    You can't simply replace variables with numbers in `big O` notation. – Saeid Sep 05 '16 at 12:57
  • 2
    @sudomakeinstall2 I don't see how it's a dupe of that question. A solution involves using the linked question as a subroutine, but the overall problem is not the same at all. – Paul Hankin Sep 05 '16 at 12:58
  • @Paul, Will the complexity be same, even if 2k + 1 > N. – Garrick Sep 05 '16 at 15:24

3 Answers3

7

The problem is trivial if N < k. Otherwise there's no k'th largest or smallest element in the array -- so one can pick any element (for example the first) in O(1) time.

If N is large enough you can take any subset of size 2k+1 and choose the median. Then you have found a number that is guaranteed not to be the kth largest or smallest number in the overall array. In fact you get something stronger -- it's guaranteed that it will not be in the first k or last k numbers in the sorted array.

Finding a median of M things can be done in O(M) time, so this algorithm runs in O(k) time.

I believe this is asymptotically optimal for large N -- any algorithm that considers fewer than k items cannot guarantee that it chooses a number that's not the kth min or max in the overall array.

If N isn't large enough (specifically N < 2k+1), you can find the minimum (or second minimum value if k=1) in O(N) time. Since k <= N < 2k+1, this is also O(k).

There are three cases where no solution exists: (k=1, N=1), (k=1, N=2), (k=2, N=2).

If you only consider cases where k <= N, then the complexity of the overall algorithm is O(k). If you want to include the trivial cases too then it's somewhat messy. If I(k<=N) is the function that's 1 when k<=N and 0 otherwise, a tighter bound is O(1 + k*I(k<=N)).

Paul Hankin
  • 54,811
  • 11
  • 92
  • 118
2

I think there many points that must be noticed in your solution:

Firstly it would require to take 2k+1 elements instead of k+1 in your solution. More specifically you take :

array[] = {3,9,1,2,6,5,7,8,4}

Take initial 3 numbers or subarray = 3,9,1 and sorted subarray will be = 1,3,9

Now pick up 2nd element 3. Now, 3 is not the 2nd minimum nor 2nd maximum .

but to check that 3 is not the 2nd minimum nor 2nd you can't do it with your k+1 elements:subarray = 3,9,1 you have to check the array to see what is the 2 max and min and check your solution.

On the other hand by taking 2k+1 elements and sorting them ,since your elements are distinct you would know that the k+1 element is greater from the k first elements and smaller from the k last elements of your sorted subarray.

In your example you could see:

array[] = {3,9,1,2,6,5,7,8,4} subarray[]={3,9,1,2,6} then sort the subarray :{1,2,3,6,9} ,and give as an answer the number 3 .

An example where your solution would not be rigt: array[] = {9,8,2,6,5,3,7,1,4} where your algorithm would return the number 2 which is the second min .

As of terms of complexity .By taking 2k+1 elements it would not change the complexity that you found because it would be O((2k+1)log(2k+1)) which is O(klog(k)).

Clearly if n<2k+1 the above algorithm won't work ,so you will have to sort the entire array which would take nlog n , but in this case n<2k+1 so it O(klogk).

Finally the algorithm based on the above will be O(klog k) .A thing that might be confusing is that the problem has two parameters k,n .If K is much smaller than n this is efficient algorithm since you don't need to look and short the n-size array but when k,n are very close then it is the same as sorting the n-size array .

One more thing that you should understand is that big O notation is way of measuring the time complexity when an input n is given to the algorithm ,and shows the asymptotic behavior of the algorithm for big input n. O(1) denotes that the algorithm is running ALWAYS in constant time .So in the end when you refer:

Now, time complexity = O(k lg k) = O(2 lg 2) = O(1).

This is not Right you have to measure the complexity with k being the input variable and not a constant ,and this shows the behavior of the algorithm for a random input k. Clearly the above algorithm doesn't take O(1) (or else constant time) it takes O(k log(k)).

Finally ,after searching for a better approach of the problem, if you want a more efficient way you could find kth min and kth max in O(n) (n is the size of the array) .And with one loop in O(n) you could simply select the first element which is different from kth min and max. I think O(n) is the lowest time complexity you can get since finding kth min and max take the least O(n).

For how to find kth min,max in O(n) you could see here: How to find the kth largest element in an unsorted array of length n in O(n)?

This solution is O(n) while previous solution was O(klog k) .Now for k parameter close to n ,as explained above it is the same as O(n log(n)) ,so in this occasion the O(n) solution is better .But if most of the times k is much smaller than n then then O(k log k) may be better .The good thing with the O(n) solution (second solution) is that in all cases it takes O(n) regardless to k so it is more stable but as mentioned for small k the first solution may be better (but in the worst case it can reach O(nlogn)).

Community
  • 1
  • 1
coder
  • 12,832
  • 5
  • 39
  • 53
1

You can sort the entire list in pseudo-linear time using radix-sort and select the k-th largest element in constant time.

Overall it would be a worst-case O(n) algorithm assuming the size of the radix is much smaller than n or you're using a Selection algorithm.

O(n) is the absolute lower bound here. There's no way to get anything better than linear because if the list is unsorted you need to at least examine everything or you might miss the element you're looking for.

apokryfos
  • 38,771
  • 9
  • 70
  • 114
  • Radix sort is suitable, when size of integer in bits is given, then only it is best to apply radix sort . – Garrick Sep 07 '16 at 17:00
  • @stackuser it's not an unrealistic assumption in computer science. Most cases it's either 32 bits or 64 bits. At any rate even using a good selection algorithm O(n) is your lower bound for an unsorted list as input. – apokryfos Sep 08 '16 at 08:00