5

I know that we can find the 2nd largest element in an array of size N in N+log(N)-2 using a "tournament" algorithm. Now I wonder if we can find the k-th largest element using a similar "tournament".

I know there is an O(N) "selection" algorithm to find the k-th largest element. It uses Quick Select with a "good" pivot, which can be found in O(N). We can build also a heap from the array in O(N) and retrieve k element from the heap.

I wonder if there is another approach.

Community
  • 1
  • 1
Michael
  • 41,026
  • 70
  • 193
  • 341
  • 2
    http://en.wikipedia.org/wiki/Selection_algorithm gives a pretty good summary. –  Jun 20 '12 at 14:58
  • Using heap takes O(n + k log n), not O(n), Also can I ask you why you looking for other approaches? (is this a homework?). – Saeed Amiri Jun 20 '12 at 19:50
  • @SaeedAmiri `O(n + k log n)` is exactly `O(n)` :) This is not a homework, just my curiosity. – Michael Jun 21 '12 at 05:12
  • @Michael, No, it's not `O(n)`, if `k` is constant yes, but e.g if k=n/2 it will be O(n logn) far from linear `O` that you mentioned (If it was O(n), sure people weren't crazy to invent selection algorithm). – Saeed Amiri Jun 21 '12 at 07:53
  • @SaeedAmiri Thanks. You are right, I got it now. – Michael Jun 21 '12 at 08:21
  • If you take the deterministic version of selection algorithm (pivot is chosen as median of medians of groups of 5) it can be made to look like a tournament as well though it would be a fairly complex tree. – Amit Prakash Jul 09 '12 at 00:58

1 Answers1

3

I believe you can make this an O(N log k) algorithm: Iterate over the array, and maintain a min-heap of the largest k elements encountered so far. So the first k elements go directly into the heap. Every subsequent element will be compared against the tip of the heap, and if it is larger, the tip will be removed from the heap and the new element inserted, which is an O(log k) operation for a heap of size k. When the algorithm is done, and the sequence had a length of at least k, then the tip of the heap will have the kth largest element, and the rest of the heap the larger elements.

This approach has inferior worst-case behaviour than the median-of-medians O(n) solution, but will be much easier to implement and yield rather good behaviour for small k. So it might be well suited for many practical applications.

MvG
  • 57,380
  • 22
  • 148
  • 276