0

i want to find approx median in unsorted list,i know two algorithm

algorithm 1- quickselect

algorithm 2- Median of medians

i can't use quickselect in my project because it take O(n^2) in worst case. i heard about Median of medians,but my colleagues suggest that it takes O(n) with some constant factor.therefore its time complexity is Cn and constant factor is is large compare to quickselect. i want to know what is the constant factor associated with Median of medians ?and why Median of medians not use pseudo median of 9 element ?
or is their any other algorithm to find approx median in linear time O(n) ?

asd
  • 215
  • 3
  • 9
  • see http://stackoverflow.com/questions/9489061/understanding-median-of-medians-algorithm – Aseem Goyal Feb 18 '14 at 12:50
  • possible duplicate of [Finding the median of an unsorted array](http://stackoverflow.com/questions/10662013/finding-the-median-of-an-unsorted-array) – Jim Mischel Feb 18 '14 at 14:19
  • 1
    You're asking 3 rather distinct questions - it would probably be better to separate these into 3 different questions, because asking multiple questions in a question doesn't really work in the [so] format (there's already an answer for the last, so that would be the obvious choice to leave in this question). [Wikipedia](http://en.wikipedia.org/wiki/Median_of_medians) answers your second question, assuming you understand it. The first two questions would probably both be better suited on [cs.se], but might both require a bit of work on your part to try to figure it out yourself. – Bernhard Barker Feb 18 '14 at 16:01

1 Answers1

0

Although I wouldn't be quick to discard quickselect, since its worst-case performance is greatly improbable with properly-chosen pivots...

Perhaps introselect:

Introselect (short for "introspective selection") is a selection algorithm that is a hybrid of quickselect and median of medians which has fast average performance and optimal worst-case performance.

Introselect works by optimistically starting out with quickselect and only switching to the worst-time linear algorithm if it recurses too many times without making sufficient progress. The switching strategy is the main technical content of the algorithm. Simply limiting the recursion to constant depth is not good enough, since this would make the algorithm switch on all sufficiently large lists. Musser discusses a couple of simple approaches:

  • Keep track of the list of sizes of the subpartitions processed so far. If at any point k recursive calls have been made without halving the list size, for some small positive k, switch to the worst-case linear algorithm.
  • Sum the size of all partitions generated so far. If this exceeds the list size times some small positive constant k, switch to the worst-case linear algorithm. This sum is easy to track in a single scalar variable.

Both approaches limit the recursion depth to k ⌈log n⌉ = O(log n) and the total running time to O(n).

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138