
If I am using a modified version of quicksort to find the kth smallest item in an array, why is the expected running time O(n) (as stated in the book Programming Pearls)?

The algorithm I'm using does the following:

1) Run quicksort on the array.
2) If k is greater than the final position of the pivot, run quicksort on the second half; otherwise run it on the first half.

I was under the impression this would take O(n log n) work.

Henley
  • *quickselect* is the name of the algorithm. [Here's another question about it](http://stackoverflow.com/questions/10846482/quickselect-algorithm-understanding) and [wikipedia's coverage](https://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm) which says that it's O(n) average but O(n^2) worst-case. (I'm assuming the book actually has quickselect, since that's the selection algorithm based on quicksort. I don't have the book to check.) –  Jul 29 '13 at 02:48

2 Answers


This link may help you understand the proof for randomized quickselect: http://pine.cs.yale.edu/pinewiki/QuickSelect

The idea behind finding the kth order statistic is to select a pivot, partition the data as quicksort does, and work out which partition the element being searched for lies in. Partitioning has O(n) complexity. After partitioning, you need to pick just one of the resulting partitions for further processing, unlike quicksort, where you have to process both.
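
For illustration only, here is a minimal iterative quickselect sketch in Python. It is my own sketch, not the book's code; the 0-based k and the random pivot choice are assumptions:

    import random

    def quickselect(arr, k):
        """Return the k-th smallest element of arr (k is 0-based).
        Illustrative sketch only: the random pivot is an assumption,
        not necessarily what Programming Pearls does."""
        lo, hi = 0, len(arr) - 1
        while True:
            if lo == hi:
                return arr[lo]
            # Partition arr[lo..hi] around a randomly chosen pivot (Lomuto scheme).
            p = random.randint(lo, hi)
            arr[p], arr[hi] = arr[hi], arr[p]
            pivot = arr[hi]
            store = lo
            for i in range(lo, hi):
                if arr[i] < pivot:
                    arr[i], arr[store] = arr[store], arr[i]
                    store += 1
            arr[store], arr[hi] = arr[hi], arr[store]
            # Unlike quicksort, keep only the side that contains index k.
            if k == store:
                return arr[store]
            elif k < store:
                hi = store - 1
            else:
                lo = store + 1

For example, quickselect([3, 1, 4, 1, 5], 2) returns 3, the third-smallest element (note that the sketch mutates its input). The difference from quicksort is in the last few lines: after each O(n) partition, only one side is kept, which is what brings the expected cost down from O(n log n) to O(n).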

In the description below, I am not trying to give a proof, just some intuitive thoughts for understanding the expected complexity.

For simplicity, let us assume that on every iteration the pivot splits the array into equal halves; then the complexity is clearly O(n), as

    n + n/2 + n/4 + ... < 2n, which is O(n)

For intuition: a worst-case partition, where you would have to process (n-1) elements in the next iteration, happens with probability of only 1/n at each step. Still, the worst-case complexity is O(n^2).
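
As a back-of-the-envelope check of that worst case (my own arithmetic, assuming the most unbalanced split happens on every step): each partition then removes only one element, so the total work is

    n + (n-1) + (n-2) + ... + 1 = n(n+1)/2

which is O(n^2).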

Anyway, if you want a rigorous proof of the expected complexity, you can go through the link provided above.

Karthikeyan
  • Very neat! There are log(n) "iterations", but each iteration costs half as much as the previous one, so the total is something like 2n + log n, which is just O(n). Yet another way of thinking of it: the complexity is the same as that of binary search through a linked list. – Dave Dopson Jul 29 '13 at 23:42

That won't be the algorithm that's described in the book. More likely it does one partition at step (1), at step (2), and so on, rather than a full quicksort each time.

user207421