quicksort and insertion sort hybrid expected running time

Question

I am self learning CLRS 3rd edition and here is one of the tougher questions that I have encountered along with its answer as a service to all.

7.4-5 We can improve the running time of quicksort in practice by taking advantage of the fast running time of insertion sort when its input is “nearly” sorted. Upon calling quicksort on a subarray with fewer than k elements, let it simply return without sorting the subarray. After the top-level call to quicksort returns, run insertion sort on the entire array to ﬁnish the sorting process. Argue that this sorting algorithm runs in O(nk+nlg(n/k)) expected time. How should we pick k, both in theory and in practice?

Ankush · Accepted Answer · 2012-09-08T19:55:19.457

3

If you eval equation nlgn > nk + nlog(n/k) you get log k > k. Which is impossible. Hence in theory it's not possible. But in practice there are constant factors involved with insertion sort and quicksort. Take a look at the solution discussed in this pdf. You might want to update your answer. .

edited Sep 08 '12 at 19:55

answered Sep 08 '12 at 13:21

Ankush

2,454
2
21
27

score 2 · Answer 2 · answered Apr 25 '13 at 10:08

Actually, the answer is k = 8.

The algorithm you get is the composition of two anonymous functions, one of which is O(nk) and the other which is O(n lg n/k). Those anonymous functions hide average case constants. Insertion sort runs in n^2/4 time in the average case and randomized quicksort runs in 1.386 n lg n in the average case. We want to find a value of k which minimizes the value of nk/4 + 1.386( n lg n/k ) which equals nk/4 + 1.386 n lg n - 1.386 n lg k. This means finding the maximum of the function 1.386 lg k - k/4. That maximum value occurs at k = 8.

Avi Cohen · Answer 3 · 2012-09-08T16:35:05.010

0

A leaf has an equal probability to be of size between 1 to k.
So the expected size of a leaf is k/2.
If the expected size of a leaf is k/2 then we expect n/(k/2)=(2n)/k such leaves.
For simplicity lets say that we expect n/k such leaves and that the expected size of each leaf is k.
The expected running time of INSERTION-SORT is O(n^2).
We found that in exercise 5.2-5 and problem 2-4c.
So the expected running time of INSERTION-SORT usage is O((n/k)*(k^2))=O(nk).
If we expect our partition groups to be of size k then the height of the recursion tree is expected to be lgn-lgk=lg(n/k) since we expect to stop lgk earlier.
There are O(n) operations on each level of the recursion tree.
That leads us to O(nlg(n/k)).
We conclude that the expected running time is O(nk+nlg(n/k)).

In theory, we should pick k=lgn, since this way we get the best expected running time of O(nlgn).

In practice, we should pick k to one of the values around lgn that gives us better performance than just running RANDOMIZED-QUICKSORT.

The second part of the answer uses big-O notation quite loosely, so for a more precise picking of k, please follow the link given in the second answer by Ankush.

edited Sep 08 '12 at 16:35

answered Mar 05 '12 at 17:22

Avi Cohen

3,102
2
25
26

1

Your answer is slightly wrong. The best you can say in theory is that k must be some function such that k=O(lg n). To precisely pick k you have to look at the average case constants involved in insertion sort and quicksort. – Robert S. Barnes Apr 25 '13 at 10:13
Thanks for the feedback and you are right. When I first posted the question and its answer, I was more interested in the first part of the question and in some sense skipped the second part of it. – Avi Cohen Apr 25 '13 at 11:29
Are you also studying in OpenU? We had this question on our Maman this semester. – Robert S. Barnes Apr 26 '13 at 07:01
I graduated a long time ago, I learn this book at my free time. – Avi Cohen Apr 26 '13 at 16:20

quicksort and insertion sort hybrid expected running time

3 Answers3