
I have an array of N numbers which are all the same. I am applying quicksort to it. What should the time complexity of the sort be in this case?

I googled around for this question but did not find an exact explanation.

Any help would be appreciated.

Armen Tsirunyan
Sandeep Pathak

6 Answers


This depends on the implementation of Quicksort. The traditional implementation, which partitions into 2 sections (< and >=), will take O(n*n) on identical input. While no swaps necessarily occur, it causes n recursive calls to be made, each of which needs to compare the pivot against n - recursionDepth elements; i.e. O(n*n) comparisons are made.
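
For concreteness, here is a minimal sketch of one such 2-way partition (Lomuto-style: `<` goes left, `>=` stays right); it is an illustration of the scheme, not any particular library's code. On all-equal input the `a[j] < pivot` test never fires, so each call only places the pivot and recurses on a range that is smaller by one element:

```java
// 2-way quicksort with a Lomuto-style partition (illustration only).
static void quicksort(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[hi];                        // last element as pivot
    int i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {                   // never true when all keys are equal
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;    // put the pivot in place (index lo here)
    quicksort(a, lo, i - 1);                  // empty partition on all-equal input
    quicksort(a, i + 1, hi);                  // shrinks by only one element => O(n*n)
}
```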

However, there is a simple variant which partitions into 3 sets (<, = and >). This variant has O(n) performance in this case: instead of choosing the pivot, swapping, and then recursing on 0 to pivotIndex-1 and pivotIndex+1 to n, it swaps everything equal to the pivot into the 'middle' partition (which, when all inputs are identical, always means swapping an element with itself, i.e. a no-op). The call stack is therefore only 1 deep in this particular case: n comparisons and no swaps occur. I believe this variant has made its way into the standard library, on Linux at least.
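
A sketch of that 3-way variant (Dutch-national-flag style partitioning, again an illustration rather than any specific library's code); when every key equals the pivot, both recursive calls receive empty ranges and only a linear number of comparisons is made:

```java
// 3-way ("fat pivot") quicksort: keys equal to the pivot stay in the middle
// block and are never recursed into again.
static void quicksort3way(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[lo];
    int lt = lo, gt = hi, i = lo;
    while (i <= gt) {
        if (a[i] < pivot) {                   // grow the "< pivot" block on the left
            int t = a[lt]; a[lt] = a[i]; a[i] = t;
            lt++; i++;
        } else if (a[i] > pivot) {            // grow the "> pivot" block on the right
            int t = a[gt]; a[gt] = a[i]; a[i] = t;
            gt--;
        } else {
            i++;                              // equal to the pivot: leave it in the middle
        }
    }
    quicksort3way(a, lo, lt - 1);             // empty on all-equal input
    quicksort3way(a, gt + 1, hi);             // empty on all-equal input
}
```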

tobyodavies
  • I think that Hoare's original partition made `<=` and `>=` sections, distributing the equal values evenly between them. This has no cost in the average case (distinct data) and guarantees O(N log N) time in the case of equal data – comocomocomocomo Sep 09 '13 at 06:31
  • By now, is the 2nd method included in libraries other than the Linux one? – Hashmatullah Noorzai Oct 05 '17 at 03:33

The performance of quicksort depends on the pivot selection. The closer the chosen pivot is to the median element, the better quicksort's performance.

In this specific case you're lucky - the pivot you select will always be a median, since all values are the same. The partition step of quicksort will hence never have to swap elements, and the two pointers will meet exactly in the middle. The two subproblems will therefore be exactly half the size - giving you a perfect O(n log n).

To be a little more specific, this depends on how well the partition step is implemented. The loop invariant only needs to make sure that smaller elements end up in the left-hand subproblem and greater elements in the right-hand subproblem. There's no guarantee that a partition implementation never swaps equal elements, but it is always unnecessary work, so no clever implementation should do it: the left and right pointers will never detect an inversion with respect to the pivot (i.e. you will never hit the case where *left > pivot && *right < pivot), so the left pointer is incremented and the right pointer is decremented at every step, and they finally meet in the middle, generating subproblems of size n/2.
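
To make the pointer movement concrete, here is a sketch of such a two-pointer (Hoare-style) partition; it is my illustration, not the code this answer describes, and this canonical form does perform the harmless swaps of equal keys. Because both scans stop on keys equal to the pivot, on all-equal input the pointers advance one position per iteration and cross in the middle, yielding two subproblems of roughly n/2:

```java
// Hoare-style partition: both scans stop at keys equal to the pivot.
static int hoarePartition(int[] a, int lo, int hi) {
    int pivot = a[lo];
    int i = lo - 1, j = hi + 1;
    while (true) {
        do { i++; } while (a[i] < pivot);     // stops as soon as a[i] >= pivot
        do { j--; } while (a[j] > pivot);     // stops as soon as a[j] <= pivot
        if (i >= j) return j;                 // pointers crossed: j is the split point
        int t = a[i]; a[i] = a[j]; a[j] = t;  // a no-op value-wise when a[i] == a[j]
    }
}

static void quicksortHoare(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int p = hoarePartition(a, lo, hi);        // roughly (lo + hi) / 2 on all-equal input
    quicksortHoare(a, lo, p);                 // left half, pivot position included
    quicksortHoare(a, p + 1, hi);             // right half
}
```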

ltjax
  • because most implementations of QuickSort actually have `n*n` performance in the specific case he is talking about - all elements the same. – tobyodavies Feb 26 '11 at 12:26
  • Because they generally partition based on `<` and `>=`, so while no swaps will occur, it will recurse `n` times and do a linear scan each time, still leading to `n*n` performance – tobyodavies Feb 26 '11 at 12:33
  • actually, if you choose the middle element and there are only 2 partitions, it will in fact swap equal elements; while this is stupid, many standard libs still did this, at least until quite recently – tobyodavies Feb 26 '11 at 12:47
  • @tobyodavies: not when properly implemented for quicksort, I believe. You'll have to show me a popular implementation that doesn't. For example, the VS2010 implementation of quicksort (which is a 'part' of introsort used in std::sort) even establishes an "equal range" for the pivot it picks and will give linear complexity in this specific case. – ltjax Feb 26 '11 at 12:54
  • Yes, however this is not actually quick sort, it is a variant of quick sort of the kind not usually taught in textbooks. It is highly incomplete to leave out _real_ quicksort given we have no idea if this is a practical or theoretical question. – tobyodavies Feb 26 '11 at 12:57
  • I don't think there's a single rigid definition of the _real_ quicksort. It's more like an algorithm template. It's basically: pick pivot, partition, recurse on subproblems. If you implement all steps properly (in the sense that the complexity doesn't needlessly go up), the complexity in this case will not be quadratic (theoretically _and_ practically). For example, if you just implement `<` and `>=`, you can even get infinite recursion in the all-equal case (all elements will always be on the right side, so the subproblem never shrinks). – ltjax Feb 26 '11 at 13:23
  • @ltjax, the 2-partition variant always shrinks because the actual pivot is guaranteed to be in the right spot after partitioning, so the partitions always shrink by at least 1 each call. And there is the original version as described by Hoare, taught in all bar one of the textbooks I've seen, that doesn't perform this optimization. – tobyodavies Feb 26 '11 at 14:05
  • "many standard libs still did this, at least until quite recently" -- Rubbish. – Jim Balter Feb 27 '11 at 04:00

It depends on the particular implementation.

If there is only one kind of comparison (≤ or <) to determine where the other elements go relative to the pivot, they will all go into one of the partitions, and you will get O(n²) performance, since the problem size will decrease by only 1 each step.

The algorithm listed here is an example (the accompanying illustrations are for a different algorithm).

If there are two kinds of comparisons, for example < for the element under the left pointer and > for the element under the right pointer, as is the case in a two-pointer implementation, and if you take care to move the pointers in step, then you might get perfect O(n log n) performance, because the equal elements will be split evenly between the two partitions.

The illustrations in the link above use an algorithm which doesn't move pointers in step, so you still get poor performance (look at the "Few unique" case).

So it depends whether you have this special case in mind when implementing the algorithm.

Practical implementations often handle a broader special case: if there are no swaps in the partitioning step, they assume the data is nearly sorted, and use an insertion sort, which gives an even better O(n) in the case of all equal elements.
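
One way such a heuristic might look (a sketch only: the swap counter and the `insertionSort` helper are illustrative, not taken from any real library): count only swaps that move genuinely different values, and if a partitioning pass made none, hand the range to insertion sort, which is linear on all-equal or already-sorted data:

```java
// Sketch of the "no swaps while partitioning => assume nearly sorted" heuristic.
static void quicksortWithFallback(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[(lo + hi) >>> 1];
    int i = lo, j = hi, swaps = 0;
    while (i <= j) {
        while (a[i] < pivot) i++;             // both scans stop on equal keys
        while (a[j] > pivot) j--;
        if (i <= j) {
            if (a[i] != a[j]) {               // only count swaps of distinct values
                int t = a[i]; a[i] = a[j]; a[j] = t;
                swaps++;
            }
            i++; j--;
        }
    }
    if (swaps == 0) {                         // nothing was out of place
        insertionSort(a, lo, hi);             // O(n) on all-equal or sorted input
        return;
    }
    quicksortWithFallback(a, lo, j);
    quicksortWithFallback(a, i, hi);
}

static void insertionSort(int[] a, int lo, int hi) {
    for (int k = lo + 1; k <= hi; k++) {
        int v = a[k];
        int m = k - 1;
        while (m >= lo && a[m] > v) {         // shift larger keys one slot right
            a[m + 1] = a[m];
            m--;
        }
        a[m + 1] = v;
    }
}
```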

aaz
  • if you use the two pointer method you get `O(n)` not `O(n log n)` in this case - each pointer just gets incremented until the end, max `n` comparisons – tobyodavies Feb 26 '11 at 13:03
  • @tobyodavies – I imagine you would typically still recurse into each partition. – aaz Feb 26 '11 at 13:45
  • there is only 1 partition though after the first call - the `=` partition – tobyodavies Feb 26 '11 at 13:56
  • @tobydavies – I was referring to a two-way-comparing quicksort (the < and > comparisons being done on different elements: those under the left and right pointer respectively). In the [three-way-comparing](http://www.sorting-algorithms.com/quick-sort-3-way) sort which you described O(_n_) is the case, of course. – aaz Feb 26 '11 at 14:11
  • I'm not sure I can see how a QS which checks for `<` and `>` (as opposed to `<` and `>=`) doesn't partition into 3 - if you are doing `<` and `>`, the `=`s have to go somewhere... – tobyodavies Feb 26 '11 at 15:32
  • @tobyodavies – `a[left] > pivot` and `a[right] < pivot` → swap. Well, the signs don't technically matter, it could be `!(a[left] ≤ pivot)` or `pivot < a[left]`, the point is that in one case you're comparing `a[left]` and in the other you're comparing `a[right]`. To get three partitions you need something like `a[left] < pivot` → do something, `a[left] > pivot` → do something else. – aaz Feb 26 '11 at 16:43
  • ah, I think I see... not seen QS implemented this way before... though thinking about it, the 3-partition version needs 4 pointers - leftmost pivot, rightmost pivot, left next to maybe swap and right next to maybe swap... – tobyodavies Feb 27 '11 at 05:27

tobyodavies has provided the right solution. It handles this case and finishes in O(n) time when all the keys are equal. It uses the same partitioning as in the Dutch national flag problem:

http://en.wikipedia.org/wiki/Dutch_national_flag_problem

Sharing the code from Princeton:

http://algs4.cs.princeton.edu/23quicksort/Quick3way.java.html

rusty

If you implement the 2-way partitioning algorithm, then at every step the array will be halved. This is because when identical keys are encountered, the scan stops. As a result, at each step the partitioning element ends up positioned at the center of the subarray, halving the array in every subsequent recursive call. This case is therefore similar to mergesort, which uses ~N lg N compares to sort an array of N elements. Ergo, for duplicate keys the traditional 2-way partitioning algorithm for Quicksort uses ~N lg N compares, i.e. its running time is linearithmic.
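
Spelling out the recurrence behind that claim (the standard divide-and-conquer argument, assuming the split really is into two halves each time): every partitioning pass over a subarray of size N costs about N compares and produces two subarrays of size about N/2, so T(N) = 2·T(N/2) + N with T(1) = 0, which unrolls over about lg N levels to roughly N lg N compares in total.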

Farhan Haque

Quicksort is implemented with a "partition" function and a "quicksort" function.

Basically, there are two classic ways of implementing the partition step.

The difference between the two is only the "partition" function:

1. Lomuto

2. Hoare

With a partitioning algorithm such as the Lomuto partition scheme described above (even one that chooses good pivot values), quicksort exhibits poor performance for inputs that contain many repeated elements. The problem is clearly apparent when all the input elements are equal: at each recursion, the left partition is empty (no input values are less than the pivot), and the right partition has only decreased by one element (the pivot is removed). Consequently, the Lomuto partition scheme takes quadratic time to sort an array of equal values.

So, this takes O(n^2) time using the Lomuto partition algorithm.

By using the Hoare partition algorithm, the all-equal case is handled much better: both scans stop on keys equal to the pivot, so the array is split roughly in half at every step and the time complexity is O(n log n).

Reference: https://en.wikipedia.org/wiki/Quicksort

  • You're not wrong, but your efforts would be more productively spent on newer questions that don't already have good answers. – John Bollinger Jun 15 '21 at 14:03
  • @Blastfurnace yup, I thought no one here gave the answer in nice and simple English, so I gave it a try. I am new to this; can I delete this answer? – pradeep reddy Jun 15 '21 at 17:32