0

I "used" to understand Quicksort, and now.. I've gotten myself confused. I know I'm missing something very obvious here and no doubt that Quicksort is O(nlogn), and the n part is easy to see (since we need to use n comparisons on each level to move the elements based on the pivot accordingly, but how is the splitting logn? Shouldn' it be n-1? Example:

                             1, 2, 3, 4
                       1, 2             3, 4 
                     1      2        3        4

We split 1, 2, 3, 4 into 1, 2 and 3, 4 on the first level, that's one split. Then in the second level we split 1, 2 into 1 and 2 and 3, 4 into 3 and 4, that's 2 splits, so 3 splits in total, so n-1 splits?

I don't think this is a duplicate because I am assuming equal splits of the quick-sort algorithm (or rather a generic equal split of a divide and conquer algorithm) What I'm asking is the intuitive explanations why the splitting is log(n) despite the fact when you count up the actual number of splitting calls it is n-1 in any generic divide and conquer alogirhtm

q.Then
  • 2,743
  • 1
  • 21
  • 31
  • Possible duplicate of [Intuitive explanation for why QuickSort is n log n?](http://stackoverflow.com/questions/10425506/intuitive-explanation-for-why-quicksort-is-n-log-n) – Jerry Coffin Nov 02 '15 at 23:44
  • its also good to remember that in reality, quicksort is actually O(n^2) in worst-case scenarios. it is considered O(nlogn) if the pivot is well chosen/it averages out to nlogn the majority of the time – R Nar Nov 02 '15 at 23:45
  • Ok but I assumed equal partitioning anyways in the diagram I drew (and this is for any type of divide and conquer algorithm) and yet the added up splits are n-1 – q.Then Nov 02 '15 at 23:50
  • @JerryCoffin, yes I understand that the level of the tree after splitting is log(n), but I'm log(n) refers to the number of times we had to split the array, in which case we the total number of times we split is 3 (for example to increase the height of the tree (i.e. to split the leaves), we need to pay a split cost on every singe leaf array of that tree, or am I mistaken? – q.Then Nov 03 '15 at 00:00

2 Answers2

2

Okay, after some clarification in the comments, it's apparent what's really being asked here.

The mistake in in looking at the number of times that split is called, which is actually pretty much irrelevant. What's relevant is the total number of primitive operations.

As already noted, the depth of the tree is approximately log2N. At each "layer" of the tree, all elements of the input are inspected (and some may be swapped).

Yes, at the top level you do one call to partition the array into two halves, then at the next layer each of those is divided in half again, and so on. The number of calls to partition (or split, or whatever you prefer to call it) at a given level doesn't really change much of anything though. What matters is that each time you split some of the data, you need to look at every data item in that partition. At the top level, we have one call to partition that looks at the whole array. At the next level down, we have two calls, each of which looks at half the array. At the third, we have four calls, each of which looks at one fourth of the array (and so on).

Note that imperfect splitting doesn't change the number of items we need to look at in each level of the tree. At each level of the tree, we need to look at all the items in all the partitions. If a split isn't at the center, it means we'll have more than Log(N) layers in the tree, but at each level of the tree we still have to look at every input item.

Therefore, the total number of primitive operations is the number of input items multiplied by the number of layers in the tree (which, in the case of perfect partitioning is approximately log2N), giving overall complexity of O(N log N). If the splitting is imperfect, we still look at N items at each level of the tree, but we have a larger number of layers--potentially up to N layers (giving O(N2) worst-case complexity).

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
0

By "equal splits", you are getting a balanced binary tree.

You are doing n comparisons per iteration (to perform the "splits").

The question is how many iterations do you need? Look at your diagram -- it is forming a balanced binary tree. How big is a balanced binary tree?

TheGreatContini
  • 6,429
  • 2
  • 27
  • 37
  • Yes I get the height of the balanced tree is log(n), but the number of actual times the split function was called is not log(n), it is 3. We only need to do n comparisons for each level but we call the split function 3 times in order to create the tree, or am I wrong? – q.Then Nov 03 '15 at 00:01
  • I think Jerry answered your question, see his reply above. – TheGreatContini Nov 03 '15 at 00:34