0

I thought of a sorting algorithm but I am not sure if this already exists.

Say we have a container with n items:

We choose the 3rd element and do a binary search on the first 2, putting it in the correct position. The first 3 items in the container are sorted.

We choose the 4th element and do a binary search on the first 3 and put it in the correct position. Now the first 4 items are sorted.

We choose the 5th element and do a binary search on the first 4 items and put it in the correct position. Now 5 items are sorted.

. . .

We choose the nth element and do a binary search on the other n-1 elements putting it in the correct position. All the items are sorted.

Binary search takes logk for k elements and let's say that the insertion takes constant time. Shouldn't this take:

log2 to put the 3rd element in the correct spot. log3 to put the 4th element in the correct spot. log4 to put the 5th element in the correct spot. . . . log(n-1) to put the nth element in the correct spot.

log2 + log3 + log4 + ... log(n-1) = log((n-1)!) ?

I may be talking nonsense but this looked interesting.

EDIT:

I did not take the insertion time into consideration. What if the sorting was done in a sorting array with gaps between the elements? This would allow for fast inserting without having to shift many elements. After a number of inserts, we could redistribute the elements. Considering that the array is not sorted (we could use a shuffle to ensure this) I think that the results could be quite fast.

Veritas
  • 2,150
  • 3
  • 16
  • 40
  • binary search is not a sort. But if I understand what you're trying to describe it sounds similar to a bubble sort. I think your log(n-1) is off. – hookenz Apr 08 '14 at 20:50
  • we use binary search to find the correct position to put the item in the sorted part of the container. – Veritas Apr 08 '14 at 20:51
  • 2
    The problem is that you can't just find where it goes, you have to move all the other stuff out of the way. Essentially that makes this insertion sort. – pjs Apr 08 '14 at 20:52
  • 2
    I think by binary search, he meant that there is an insertion taking place placed on a searchable index of where we would expect to find the item we are inserting. in which case it would be called insertion sort... Either way it's still N*log(N) – Rusty Weber Apr 08 '14 at 20:53
  • 1
    First issue is that log(A) + log(B) = log(A*B) not log(A+B), making it O(log(N!)) which is O(N log N). I do believe you are basically adding elements to a binary search tree. It would require O(log N) to add the element and O(log N). That is the best way to avoid O(N) for the insertion cost and still maintain the order. – Nuclearman Apr 08 '14 at 20:53
  • Yea the (n-1)! is the factorial. – Veritas Apr 08 '14 at 20:54
  • Ah, I see. It should be log((n-1)!) then not log(n-1)!. I thought you were just using !? as punctuation. In any case, the point remains that adding elements to a balanced [binary search tree](http://en.wikipedia.org/wiki/Binary_search_tree) is closest to your algorithm. – Nuclearman Apr 08 '14 at 20:58
  • @Nuclearman: I think it might be better to write it like this. log(1)+log(2)+...log(n-1)+log(n)=log(n!) SUM(log(n)+log(n)+...log(n), 1-n) = log(n^n) = n log(n) – Rusty Weber Apr 08 '14 at 21:11
  • Using this on a binary tree is more like what i have in mind to be honest. I didn't take into account the complexity of the insertion in this case. – Veritas Apr 08 '14 at 21:13
  • Then you are looking at splaysort to ensure O(N log N) or better in the case of nearly sorted data. – Nuclearman Apr 08 '14 at 21:17
  • To answer the question, yes this has a name. Insertion sort. – Mooing Duck Apr 08 '14 at 21:18

3 Answers3

4

It sounds like insertion sort modified to use binary search. It's fairly well-known, but not particularly well-used (as far as I know), possibly because it doesn't affect the O(n²) worst case, but makes the O(n) best case take O(n log n) instead, and because insertion sort isn't commonly used on anything but really small arrays or those already sorted or nearly sorted.

The problem is that you can't really insert in O(1). Random-access insert into an array takes O(n), which is of course what the well-known O(n²) complexity of insertion sort assumes.

One could consider a data structure like a binary search tree, which has O(log n) insert - it's not O(1), but we still end up with an O(n log n) algorithm.

Oh O(log (n!)) = O(n log n), in case you were wondering about that.

Community
  • 1
  • 1
Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
  • Insertion sort is used as part of Timsort, hybrid sorting algorithm implemented as standard in Java and Phyton, since it's very effective for partially sorted data. – Alex Salauyou Apr 08 '14 at 21:01
  • heap sort is still O(n log n)... It says that on the top of the wiki page you just referenced. typo maybe? It's been too long for me, were you referencing the best case scenario? – Rusty Weber Apr 08 '14 at 21:06
  • heapsort is O(n log n) in the best and worst cases – Mooing Duck Apr 08 '14 at 21:10
  • @MooingDuck: Thanks! I needed the review. – Rusty Weber Apr 08 '14 at 21:14
  • @RustyWeber I actually meant using a BST. I removed my mention of heapsort, so I'm not sure of the exact phrasing I used, but I know heapsort is O(n log n). – Bernhard Barker Apr 08 '14 at 21:15
  • @Veritas: No, you'd need to use a self-balancing tree to avoid O(N^2) worst case. – Nuclearman Apr 08 '14 at 21:19
  • @Veritas Without taking insertion into account, you don't have a complete algorithm yet. Yes, a (self-balancing) binary search tree would result in O(n log n) complexity. [Nuclearman's answer](http://stackoverflow.com/a/22948248/1711796) points out that this is called "Tree sort". – Bernhard Barker Apr 08 '14 at 21:21
  • @Veritas What will happen if you keep inserting at the front? How will you keep track of which elements are populated? Depending on how exactly you do this, this will likely still end up with an O(n²) running time. But that may only be the worst case and I'm not saying it's not possible to get O(n log n) worst-case running time using something like this - both a heap and a BST can be represented using arrays and have O(log n) insertion time, it just, in all likelihood, won't be O(1). Not trying to demotivate you though - there's always room for new and interesting approaches to problems. – Bernhard Barker Apr 08 '14 at 23:32
3

Tree sort (generic binary search tree) and splaysort (splay tree) both use binary search trees to sort. Adding elements to a balanced binary search tree is equivalent to doing a binary search to find where to add the elements then some tree operations to keep the tree balanced. Without a tree of some type, this becomes insertion sort as others have mentioned.

In the worst case the tree can become highly unbalanced, resulting in O(N^2) for tree sort. Using a self-balancing binary search tree yields O(N log N), at least on average. Splay sort is an adaptive sort, making it rather efficient when the input is already nearly sorted.

Nuclearman
  • 5,029
  • 1
  • 19
  • 35
1

I think by binary search, he meant that there is an insertion taking place placed on a searchable index of where we would expect to find the item we are inserting. in which case it would be called insertion sort... Either way it's still N*log(N)

Rusty Weber
  • 1,541
  • 1
  • 19
  • 32