
Is there a sorting algorithm for arrays that runs in O(n log n) worst-case time complexity?

I saw on Wikipedia that such sorts exist, but they are unstable; what does that mean? Is there a way to do it with low space complexity?

Is there a best sorting algorithm?

Vadiklk
    There is no "best" sorting algorithm. Which one is better depends very much on the circumstances. AFAIK Bogosort is the *worst* sorting algorithm though. – harold Oct 19 '11 at 09:08
    @harold: I don't know, you could modify Bogosort, to bias the shuffle against producing the correct order ;-) – Steve Jessop Oct 19 '11 at 09:42
  • @Harold - but Quantum Bogosort is the best. – Damien_The_Unbeliever Oct 19 '11 at 09:56
  • Trouble with Quantum Bogosort is that you either know what the values are, or whether they're sorted, but not both. Spaghetti sort is a more practical `O(1)` sort. Admittedly not very *general*, but at least it can be implemented. – Steve Jessop Oct 19 '11 at 10:03
    This is 4 questions in one, please split it up. The 3rd (space complexity) is not clear at all. – Tomas Oct 19 '11 at 20:39
  • The problem with quantum bogosort is that the number of universes in which your computer made a computation error vastly outnumbers the one universe in which your list came out sorted, so it's merely a very good algorithm for causing processor failures. @SteveJessop Wouldn't that be a Heisenberg Bogosort? – Nick Johnson Oct 25 '11 at 04:04

6 Answers

4

An algorithm that requires only O(1) extra memory (so modifying the input array is permitted) is generally described as "in-place", and that's the lowest space complexity there is.
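Heapsort is the classic example of such an in-place sort: it rearranges the input array itself and needs only a constant number of extra variables. A minimal Python sketch (illustrative code, not part of the original answer):

```python
def heapsort(a):
    """In-place heapsort: O(n log n) worst-case time, O(1) extra memory."""
    n = len(a)

    def sift_down(root, end):
        # Restore the max-heap property for the subtree rooted at `root`.
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    # Build a max-heap from the unsorted array.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)

    # Repeatedly move the maximum to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)

data = [5, 2, 9, 1, 5, 6]
heapsort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```

Note that heapsort, while in-place and O(n log n) worst case, is not stable, which is exactly the trade-off discussed next.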

A sort is described as "stable" or not, according to what happens when there are two elements in the input which compare as equal, but are somehow distinguishable. For example, suppose you have a bunch of records with an integer field and a string field, and you sort them on the integer field. The question is, if two records have the same integer value but different string values, then will the one that came first in the input, also come first in the output, or is it possible that they will be reversed? A stable sort is one that guarantees to preserve the order of elements that compare the same, but aren't identical.
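The record example can be demonstrated directly in Python, whose built-in `sorted` is guaranteed stable (a small illustrative sketch):

```python
# Records with an integer field and a string field.
records = [(2, "banana"), (1, "cherry"), (2, "apple"), (1, "date")]

# Sort on the integer field only. sorted() is stable, so records with
# equal integers keep their original relative order.
by_int = sorted(records, key=lambda r: r[0])
print(by_int)  # [(1, 'cherry'), (1, 'date'), (2, 'banana'), (2, 'apple')]
```

An unstable sort would be free to emit `(2, 'apple')` before `(2, 'banana')`; a stable one may not.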

It is difficult to make a comparison sort that is in-place, stable, and achieves O(n log n) worst-case time complexity all at once. I've a vague idea that it's unknown whether or not it's possible, but I don't keep up to date on it.

Last time someone asked about the subject, I found a couple of relevant papers, although that question wasn't identical to this question:

How to sort in-place using the merge sort algorithm?

As far as a "best" sort is concerned: some sorting strategies take advantage of the fact that, on the whole, taken across a large number of applications, computers spend a lot of time sorting data that isn't randomly shuffled but has some structure to it. Timsort is an algorithm designed to exploit commonly-encountered structure, and it performs very well in a lot of practical applications. You can't describe it as a "best" sort, since it's a heuristic that appears to do well in practice rather than a strict improvement on previous algorithms. But it's the "best" known overall in the opinion of people who ship it as their default sort (Python, Java 7, Android). You probably wouldn't describe it as "low space complexity", though; it's no better than a standard merge sort.

Steve Jessop
2

You can choose between mergesort, quicksort, or heapsort, all nicely described here.

There is also radix sort, whose complexity is O(kN), but it achieves that by using extra memory.
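A minimal LSD radix sort sketch in Python (illustrative code under the assumption of non-negative integers, not from the original answer), showing where the extra memory goes, namely the per-digit buckets:

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers: O(k*N) time, where k is
    the number of digits, using O(N + base) extra memory for the buckets."""
    if not nums:
        return []
    out = list(nums)
    exp = 1
    while max(nums) // exp > 0:
        # Distribute into buckets by the current digit; appending in order
        # keeps each pass stable, which is what makes LSD radix sort work.
        buckets = [[] for _ in range(base)]
        for x in out:
            buckets[(x // exp) % base].append(x)
        out = [x for b in buckets for x in b]
        exp *= base
    return out

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Note that radix sort is not a comparison sort; it relies on the keys being integers (or integer-like), which is why it can beat the O(n log n) comparison bound.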

You can also see that for smaller collections quicksort is faster, but then mergesort takes the lead. All of this is case-specific, so take your time to study all four algorithms.

Yurii Hohan
    A binary radix sort doesn't require any extra memory (well, O(1) for a few variables of course), which is nice. Unavoidably though, radix sort takes advantage of the structure of the data, it's not a comparison sort, and comparison sorts are what people normally care about for the purposes of these complexity analyses, since they're the most general-purpose. – Steve Jessop Oct 19 '11 at 09:44
2

For the question of the best algorithm, the simple answer is: it depends. It depends on the size of the data set you want to sort, and it depends on your requirements. Say, bubble sort has worst-case and average complexity both O(n²), where n is the number of items being sorted. There exist many sorting algorithms with substantially better worst-case or average complexity of O(n log n). Even other O(n²) sorting algorithms, such as insertion sort, tend to have better performance than bubble sort. Therefore, bubble sort is not a practical sorting algorithm when n is large.

Among simple average-case Θ(n²) algorithms, selection sort almost always outperforms bubble sort, but is generally outperformed by insertion sort.

Selection sort is greatly outperformed on larger arrays by Θ(n log n) divide-and-conquer algorithms such as mergesort. However, insertion sort and selection sort are both typically faster for small arrays.
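Insertion sort's low per-element overhead is what makes it the usual choice for small arrays; a minimal Python sketch (illustrative code, not part of the original answer):

```python
def insertion_sort(a):
    """O(n^2) worst case, but very fast on small or nearly-sorted input
    because it does almost no work when elements are already in order."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot right to make room for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

print(insertion_sort([3, 1, 2]))  # [1, 2, 3]
```

This is also why practical divide-and-conquer sorts commonly switch to insertion sort once subarrays fall below a small threshold.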

Likewise, you can select the best sorting algorithm yourself according to your requirements.

COD3BOY
1

It is proven that Ω(n log n) is the lower bound for comparison-based sorting of generic items. It is also proven that Ω(n) is a lower bound for sorting integers (you need at least to read the input :) ).

The specific instance of the problem will determine the best algorithm for your needs; e.g. sorting 1M strings is different from sorting 2M 7-bit integers in 2MB of RAM.

Also consider that besides the asymptotic runtime complexity, the implementation is making a lot of difference, as well as the amount of available memory and caching policy.

I could implement quicksort in one line in Python, roughly keeping O(n log n) complexity (with some caveats about the pivot), but Big-O notation says nothing about the constant factors, which are relevant too (e.g. this is ~30x slower than Python's built-in sort, which is likely written in C, by the way):

qsort = lambda a: [] if not a else qsort([x for x in a if x < a[len(a)//2]]) + [x for x in a if x == a[len(a)//2]] + qsort([x for x in a if x > a[len(a)//2]])

For a discussion about stable/unstable sorting, look here http://www.developerfusion.com/article/3824/a-guide-to-sorting/6/.

You may want to get yourself a good algorithms book (e.g. Cormen or Skiena).

Savino Sguera
1
  • Heapsort, maybe randomized quicksort
  • stable sort
  • as others have already mentioned: no, there isn't. For example, you might want to parallelize your sorting algorithm, which leads to totally different sorting algorithms.
duedl0r
1

Regarding your question about the meaning of "stable", consider the following: we have a class of children, each associated with an age:

Phil, 10
Hans, 10
Eva, 9
Anna, 9
Emil, 8
Jonas, 10

Now, we want to sort the children in order of ascending age (and nothing else). Then, we see that Phil, Hans and Jonas all have age 10, so it is not clear in which order to output them, since we sort by age alone.

Now comes stability: if we sort stably, we keep Phil, Hans and Jonas in the order they were in before, i.e. we put Phil first, then Hans, and finally Jonas (simply because they appeared in this order in the original sequence and we only consider age as the comparison criterion). Similarly, we have to put Eva before Anna (both the same age, but in the original sequence Eva came before Anna).

So, the result is:

Emil, 8
Eva, 9
Anna, 9
Phil, 10   \
Hans, 10   | all aged 10, and left in original order.
Jonas, 10  /

To put it in a nutshell: Stability means that if two elements are equal (w.r.t. the chosen sorting criterion), the one coming first in the original sequence still comes first in the resulting sequence.
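The example above can be reproduced with Python's built-in sort, which is guaranteed stable (a small illustrative sketch):

```python
children = [("Phil", 10), ("Hans", 10), ("Eva", 9),
            ("Anna", 9), ("Emil", 8), ("Jonas", 10)]

# sorted() is stable: children of equal age keep their original order.
by_age = sorted(children, key=lambda c: c[1])
for name, age in by_age:
    print(name, age)
# Emil 8, Eva 9, Anna 9, Phil 10, Hans 10, Jonas 10
```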

Note that you can easily transform any sorting algorithm into a stable sorting algorithm: If your original sequence holds n elements: e1, e2, e3, ..., en, you simply attach a counter to each one: (e1, 0), (e2, 1), (e3, 2), ..., (en, n-1). This means you store for each element its original position.

If now two elements compare equal, you simply compare their counters and put the one with the lower counter value first. This increases runtime (and memory) by O(n), which is asymptotically no worse, since the best comparison sort algorithms already need O(n lg n).
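That counter trick can be sketched in Python (illustrative code, not from the answer; `heapq` stands in for the unstable underlying sort, since heap order alone does not preserve input order):

```python
import heapq

def make_stable_sort(seq, key):
    """Turn an unstable sort into a stable one by attaching each element's
    original index: ties on the key are broken by index, i.e. input order."""
    # Decorate: (key, original position, element). Indices are unique, so
    # tuple comparison never falls through to the elements themselves.
    decorated = [(key(e), i, e) for i, e in enumerate(seq)]
    heapq.heapify(decorated)
    result = []
    while decorated:
        _, _, e = heapq.heappop(decorated)
        result.append(e)
    return result

children = [("Phil", 10), ("Hans", 10), ("Eva", 9)]
print(make_stable_sort(children, key=lambda c: c[1]))
# [('Eva', 9), ('Phil', 10), ('Hans', 10)]
```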

phimuemue