63

The last week I stumbled over this paper where the authors mention on the second page:

Note that this yields a linear running time for integer edge weights.

The same on the third page:

This yields a linear running time for integer edge weights and O(m log n) for comparison-based sorting.

And on the 8th page:

In particular, using fast integer sorting would probably accelerate GPA considerably.

Does this mean that there is a O(n) sorting algorithm under special circumstances for integer values? Or is this a specialty of graph theory?

PS:
It could be that reference [3] could be helpful because on the first page they say:

Further improvements have been achieved for [..] graph classes such as integer edge weights [3], [...]

but I didn't have access to any of the scientific journals.

polygenelubricants
  • 376,812
  • 128
  • 561
  • 623
Karussell
  • 17,085
  • 16
  • 97
  • 197
  • 2
    To see why special circumstances can help, consider the case of sorting a million integers between 0 and 9. You can simply count how many of each digit there are, and afterward simply put the digits in the right order based on their counts. This is the basis of counting sort. – polygenelubricants Feb 28 '10 at 19:57
  • thanks to all of you! I learned a lot. See here for some Java benchmarks I made up on this question: http://karussell.wordpress.com/2010/03/01/fast-integer-sorting-algorithm-on/ – Karussell Mar 02 '10 at 08:33
  • I made one of these as a joke (http://tinylittlelife.org/?p=261). To spoil the punchline, it accomplishes this by treating the input as an array of bits instead of bytes and "sorting" it into the form `000000111111`. – Ian Jun 14 '17 at 15:15

7 Answers7

100

Yes, Radix Sort and Counting Sort are O(N). They are NOT comparison-based sorts, which have been proven to have Ω(N log N) lower bound.

To be precise, Radix Sort is O(kN), where k is the number of digits in the values to be sorted. Counting Sort is O(N + k), where k is the range of the numbers to be sorted.

There are specific applications where k is small enough that both Radix Sort and Counting Sort exhibit linear-time performance in practice.

Stefan Zobel
  • 3,182
  • 7
  • 28
  • 38
polygenelubricants
  • 376,812
  • 128
  • 561
  • 623
  • 34
    Lower bounds are always expressed as Ω. Saying an O lower bound has no semantic meaning. Otherwise +1. – Billy ONeal Feb 28 '10 at 19:33
  • 2
    They're only `O(n)` if the biggest possible value of the integers is less than or equal to n - otherwise they're `O(max_int)`, no? – sepp2k Feb 28 '10 at 19:38
  • 1
    @sepp2k I added the `k` in there for clarification. – polygenelubricants Feb 28 '10 at 19:45
  • 1
    `O(kN)` where k is the number of digits is a bit misleading. k doesn't have to be the number of digits, it can be the number of **bytes**. For 32 bit integers, using k = 4 or even k = 2 will make radix sort very fast; much faster than using k = 10 and sorting by each digit. This makes the memory used **O(2^number_of_bits_in_a_byte)**. – IVlad Feb 28 '10 at 20:05
  • 4
    That's just semantics. Digits doesn't have to be base 10. I can set it to be base 256, and that's essentially the same as your argument. – polygenelubricants Feb 28 '10 at 20:09
  • Yes, semantics, which is why I said a bit misleading and not wrong :). – IVlad Feb 28 '10 at 20:13
  • 1
    To be precise, the number of digits is going to be proportional to the size of the numbers, and assuming you aren't just sorting endless duplicates the size of the numbers is proportional to the log of the number of numbers, so radix sort is O(n log n). It's just that the log n is not a factor as it's normally used. – David Thornley Mar 01 '10 at 16:23
  • 6
    @David "Radix sort's efficiency is thus O(kn) for n keys of k digits. Since k is normally on the order of log n, it might appear that radix sort does not really beat the O(n log n) time of the best comparison sorts. *However, the O(n log n) time of the best comparison sorts* **counts the number of comparisons**, and the fastest time possible for a comparison is k. If we count the number of primitive operations, then merge sort (or other fast comparison sorts) execute in O(kn log n) time." – polygenelubricants Mar 01 '10 at 16:38
  • 1
    @polygenelubricants comparison of integers does not take k time if hardware is wide enough. – Pete Kirkham Mar 01 '10 at 18:49
  • I'm not sure how often people sort `BigInteger` in practice, but nonetheless that remark (which I took verbatim from current version of Wikipedia article) is the standard counterargument to claims that radix sort is not linear. – polygenelubricants Mar 01 '10 at 20:19
  • 4
    @polygenelubricants "Since k is normally on the order of log n" Why is this? – HischT Sep 28 '12 at 19:02
  • @HischT super late, but if you have a list of the numbers from 1 - 1000000 you will find that log(1000000) is 6 - the number of digits in 90% of the dataset. – Daniel Vestøl Oct 20 '21 at 13:48
19

Comparison sorts must be at least Ω(n log n) on average.

However, counting sort and radix sort scale linearly with input size – because they are not comparison sorts, they exploit the fixed structure of the inputs.

Billy ONeal
  • 104,103
  • 58
  • 317
  • 552
ephemient
  • 198,619
  • 38
  • 280
  • 391
6

Counting sort: http://en.wikipedia.org/wiki/Counting_sort if your integers are fairly small. Radix sort if you have bigger numbers (this is basically a generalization of counting sort, or an optimization for bigger numbers if you will): http://en.wikipedia.org/wiki/Radix_sort

There is also bucket sort: http://en.wikipedia.org/wiki/Bucket_sort

IVlad
  • 43,099
  • 13
  • 111
  • 179
2

While not very practical (mainly due to the large memory overhead), I thought I would mention Abacus (Bead) Sort as another interesting linear time sorting algorithm.

Dolphin
  • 4,655
  • 1
  • 30
  • 25
  • 4
    The big-O may be attractive, but when implemented in software, [this sort](http://stackoverflow.com/a/9027807/10396) performs about 10x slower than quicksort. – AShelly Mar 22 '12 at 20:50
2

These hardware-based sorting algorithms:

A Comparison-Free Sorting Algorithm
Sorting Binary Numbers in Hardware - A Novel Algorithm and its Implementation

Laser Domino Sorting Algorithm - a thought experiment by me based on Counting Sort with an intention to achieve O(n) time complexity over Counting Sort's O(n + k).

abcoep
  • 577
  • 9
  • 14
0

Adding a little more detail - Practically the best sorting algorithm till date is not O(n), but O(n √(log log n)) expected time.

You can check more details about this algorithm in Yijie Han & Mikkel Thorup's FOCS '02 paper.

greybeard
  • 2,249
  • 8
  • 30
  • 66
akuriako
  • 429
  • 5
  • 7
0

the accepted answer is not a valid O(n) sort as far as it is a O(n+k) sorting algorithm.

You should have a look at the Stalin Sort

Deblaton Jean-Philippe
  • 11,188
  • 3
  • 49
  • 66