0

In this sorting animation, I saw that heap sort and merge sort works best for an array containing random numbers. But, what about if we compare these sorting algorithms with Radix and introsort?

In short, which type of sorting algorithm is best to sort an array consisting of random numbers?

Thanks

  • Your question may be more on-topic at https://cs.stackexchange.com/help – Retired Ninja Sep 23 '18 at 21:48
  • In practice, for fixed-width 32-bit integers, it's likely that a well-implemented radix sort will be fastest for most architectures and array sizes. Asymptotically, it's O(nk) where is key length. It just turns out that for k=32, it's pretty easy to code a radix sort with a run time constant factor that's competitive with the best comparison sorts. But the only way to know for sure is implement and check. nb: on modern architectures, organizing sorts to best exploit the particular machine's memory hierarchy is important. – Gene Sep 24 '18 at 03:21

2 Answers2

1

For an array of random numbers, a least significant digit first counting variation of radix sort is normally fastest for smaller arrays that fit within cache. While for larger arrays, using one most significant digit first to split up the array into smaller sub-arrays that fit in cache will be faster. Since the data is random, the main time overhead for a radix sort is the randomly distributed writes, which is not cache friendly if the array is significantly larger than cache. If the original and working arrays fit within cache, for most systems, the random access writes don't incur a significant time penalty.

There is also a choice for the base used in a radix sort. For example 32 bit numbers can be sorted in 4 passes if using base 256 (8 bit "digits"). Using base 65536 (16 bit "digits") usually exceeds the size of the L1 and/or L2 caches, so it's not faster in most cases, even though it only takes two passes. For 64 bit numbers, four 11 bit "digits" and two 10 bit "digits" could be used to sort in 6 passes, instead of using eight 8 bit "digits" to sort in 8 passes. However, the 11/10 bit digit variation isn't going to be faster unless the array is large enough and the distribution of the random numbers is uniform enough to use up most of the storage used to hold the counts / indexes.

Link to a prior thread about radix sort variations:

Radix Sort Optimization

merge sort "best case"

For a standard merge sort, the number of moves is always the same, but if the data is already sorted, only half the number of compares is done.

quick sort / intro sort "best case"

Best case for quick sort is random data. Using the middle value for partition doesn't make much difference for random data, but if the data is already sorted, it ends up as best case. Intro sort generally involves extra code to check if the recursion is getting too deep where it switches to heap sort. For random data, this shouldn't happen, so the extra code to check for switch to heap sort is just extra overhead.

rcgldr
  • 27,407
  • 3
  • 36
  • 61
0

Here you can see time complexities of various sorting algorithms in best, average and worst cases: http://bigocheatsheet.com/

As you want to compare the time complexities of sorting algorithms with random numbers, we can simply compare their average-case time complexities. You can further look into their algorithms and analyze their time complexities.

https://www.geeksforgeeks.org/merge-sort/

https://www.geeksforgeeks.org/radix-sort/

https://www.geeksforgeeks.org/heap-sort/

https://www.geeksforgeeks.org/know-your-sorting-algorithm-set-2-introsort-cs-sorting-weapon/

Merge Sort is a widely used sorting algorithm in various libraries which has sort function implementations.

Merge sort sorts in O(nlogn) time and O(n) space.

Heap Sort sorts in O(nlogn) time and O(1) space.

Radix sort sorts in O(nk) time and O(n+k) space.

Intro sort sorts in O(nlogn) time and O(logn) space.

Intro Sort is mix of quick, insertion and heap sort. Intro Sort is probably the best one.

There is no perfect algorithm, different algorithms have a different set of advantages and disadvantages.

Note: All time complexities are average case.

ankk98
  • 3
  • 4