52

For general-purpose sorting, the answer appears to be no, as quick sort, merge sort and heap sort tend to perform better in the average- and worst-case scenarios. However, insertion sort appears to excel at incremental sorting, that is, adding elements to a list one at a time over an extended period of time while keeping the list sorted, especially if the insertion sort is implemented as a linked list (O(log n) average case vs. O(n)). However, a heap seems to be able to perform just (or nearly) as well for incremental sorting (adding or removing a single element from a heap has a worst-case scenario of O(log n)). So what exactly does insertion sort have to offer over other comparison-based sorting algorithms or heaps?

  • 4
    If you are loading in a large amount of data from a relatively slower external source such as a hard drive, it's often better to use a sort-as-you-go algorithm to make use of the wasted cycles involved in a CPU waiting for the drive to catch up. [See my answer below](http://stackoverflow.com/a/30193315/4229245). –  May 12 '15 at 14:32

7 Answers7

68

From http://www.sorting-algorithms.com/insertion-sort:

Although it is one of the elementary sorting algorithms with O(n2) worst-case time, insertion sort is the algorithm of choice either when the data is nearly sorted (because it is adaptive) or when the problem size is small (because it has low overhead).

For these reasons, and because it is also stable, insertion sort is often used as the recursive base case (when the problem size is small) for higher overhead divide-and-conquer sorting algorithms, such as merge sort or quick sort.

Community
  • 1
  • 1
guns
  • 10,550
  • 3
  • 39
  • 36
  • 6
    Ah, I forgot about stability... None of the other algorithms I mentioned is stable. –  Apr 10 '09 at 07:06
  • 10
    +1. Insertion sort's inner loop just happens to be a good fit for modern CPUs and caches -- it's a very tight loop that accesses memory in increasing order only. – j_random_hacker Apr 10 '09 at 07:17
  • 1
    Well, quicksort can be implemented as a stable sort, but since it's optimal for randomized sets, I think the efficient qsort functions randomize the data deliberately before sorting. – guns Apr 10 '09 at 07:19
  • 11
    Insertion sort is also good because it's useful in online situation, when you get one element at a time. – sykora Apr 10 '09 at 07:36
  • 1
    If someone wants to know what stability here means, checkout https://stackoverflow.com/questions/1517793/what-is-stability-in-sorting-algorithms-and-why-is-it-important. – Uthman Nov 04 '17 at 22:42
19

An important concept in analysis of algorithms is asymptotic analysis. In the case of two algorithms with different asymptotic running times, such as one O(n^2) and one O(nlogn) as is the case with insertion sort and quicksort respectively, it is not definite that one is faster than the other.

The important distinction with this sort of analysis is that for sufficiently large N, one algorithm will be faster than another. When analyzing an algorithm down to a term like O(nlogn), you drop constants. When realistically analyzing the running of an algorithm, those constants will be important only for situations of small n.

So what does this mean? That means for certain small n, some algorithms are faster. This article from EmbeddedGurus.net includes an interesting perspective on choosing different sorting algorithms in the case of a limited space (16k) and limited memory system. Of course, the article references only sorting a list of 20 integers, so larger orders of n is irrelevant. Shorter code and less memory consumption (as well as avoiding recursion) were ultimately more important decisions.

Insertion sort has low overhead, it can be written fairly succinctly, and it has several two key benefits: it is stable, and it has a fairly fast running case when the input is nearly sorted.

AShelly
  • 34,686
  • 15
  • 91
  • 152
Anthony
  • 7,086
  • 1
  • 21
  • 22
18

Yes, there is a reason to use either an insertion sort or one of its variants.

The sorting alternatives (quick sort, etc.) of the other answers here make the assumption that the data is already in memory and ready to go.

But if you are attempting to read in a large amount of data from a slower external source (say a hard drive), there is a large amount of time wasted as the bottleneck is clearly the data channel or the drive itself. It just cannot keep up with the CPU. A natural series of waits occur during any read. These waits are wasted CPU cycles unless you use them to sort as you go.

For instance, if you were to make your solution to this be the following:

  1. Read a ton of data in a dedicated loop into memory
  2. Sort that data

You would very likely take longer than if you did the following in two threads.

Thread A:

  1. Read a datum
  2. Place datum into FIFO queue
  3. (Repeat until data exhausted from drive)

Thread B:

  1. Get a datum from the FIFO queue
  2. Insert it into the proper place in your sorted list
  3. (repeat until queue empty AND Thread A says "done").

...the above will allow you to use the otherwise wasted time. Note: Thread B does not impede Thread A's progress.

By the time the data is fully read, it will have been sorted and ready for use.

4

Most sorting procedures will use quicksort and then insertion sort for very small data sets.

BobbyShaftoe
  • 28,337
  • 7
  • 52
  • 74
2

If you're talking about maintaining a sorted list, there is no advantage over some kind of tree, it's just slower.

Well, maybe it consumes less memory or is a simpler implementation.

Inserting into a sorted list will involve a scan, which means that each insert is O(n), therefore sorting n items becomes O(n^2)

Inserting into a container such as a balanced tree, is typically log(n), therefore the sort is O(n log(n)) which is of course better.

But for small lists it hardly makes any difference. You might use an insert sort if you have to write it yourself without any libraries, the lists are small and/or you don't care about performance.

MarkR
  • 62,604
  • 14
  • 116
  • 151
1

YES,

Insertion sort is better than Quick Sort on short lists.

In fact an optimal Quick Sort has a size threshold that it stops at, and then the entire array is sorted by insertion sort over the threshold limits.

Also...

For maintaining a scoreboard, Binary Insertion Sort may be as good as it gets.

See this page.

Jason Plank
  • 2,336
  • 5
  • 31
  • 40
JohnPaul
  • 39
  • 1
  • The "scoreboard" notion, where items are made available one at a time, reminds me of a "dual" of that situation, where items need to be returned from the sort one at a time (as with selection sort). I've coded an NlgN sort which returns the first element first, the second element second, etc. Bookkeeping overhead is pretty horrendous, but the number of comparisons is smaller than the library qsort() against which I benchmarked it. Start with all nodes in the primary pool with a score of one. Repeatedly take two items with the lowest score from the primary pool and compare them... – supercat May 04 '12 at 16:39
  • ...placing the "winner" back in the primary score, with the loser's score added to its own, and the loser in a "reserve" pool with its score unmodified. Keep going until the primary pool has one element. That element is the best, so output it, and move to the primary pool all elements against which the winning element had been compared. Then start taking items from the primary pool as before until there's only one left (the second-best item). At any given time, every item in the reserve pool will be inferior to at least one item in the primary pool, and no item in the primary pool... – supercat May 04 '12 at 16:44
  • ...will be known to be inferior to anything else in either pool. Although the primary pool will start with all N items in it, later passes only include items against which the "winner" had been compared, so outputting items after the first one will be reasonably fast. – supercat May 04 '12 at 16:47
0

For small array size insertion sort outperforms quicksort. Java 7 and Java 8 uses dual pivot quicksort to sort primitive data types. Dual pivot quicksort out performs typical single pivot quicksort. According to algorithm of dual pivot quicksort :

  1. For small arrays (length < 27), use the Insertion sort algorithm.
  2. Choose two pivot...........

Definitely, insertion sort out performs quicksort for smaller array size and that is why you switch to insertion sort for arrays of length less than 27. The reason could be: there are no recursions in insertion sort.

Source: http://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf

Community
  • 1
  • 1
Kangkan Lahkar
  • 267
  • 5
  • 16