4

I often find myself in a situation where I want to sort a small number of elements. By small, I mean 3 or 4. I am probably correct in thinking that with such small problem sets I would want to use some kind of explicit or direct method rather than invoking a sort function. Sorting 2 elements is trivial and 3 is still pretty simple, but above 4 items or so I start to prefer the simplicity of just running insertion sort.

Up to how many elements can I expect a benefit from coding up an `inline void sort_n(int *list)`? 4? 5? 6?

In the question *sorting int array with only 3 elements*, two solutions for sorting 3 elements are provided. One uses more comparisons, while the other minimizes comparisons but is more complicated. On a modern architecture, which would come out on top for speed?
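
For concreteness, here is a rough, untested sketch of what I mean by a direct method, in the simpler, comparison-heavier style for 3 elements:

```cpp
#include <algorithm>  // std::swap

// Illustrative only: the straightforward compare-and-swap style for 3 ints.
// Assumes `a` points to at least 3 elements.
inline void sort3(int *a)
{
    if (a[0] > a[1]) std::swap(a[0], a[1]);
    if (a[1] > a[2]) std::swap(a[1], a[2]);
    if (a[0] > a[1]) std::swap(a[0], a[1]);
}
```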

Steven Lu
  • When you measured, how much faster was your hand-coded sort? – Dec 24 '11 at 22:28
  • It hasn't been coded yet. I pretty much know no matter how inefficient my 3- or 4-element custom sort is, it's gonna be faster than I'll ever need it to be. I'm asking because I want to know what I should be doing. It's about the principle of it I guess. – Steven Lu Dec 24 '11 at 22:37
  • I would be surprised that you would get any (measurable) advantage over `std::sort` (especially after all the optimizations that the compiler can do). – Martin York Dec 24 '11 at 23:04
  • Remember what asymptotic analysis is actually saying. If your n is very small, don't bother - it does not matter. If you want to know "the best thing to do in a given situation", I'm very confident it is to spend valuable programming time elsewhere. – Juho Dec 24 '11 at 23:09

7 Answers

7

Did you profile your application, and did this turn out to be a bottleneck? Or are you just trying to over-optimize?

Bubblesort can work very well, too. In particular, when your data is already sorted, it actually is optimal and will beat any textbook merge or heap sort. Unless you give the full constraints (cost of swapping elements, stability requirements, in-place or not...) nobody can answer this fully.

Anyway, for four elements it is rather obvious how to implement an efficient merge sort inline.
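
For illustration, one way such a hand-unrolled 4-element merge sort might look (a rough, untested sketch): sort each pair with one compare-and-swap, then merge the two sorted pairs.

```cpp
#include <algorithm>  // std::swap

// Sketch: merge sort for exactly 4 ints, written out by hand.
inline void merge_sort4(int *a)
{
    if (a[0] > a[1]) std::swap(a[0], a[1]);  // sort first half
    if (a[2] > a[3]) std::swap(a[2], a[3]);  // sort second half

    int out[4];
    int i = 0, j = 2;
    for (int k = 0; k < 4; ++k)              // standard two-way merge
        out[k] = (j > 3 || (i < 2 && a[i] <= a[j])) ? a[i++] : a[j++];

    for (int k = 0; k < 4; ++k)
        a[k] = out[k];
}
```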

For odd (non-power-of-2) sizes, I believe that backwards insertion sort is a common optimization. Have a look at Java's sorting implementation; I believe it already has such a small-array optimization. Did you check that the sort routine you would have called doesn't already do this kind of optimization?

Has QUIT--Anony-Mousse
  • I do believe insertion sort is superior to bubble sort for already sorted data. Your point about the relative merits of different "naive" algorithms at low scale still stands. – phs Dec 24 '11 at 22:35
  • Both need just `O(n-1)` comparisons and no changes to the data, they do exactly the same except maybe for the direction they traverse the data. *Why* should insertion sort be superior? – Has QUIT--Anony-Mousse Dec 24 '11 at 22:41
  • Well, just don't forget that any intelligent heapsort implementation will have a low cutoff threshold below which it uses insertion sort, so that basically limits any advantage we can hope to gain there. If the size is known at compile time we certainly can beat the default solution - but then that code had better be called millions of times per second, otherwise it's kind of pointless to worry about a handful of cycles. – Voo Dec 25 '11 at 00:41
6

Have a look at sorting networks.

A few links:
http://en.wikipedia.org/wiki/Sorting_network
http://www.cs.uky.edu/~lewis/essays/algorithms/sortnets/sort-net.html
Fastest sort of fixed length 6 int array
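
For example, a minimal network for 4 elements (5 compare-exchange operations; a sketch, not taken from the linked pages):

```cpp
#include <algorithm>  // std::swap

// Compare-exchange: after the call, lo <= hi.
inline void cswap(int &lo, int &hi)
{
    if (lo > hi) std::swap(lo, hi);
}

// A 5-comparator sorting network for 4 elements:
// (0,1) (2,3) (0,2) (1,3) (1,2)
inline void sort4_network(int *a)
{
    cswap(a[0], a[1]);
    cswap(a[2], a[3]);
    cswap(a[0], a[2]);
    cswap(a[1], a[3]);
    cswap(a[1], a[2]);
}
```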

pmg
5

All answers to optimization questions must be prefaced by the warning that you should profile and only optimize bottlenecks, so: be sure to do that.

In the spirit of C/C++ I will now trust you to have done so and answer the question you have asked.

You determine the answer to your question by iteratively profiling.

Write `template <int N> inline void sort_n(int *list)` whose default implementation uses a standard library sort. Use this template wherever appropriate in your code. Then write a template specialization for the smallest N for which you do not already have one. After writing this specialization, profile your program and see whether you made a significant performance gain. If the gain is significant, repeat. Once you write a specialization and don't get much out of it, stop.
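
A minimal sketch of that scheme (the names and the `N == 2` cut-off here are just placeholders):

```cpp
#include <algorithm>  // std::sort, std::swap

// Default: fall back to the standard library sort.
template <int N>
inline void sort_n(int *list)
{
    std::sort(list, list + N);
}

// Example specialization for the smallest case; add sort_n<3>, sort_n<4>, ...
// only as long as profiling shows a worthwhile gain.
template <>
inline void sort_n<2>(int *list)
{
    if (list[0] > list[1]) std::swap(list[0], list[1]);
}
```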

01d55
2

For any reasonable number of items, you will always get a performance benefit from coding up the comparisons explicitly. However, sorting so few items takes so little time that it rarely matters what method you use anyway.

The threshold at which you stop getting a performance benefit is when the code becomes so large that it no longer fits in the CPU cache, so exactly where that threshold lies depends on which CPU you are running on.

You should also consider how you would test such code. The more code you have, the harder it will be to verify that it's bug free.

Guffa
1

You'll need to do some profiling of your application first to determine if doing this optimization is worth your time. I suspect it will not be. The standard library (or boost) functions will almost certainly be your best bet for sorting.

rkb
  • You are of course correct and I am likely prematurely optimizing. However I am always looking to know the best thing to do in a given situation. I suppose if nobody has found it necessary to go down this path to determine the difference, then knowing the right answer may be practically useless. But I still want to know the answer. – Steven Lu Dec 24 '11 at 22:28
  • A custom sort makes your code more difficult to debug, maintain, understand, or port to another platform. As an exercise, it can be interesting, and a hand-rolled sort will often be faster for specific narrow cases--but you will spend a disproportionate amount of time coding it correctly. – rkb Dec 24 '11 at 23:00
1

Insertion sort and bubble sort are generally used for small data.

I believe insertion sort is preferred, citing these two Wikipedia excerpts:

quicksort implementations use insertion sort for arrays smaller than a certain threshold

And:

Bubble sort also interacts poorly with modern CPU hardware. It requires at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions. Experiments by Astrachan sorting strings in Java show bubble sort to be roughly 5 times slower than insertion sort and 40% slower than selection sort.

As always, you should profile if you're really concerned with the speed.
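
For reference, a plain insertion sort over a small buffer might look like this (a generic sketch, not taken from any particular library):

```cpp
// Straightforward insertion sort; n is expected to be small (e.g. 3 or 4).
inline void insertion_sort(int *a, int n)
{
    for (int i = 1; i < n; ++i) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {  // shift larger elements right
            a[j + 1] = a[j];
            --j;
        }
        a[j + 1] = key;
    }
}
```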

Pubby
0

I'm told that the standard library sorts have optimized cases for small n - I've never tried to verify it.

Mark Ransom
  • That would be reasonable, at least. When you're writing a library, this sort of optimisation makes sense. I know it's done in a Haskell library, [vector-algorithms](http://hackage.haskell.org/packages/archive/vector-algorithms/0.5.3/doc/html/Data-Vector-Algorithms-Optimal.html), wouldn't be surprised if `qsort` or `std::sort` did it too. – Daniel Fischer Dec 24 '11 at 23:31
  • Yes, see https://stackoverflow.com/questions/14212701/stdsort-does-not-always-call-stdswap – user2672165 May 11 '20 at 18:13