4

I need to optimize some code that sorts a vector<pair<int, float >>a where the pairs needs to be sorted on the float value. The vector will have an length between 0 and 5. I've been googling and reading up on sorting methods in C++ but cannot find any benchmarks on sorting tiny data sets. For the system it's important be as fast as possible as it's used for a real time blob tracking system.

Kind regards, Pollux

pollux
  • 779
  • 2
  • 14
  • 30
  • Radix sort can only be used for integers. – kennytm Jun 04 '10 at 07:12
  • 2
    How do you create those vectors? Can't you _create_ them sorted? – sbi Jun 04 '10 at 07:31
  • 2
    IEEE754 floats have an interesting property that they are in the same numerical order when treated like integers, because of the sign bit and biased exponent. – matja Jun 04 '10 at 07:44
  • 1
    @matja: as long as you exclude `NaN`. – MSalters Jun 04 '10 at 08:15
  • If you are doing the blob/background partition on the same processor that is doing this sort, then fuggedaboutit! The 4- or 8-way blob connectivity algorithm on the input image will completely swamp the sort time. Consider: on a 64x64 sensor, you have to look at 4096 pixel values and make a blob/background decision. By comparison, to O(N^2) bubble-sort 5 values, you have to make roughly 20 comparisons. Compare 20 to 4096, and tell me where you're going to hurt worse for cycles. – John R. Strohm Jun 04 '10 at 17:19

8 Answers8

6

Insertion sort and Bubble sort are great for tiny data pairs.

Another option is to hard code the compare logic, with a couple if statements.
Check out What is the fastest possible way to sort an array of 7 integers? for some ideas.

Community
  • 1
  • 1
Nick Dandoulakis
  • 42,588
  • 16
  • 104
  • 136
  • Doesn't Insertion sort always beat Bubble sort? – Nikko Jun 04 '10 at 09:04
  • @Nikko, yeah insertion would be a better choice in general. However, if you implement bubble sort inner loop *properly* (optimized), I don't think it'll be any performance difference on such small data sets. – Nick Dandoulakis Jun 04 '10 at 09:25
5

It makes no sense to read about benchmarks. You can read about and compare the complexity (Big-O) of algorithms, because it only depends on the algorithms themselves, but benchmarking is something that depends on too many factors.

You need to do the benchmarking for yourself, using the tools that you use, in the environment that matters to the users of your application.

sbi
  • 219,715
  • 46
  • 258
  • 445
3

For the data you have mentioned (0-5), use STL sort methods. ( historically qsort based)

stable_sort -- if you want to maintain the order for duplicates. ( historically merge sort based)

aJ.
  • 34,624
  • 22
  • 86
  • 128
3

Any particular reason why you need this code optimized? For n==5, pretty much any sort will do. Even Bogosort should have a reasonable runtime, since there are only 5! == 120 possible permutations in your data. Have you profiled your algorithm to find out whether this is a bottleneck?

Chinmay Kanchi
  • 62,729
  • 22
  • 87
  • 114
2

First, premature optimization is the root of all evil. That is, first benchmark your code and make sure the sorting is the one that's actually taking the most time. If another part of your performance-critical code is taking 80% of the execution time, you will get drastic performance improvements optimizing that first.

Considering you have 5 elements, the point is pretty much moot. Any algorithm you use (except bogosort :D) should have a pretty much constant execution time, unless you run the algorithm a few hundred times per second, or more.

Second, provided you still want to optimize the search, if you know for sure you always will have 5 elements, the optimal method is by hard-coding the comparations (It can be proven mathematically that this method is optimal, providing a minimal number of executed operations in the worst case scenario - we had this as a laboratory application in university). The same applies if you have less than five elements, but the algorithm becomes prohibitive to write and execute if you have more than seven elements - the logic is convoluted to write and follow so your code will become difficult to maintain.

Let me know if you need help with writing the optimal hard-coded algorithm itself (though I think you should find the implementation and theory online).

If you do not always have five numbers (or for some reason want to avoid the hardcoded comparisions algorithm), use the sort provided by the library. This page concludes that the stl sort is optimal in terms of time (not only number of executed operations). I do not know what implementation was used for search.

utnapistim
  • 26,809
  • 3
  • 46
  • 82
  • Small complaint: "constant execution time" doesn't mean fast, it just means constant, so it's not necessarily a good thing if it is "pretty much constant", and it won't become less constant if you repeat it a few hundred times. – Thomas Padron-McCarthy Jun 04 '10 at 09:03
  • 1
    I wouldn't trust that page that shows STL sort is optimal. For one thing, I couldn't find the source for all of the sorts used. There is a file `d_sort.h` missing that I guess comes with that book there. Second, I have a single-pivot, recursive quicksort that is significantly faster than std::sort. Plus, there is a dual-pivot quicksort (not of my own creation) that beats my quicksort and std::sort. That said, std::sort is fairly fast, but it's good to be wary of studies and scrutinize them to see how they were done. – Justin Peel Jun 04 '10 at 17:27
2

Use a somewhat nasty set of ifs for the fastest sort of 5 items, or if you want a sort that is just a little bit slower, but much less of a headache, you can use a sorting network. Use this site with the number of inputs equal to five to get an optimized sorting network. It even shows how you can do some parts in parallel though that seems a bit excessive for only 5 inputs. It will require 9 comparisons which is quite good. You will implement the sorting network with a set of ifs as well, but the difference between the somewhat nasty set of ifs, which I mentioned at the beginning, which truly optimal and an optimal sorting network is the number of swaps: the sorting network doesn't minimize the number of swaps.

Justin Peel
  • 46,722
  • 6
  • 58
  • 80
1

If you're really certain (that is, you have measured) that this is a bottleneck and needs to be optimized, and just any sort algorithm from STL won't be fast enough (and you've measured that too), then perhaps you know something more that you can use? Standard sorting algorithms are general, and will work (reasonably well) for any data set: different numbers of elements, different ranges of values, and so on. If you really need performance, the trick is often to use additional information to make a more specialized algorithm.

Here you have said one thing: there are 0-5 elements to sort, As Nick D said in his answer, perhaps this will make it faster to use hard-coded if statements instead of the typical loops or recursion of general sorting algorithms.

But perhaps there is more? For example, are there any restrictions on the float values that can occur?

Thomas Padron-McCarthy
  • 27,232
  • 8
  • 51
  • 75
0

Here is a routine for sorting 4 elements in optimal number of comparisons. I would have posted for 5 elements, but stackoverflow limits posts to 30000 characters.

Whether this is actually the fastest depends on how much pressure your CPU instruction cache can take, so benchmark within real program!

/// Generated sort function for 4 arguments.
template <class T>
void DirectSort(T& a0, T& a1, T& a2, T& a3) {
    if (a0 < a1) {
        if (a0 < a2) {
            if (a0 < a3) {
                if (a1 < a2) {
                    if (a1 < a3) {
                        if (a3 < a2) {
                            {
                                T tmp(a2);
                                a2 = a3;
                                a3 = tmp;
                            }
                        }
                    }
                    else { // a1 >= a3
                        {
                            T tmp(a1);
                            a1 = a3;
                            a3 = a2;
                            a2 = tmp;
                        }
                        // a2 >= a3
                    }
                }
                else { // a1 >= a2
                    if (a1 < a3) {
                        {
                            T tmp(a1);
                            a1 = a2;
                            a2 = tmp;
                        }
                        // a2 < a3
                    }
                    else { // a1 >= a3
                        if (a2 < a3) {
                            {
                                T tmp(a1);
                                a1 = a2;
                                a2 = a3;
                                a3 = tmp;
                            }
                        }
                        else { // a2 >= a3
                            {
                                T tmp(a1);
                                a1 = a3;
                                a3 = tmp;
                            }
                        }
                    }
                }
            }
            else { // a0 >= a3
                if (a1 < a2) {
                    {
                        T tmp(a0);
                        a0 = a3;
                        a3 = a2;
                        a2 = a1;
                        a1 = tmp;
                    }
                    // a1 >= a3
                    // a2 >= a3
                }
                else { // a1 >= a2
                    {
                        T tmp(a0);
                        a0 = a3;
                        a3 = a1;
                        a1 = tmp;
                    }
                    // a1 >= a3
                    // a2 >= a3
                }
            }
        }
        else { // a0 >= a2
            if (a0 < a3) {
                // a1 >= a2
                if (a1 < a3) {
                    {
                        T tmp(a0);
                        a0 = a2;
                        a2 = a1;
                        a1 = tmp;
                    }
                    // a2 < a3
                }
                else { // a1 >= a3
                    {
                        T tmp(a0);
                        a0 = a2;
                        a2 = a3;
                        a3 = a1;
                        a1 = tmp;
                    }
                    // a2 < a3
                }
            }
            else { // a0 >= a3
                // a1 >= a2
                // a1 >= a3
                if (a2 < a3) {
                    {
                        T tmp(a0);
                        a0 = a2;
                        a2 = tmp;
                    }
                    {
                        T tmp(a1);
                        a1 = a3;
                        a3 = tmp;
                    }
                }
                else { // a2 >= a3
                    {
                        T tmp(a0);
                        a0 = a3;
                        a3 = a1;
                        a1 = a2;
                        a2 = tmp;
                    }
                }
            }
        }
    }
    else { // a0 >= a1
        if (a0 < a2) {
            if (a0 < a3) {
                {
                    T tmp(a0);
                    a0 = a1;
                    a1 = tmp;
                }
                // a1 < a2
                // a1 < a3
                if (a3 < a2) {
                    {
                        T tmp(a2);
                        a2 = a3;
                        a3 = tmp;
                    }
                }
            }
            else { // a0 >= a3
                // a1 < a2
                if (a1 < a3) {
                    {
                        T tmp(a0);
                        a0 = a1;
                        a1 = a3;
                        a3 = a2;
                        a2 = tmp;
                    }
                    // a2 >= a3
                }
                else { // a1 >= a3
                    {
                        T tmp(a0);
                        a0 = a3;
                        a3 = a2;
                        a2 = tmp;
                    }
                    // a2 >= a3
                }
            }
        }
        else { // a0 >= a2
            if (a0 < a3) {
                if (a1 < a2) {
                    {
                        T tmp(a0);
                        a0 = a1;
                        a1 = a2;
                        a2 = tmp;
                    }
                    // a1 < a3
                    // a2 < a3
                }
                else { // a1 >= a2
                    {
                        T tmp(a0);
                        a0 = a2;
                        a2 = tmp;
                    }
                    // a1 < a3
                    // a2 < a3
                }
            }
            else { // a0 >= a3
                if (a1 < a2) {
                    if (a1 < a3) {
                        if (a2 < a3) {
                            {
                                T tmp(a0);
                                a0 = a1;
                                a1 = a2;
                                a2 = a3;
                                a3 = tmp;
                            }
                        }
                        else { // a2 >= a3
                            {
                                T tmp(a0);
                                a0 = a1;
                                a1 = a3;
                                a3 = tmp;
                            }
                        }
                    }
                    else { // a1 >= a3
                        {
                            T tmp(a0);
                            a0 = a3;
                            a3 = tmp;
                        }
                        // a2 >= a3
                    }
                }
                else { // a1 >= a2
                    if (a1 < a3) {
                        {
                            T tmp(a0);
                            a0 = a2;
                            a2 = a3;
                            a3 = tmp;
                        }
                        // a2 < a3
                    }
                    else { // a1 >= a3
                        if (a2 < a3) {
                            {
                                T tmp(a0);
                                a0 = a2;
                                a2 = a1;
                                a1 = a3;
                                a3 = tmp;
                            }
                        }
                        else { // a2 >= a3
                            {
                                T tmp(a0);
                                a0 = a3;
                                a3 = tmp;
                            }
                            {
                                T tmp(a1);
                                a1 = a2;
                                a2 = tmp;
                            }
                        }
                    }
                }
            }
        }
    }
} // DirectSort - 4 arguments.
Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167