25

Consider the following code:

#include <algorithm>
#include <iostream>
#include <vector>

namespace my_space
{

struct A
{
    double  a;
    double* b;
    bool operator<(const A& rhs) const
    {
        return this->a < rhs.a;
    }
};

void swap(A& lhs, A& rhs)
{
    std::cerr << "My swap.\n";
    std::swap(lhs.a, rhs.a);
    std::swap(lhs.b, rhs.b);
}

}

int main()
{
    const int n = 20;
    std::vector<my_space::A> vec(n);
    for (int i = 0; i < n; ++i) {
        vec[i].a = -i;
    }

    for (int i = 0; i < n; ++i) {
        std::cerr << vec[i].a << " ";
    }
    std::cerr << "\n";
    std::sort(vec.begin(), vec.end());
    for (int i = 0; i < n; ++i) {
        std::cerr << vec[i].a << " ";
    }
    std::cerr << "\n";
}

If I use n=20, the custom swap function is called and the array is sorted. But if I use n=4, the array is sorted correctly, but the custom swap function is not called. Why is that? What if it is really expensive to copy my objects?

For this test, I was using gcc 4.5.3.

Petter
  • 37,121
  • 7
  • 47
  • 62
  • 1
    [This question](http://stackoverflow.com/questions/6380862/how-to-provide-a-swap-function-for-my-class) should be helpful. – chris Jan 08 '13 at 10:18
  • 1
    BTW, why use `std::cerr` for normal output? – Nawaz Jan 08 '13 at 10:20
  • 2
    I don't use `cerr` for normal output. I use `cout` for normal output and `cerr` for error messages, diagnostics and debug. In this case I suppose I could have used `cout`. – Petter Jan 08 '13 at 10:23
  • 1
    +1 this question lead me to quite a bit of digging even though I knew the general outline of the reason. – Konrad Rudolph Jan 08 '13 at 11:03

3 Answers3

26

For small ranges, std::sort implementations in GCC’s stdlibc++ (and other standard library implementations) recurs to insertion sort for performance reasons (it’s faster than quicksort / introsort on small ranges).

GCC’s insertion sort implementation in turn doesn’t swap via std::swap – instead, it moves whole ranges of values at a time, instead of swapping individually, thus potentially saving performance. The relevant part is here (bits/stl_algo.h:2187, GCC 4.7.2):

typename iterator_traits<_RandomAccessIterator>::value_type
  __val = _GLIBCXX_MOVE(*__i);
_GLIBCXX_MOVE_BACKWARD3(__first, __i, __i + 1);
*__first = _GLIBCXX_MOVE(__val);

_GLIBCXX_MOVE is the same as std::move from C++11 and _GLIBCXX_MOVE_BACKWARD3 is std::move_backward – however, this is only the case if __GXX_EXPERIMENTAL_CXX0X__ is defined; if not, then these operations resort to copying instead of moving!

What this does is move the value at the current position (__i) to a temporary storage, then move all previous values from __first to __i one up, and then re-insert the temporary value at __first. So this performs n swaps in one operation instead having to move n values to a temporary location:

  first           i
+---+---+---+---+---+---+
| b | c | d | e | a | f |
+---+---+---+---+---+---+
                  |
  <---------------+


  first           i
+---+---+---+---+---+---+
| --> b-> c-> d-> e-> f |
+---+---+---+---+---+---+


  first           i
+---+---+---+---+---+---+
| a | b | c | d | e | f |
+---+---+---+---+---+---+
  ^
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • 4
    So if move is much more expensive than swap for your type then GCC's strategy is a pessimization. But that's "your own fault" for optimizing swap but not optimizing move, right? – Steve Jessop Jan 08 '13 at 11:04
  • @SteveJessop Yes, and yes. – Konrad Rudolph Jan 08 '13 at 11:05
  • Thank you! And thank you, Steve, for a good question in the comment! – Petter Jan 08 '13 at 12:34
  • 1
    Just to clarify for myself (and for others). "Optimizing move" means writing a move constructor, right? – Petter Jan 08 '13 at 12:36
  • I think that in pure C++03 (no C++11) you have to write your own swap-only sort. Or maybe place the heavy load behind smart pointer. – Notinlist Jan 08 '13 at 14:29
  • 1
    @Ben you really want a move assignment as well. – Marc Glisse Jan 08 '13 at 17:37
  • @Marc, yes, I guess those two always come together. – Petter Jan 09 '13 at 09:29
  • @MarcGlisse: Not if you use [the copy-and-swap idiom](https://stackoverflow.com/q/3279543/364696) to avoid code duplication (with copy-and-swap, you only need one `operator=` overload, and it takes its argument by value, not by reference, r-value or l-value; by taking its argument by value, it delegates to the appropriate constructor, be it copy or move). – ShadowRanger Apr 26 '19 at 00:58
4

Depending on the type, swapping can be more expensive than a move-assignment (in C++98 a simple assignment). The standard library doesn't have any way to detect these cases. At least in C++11 the solution is clear: implement a move-assignment operator for all classes where you implement swap.

Marc Glisse
  • 7,550
  • 2
  • 30
  • 53
1

I modified the code to be more verbose. The sorting for 20 elements uses many swaps, uses assignment end copy. Sorting for 4 elements uses only assignment and copy. Don't know about the specs, but it might be something to go on.

#include <algorithm>
#include <iostream>
#include <vector>

namespace my_space
{

struct A
{
    double  a;
    double* b;
    A()
        : a(0)
        , b(NULL)
    { }
    A(const A &rhs)
        : a(rhs.a)
        , b(rhs.b)
    {
        std::cerr << "copy" << std::endl;
    }
    A& operator=(A const &rhs)
    {
        if(this==&rhs)
                return *this;
        a = rhs.a;
        b = rhs.b;
        std::cerr << "=" << std::endl;
        return *this;
    }
    bool operator<(const A& rhs) const
    {
        return this->a < rhs.a;
    }
};

void swap(A& lhs, A& rhs)
{
    std::cerr << "My swap.\n";
    std::swap(lhs.a, rhs.a);
    std::swap(lhs.b, rhs.b);
}

} // namespace my_space

int main()
{
    const int n = 20;

        std::cerr << "=== TEST CASE: n = " << n << std::endl;
    std::cerr << "=== FILL ===" << std::endl;
    std::vector<my_space::A> vec(n);
    for (int i = 0; i < n; ++i) {
        vec[i].a = -i;
    }

    std::cerr << "=== PRINT ===" << std::endl;
    for (int i = 0; i < n; ++i) {
        std::cerr << vec[i].a << " ";
    }
    std::cerr << "\n";

    std::cerr << "=== SORT ===" << std::endl;
    std::sort(vec.begin(), vec.end());

    std::cerr << "=== PRINT ===" << std::endl;
    for (int i = 0; i < n; ++i) {
        std::cerr << vec[i].a << " ";
    }
    std::cerr << "\n";
}

Outputs

=== TEST CASE: n = 4
=== FILL ===
copy
copy
copy
copy
=== PRINT ===
0 -1 -2 -3
=== SORT ===
copy
=
=
copy
=
=
=
copy
=
=
=
=
=== PRINT ===
-3 -2 -1 0

And

=== TEST CASE: n = 20
=== FILL ===
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
copy
=== PRINT ===
0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 -11 -12 -13 -14 -15 -16 -17 -18 -19
=== SORT ===
copy
My swap.
My swap.
My swap.
My swap.
My swap.
My swap.
My swap.
My swap.
My swap.
My swap.
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
copy
=
copy
=
copy
=
copy
=
copy
=
=== PRINT ===
-19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0
Notinlist
  • 16,144
  • 10
  • 57
  • 99
  • BTW: SGI's sort uses http://en.wikipedia.org/wiki/Introsort, which changes tactics at some point. – Notinlist Jan 08 '13 at 10:37
  • You might need to implement a heap sort, so you can use only swap. For effective stable sort you will need copies and-or assignments. There is an in-place stable sort called "in-place merge sort" which may use only swap, but it's somewhat slower (n*logn*logn, not n*logn). – Notinlist Jan 08 '13 at 10:50
  • But `std::sort` doesn't have to be stable anyway. So swapping vs. not swapping doesn't affect the complexity here. – Steve Jessop Jan 08 '13 at 11:03
  • For 20 elements my own heap sort implementation used 62 swaps and nothing else. G++'s STL used 99 operations (35 copy, 19 assignment, 10 swap, 35 delete). Do not consider it as weapons grade! It lacks many many usability features. Link: http://pastebin.com/5SVm6Tz9 – Notinlist Jan 08 '13 at 14:24