7

I recently read (and unfortunately forgot where), that the best way to write operator= is like this:

foo &operator=(foo other)
{
    swap(*this, other);
    return *this;
}

instead of this:

foo &operator=(const foo &other)
{
    foo copy(other);
    swap(*this, copy);
    return *this;
}

The idea is that if operator= is called with an rvalue, the first version can optimize away construction of a copy. So when called with a rvalue, the first version is faster and when called with an lvalue the two are equivalent.

I'm curious as to what other people think about this? Would people avoid the first version because of lack of explicitness? Am I correct that the first version can be better and can never be worse?

Jon Seigel
  • 12,251
  • 8
  • 58
  • 92
R Samuel Klatchko
  • 74,869
  • 16
  • 134
  • 187
  • What is `swap`? If it is `foo temp=x; x=y; y=temp;` you have the infinte recursion of the `operator=` and `swap` function. – Alexey Malistov Jan 09 '10 at 21:55
  • 1
    I wrote a program to test the theory of @Alexey Malistov and he was correct -- I got infinite recursion. – Brent Bradburn Jan 09 '10 at 23:26
  • Any class that implements a copy and swap idiom can't rely on the default `std::swap` implementation in its copy assignment operator. That pretty much goes without saying. – CB Bailey Jan 09 '10 at 23:37
  • As was pointed out in comments to the referenced article (http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/), the copy-and-swap idiom should explicitly specify the use of member swap to avoid this ambiguity. – Brent Bradburn Jan 09 '10 at 23:55
  • I think the best answer to this question is here: http://en.wikibooks.org/w/index.php?title=More_C%2B%2B_Idioms/Copy-and-swap&oldid=1648395. – Brent Bradburn Jan 09 '10 at 23:57

5 Answers5

4

You probably read it from: http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

I don't have much to say since I think the link explains the rationale pretty well. Anecdotally I can confirm that the first form results in fewer copies in my builds with MSVC, which makes sense since compilers might not be able to do copy-elision on the second form. I agree that the first form is a strict improvement and is never worse than the second.

Edit: The first form might be a bit less idiomatic, but I don't think it's much less clear. (IMO, it's not any more surprising than seeing the copy-and-swap implementation of the assignment operator for the first time.)

Edit #2: Oops, I meant to write copy-elision, not RVO.

jamesdlin
  • 81,374
  • 13
  • 159
  • 204
  • 1
    I disagree with the first being idiomatic. This is the less common version. I think if you work for a big company that does lots of code review you will find yourself being sent back to do in the normal way. Normal is better for maintenance as people don't need to do a double take to work out what is happening. I think the first version is just a nice party trick to show how cool you are. – Martin York Jan 09 '10 at 20:26
  • 1
    I didn't quite say that the first form was idiomatic (and I did say that the second form was more so). But idioms have to start somewhere, and I know the second form made me do a double take the first time I saw it. – jamesdlin Jan 09 '10 at 20:34
  • 1
    @Martin: as Herb Sutter has said, when convention conflicts with good practice, we have to ask ourselves whether we should discard convention in favour of good practice, or discard good practice in favour of convention. – Steve Jessop Jan 09 '10 at 22:21
  • 1
    The first version isn't *strictly* better. As the article explains, in cases where operator= is not inlined, it's likely to result in the copy code (or the call to it) being emitted at each operator= call site, rather than being emitted once in the code for operator=. For a given class that's unlikely to result in significant code bloat, but it's possible. – Steve Jessop Jan 09 '10 at 22:23
  • @SteveJessop: I now disagree with my original statement. I think the fist version has become idiomatic :-) – Martin York Feb 23 '12 at 17:30
2

I generally prefer the second one from readability and 'least surprise' point of view, however I do acknowledge that the first one can be more efficient when the parameter is a temporary.

The first one really can lead to no copies, not just the single copy and it's conceivable that this may be a genuine concern in extreme situations.

E.g. Take this test program. gcc -O3 -S (gcc version 4.4.2 20091222 (Red Hat 4.4.2-20) (GCC)) generates one call to B's copy constructor but no calls to A's copy constructor for the function f (the assignment operator is inlined for both A and B). A and B can both be taken to be very basic string classes. Allocation and copying for data would occur in the constructors and deallocation in the destructor.

#include <algorithm>

class A
{
public:
    explicit A(const char*);
    A& operator=(A val)      { swap(val); return *this; }
    void swap(A& other)      { std::swap(data, other.data); }
    A(const A&);
    ~A();

private:
    const char* data;
};

class B
{
public:
    explicit B(const char*);
    B& operator=(const B& val)  { B tmp(val); swap(tmp); return *this; }
    void swap(B& other)         { std::swap(data, other.data); }
    B(const B&);
    ~B();

private:
    const char* data;
};

void f(A& a, B& b)
{
    a = A("Hello");
    b = B("World");
}
CB Bailey
  • 755,051
  • 104
  • 632
  • 656
  • It's noticeable that neither class has any data members. Also, could you edit your example to minimise vertical white space somewhat? –  Jan 09 '10 at 20:08
  • @Neil: Data members make no difference, I just added a `long data[32];` to both structs in my test harness. In theory, the data member could be touched by the constructors which I've left undefined. The calls generated to the constructors stays the same, one call to `B::B(const B&)`, zero calls to `A::A(const A&)`. – CB Bailey Jan 09 '10 at 20:14
  • Ah, I think I see what you mean. I added another data member of class type (class C) with declared but undefined constructor, copy constructor, destructor and a void* data member. Although there was some extra exception handling code generated, and some calls to `C::~C()` in the expected places, the difference between calling `B::B(const B&)` and `A::A(const A&)`. Whether this makes any real world difference is open to debate, but the possibility is certainly there. – CB Bailey Jan 09 '10 at 20:46
  • Change your class into a simple string class (constructor, copy constructor, op= and destructor only) that uses to a dynamic array of char (only about 15 lines of code). I would do this myself, but frankly have had one too many glasses of wine. I think you will find find that copies cannot be so easily elided. –  Jan 09 '10 at 21:20
  • @Neil: I've updated the example so that A and B can be taken to be very basic string classes. The results are unchanged in terms of the copy constructor counts but I think it makes the point a bit better. – CB Bailey Jan 09 '10 at 23:32
0

Given this

foo &foo::operator=(foo other) {/*...*/ return *this;}
foo f();

in code like this

foo bar;
bar = f();

it might be easier for a compiler to eliminate the call to the copy constructor. With RVO, it could use the address of the operator's other parameter as the place for f() to construct its return value at.

It seems that this optimization is also possible for the second case, although I believe it might be harder. (Especially when the operator isn't inlined.)

sbi
  • 219,715
  • 46
  • 258
  • 445
-2

These two are actually the same. The only difference is where you press "Step In" in a debugger. And you should know in advance where to do it.

alemjerus
  • 8,023
  • 3
  • 32
  • 40
-3

I think that you might be confusing the difference between:

foo &operator=(const foo &other); and
const foo &operator=(const foo &other);

The first form should be used to allow for: (a = b) = c;

doron
  • 27,972
  • 12
  • 65
  • 103
  • 4
    I don't think he is confusing those - his question makes perfect sense. –  Jan 09 '10 at 19:48