The following "minimal" example is meant to demonstrate the rule of three (and a half).

#include <algorithm>
#include <iostream>
#include <string>

class C
{
    std::string* str;
public:
    C()
        : str(new std::string("default constructed"))
    {
        std::cout << "std ctor called" << std::endl;
    }
    C(std::string* _str)
        : str(_str) 
    {
        std::cout << "string ctor called, "
            << "address:" << str << std::endl;
    }
    // copy ctor: does a hard copy of the string
    C(const C& other)
        : str(new std::string(*(other.str)))
    {
        std::cout << "copy ctor called" << std::endl;
    }

    friend void swap(C& c1, C& c2) {
        using std::swap;
        swap(c1.str, c2.str); 
    }

    const C& operator=(C src) // rule of 3.5
    {
        using std::swap;
        swap(*this, src);
        std::cout << "operator= called" << std::endl;
        return *this;
    }

    C get_new() {
        return C(str);
    }
    void print_address() { std::cout << str << std::endl; }
};

int main()
{
    C a, b;
    a = b.get_new();
    a.print_address();
    return 0;
}

Compiled it like this (g++ version: 4.7.1):

g++ -Wall test.cpp -o test

Now, what should happen? I assumed that the line a = b.get_new(); would make a hard copy, i.e. allocate a new string. Reason: operator=() takes its argument by value, as is typical for this idiom, which invokes the copy ctor, which makes a deep copy. What really happened?

std ctor called
std ctor called
string ctor called, address:0x433d0b0
operator= called
0x433d0b0

The copy ctor was never called, and thus the copy was soft: both pointers were equal. Why is the copy ctor not being called?

Johannes
  • It's allowed to be elided. – chris Oct 11 '13 at 19:09
  • @chris Allowed by return value optimization? How can I force it to be called? – Johannes Oct 11 '13 at 19:10
  • @Johannes You can't, it's part of the language. You must code the class (especially the copy ctor) so that it can be elided. Which in your case means `get_new()` must pass a newly allocated string to the ctor it invokes. – Angew is no longer proud of SO Oct 11 '13 at 19:14
  • It's not being returned, so not RVO, but parameters by value are permitted not to be copied as well. There's something like `-fno-elide-copies` for GCC. – chris Oct 11 '13 at 19:15

3 Answers


The copies are being elided.

There's no copy because b.get_new() constructs its 'temporary' C object exactly in the location that ends up being the parameter for operator=. The language explicitly permits the compiler to elide these copies, even when the copy constructor has visible side effects such as printing.

You can disable constructor elision in clang and gcc with the flag -fno-elide-constructors, and then the output will look like:

std ctor called
std ctor called
string ctor called, address:0x1b42070
copy ctor called
copy ctor called
operator= called
0x1b420f0

The first copy is eliminated by the Return Value Optimization. With RVO the function constructs the object that is eventually returned directly into the location where the return value should go.

I'm not sure that there's a special name for the elision of the second copy. That's the copy from the return value of get_new() into the parameter of operator=().

As I said before, eliding both copies together results in get_new() constructing its object directly in the space for the parameter of operator=().
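To make the two elided copies countable rather than inferring them from printed output, here is a minimal sketch. Tracked, make_tracked() and copies_for_assignment_from_temporary() are hypothetical names introduced only for this illustration; they are not part of the question's code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's class C: it counts how often
// the copy constructor actually runs.
struct Tracked {
    static int copies;
    std::string value;

    explicit Tracked(std::string v) : value(std::move(v)) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }

    // Parameter taken by value, as in the copy-and-swap idiom.
    Tracked& operator=(Tracked src) {
        std::swap(value, src.value);
        return *this;
    }
};
int Tracked::copies = 0;

// Returns a temporary, like get_new() in the question.
Tracked make_tracked() { return Tracked("temporary"); }

// Assigns from a temporary and reports how many copies were made.
int copies_for_assignment_from_temporary() {
    Tracked::copies = 0;
    Tracked target("target");
    target = make_tracked();  // up to two copies may be elided here
    assert(target.value == "temporary");
    return Tracked::copies;
}
```

With elision enabled I would expect the function to report no copies at all; with -fno-elide-constructors in a pre-C++17 mode it can report up to two. As far as I know, C++17 makes this particular elision mandatory, so the count does not depend on optimization settings there.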


Note that both pointers being equal, as in:

std ctor called
std ctor called
string ctor called, address:0xc340d0
operator= called
0xc340d0

does not itself indicate an error, and it will not cause a double free. Because the copy was elided, there isn't an additional copy of that object retaining ownership of the allocated string, so there won't be an additional free.

However, your code does contain an error unrelated to the rule of three: get_new() passes the object's own str pointer to the C(std::string*) constructor, so the object it creates (at the line "string ctor called, address:0xc340d0" in the output) takes ownership of a string already managed by the original object (b). Both b and the object created inside get_new() then try to manage the same string, which would result in a double free if the destructor were implemented.

To see this, change the default constructor to display the str it creates:

C()
    : str(new std::string("default constructed"))
{
    std::cout << "std ctor called. Address: " << str << std::endl;
}

And now the output will be like:

std ctor called. Address: 0x1cdf010
std ctor called. Address: 0x1cdf070
string ctor called, address:0x1cdf070
operator= called
0x1cdf070

So there's no problem with the last two pointers printed being the same. The problem is that the second and third pointers printed are the same. Fixing get_new():

C get_new() {
    return C(new std::string(*str));
}

changes the output to:

std ctor called. Address: 0xec3010
std ctor called. Address: 0xec3070
string ctor called, address:0xec30d0
operator= called
0xec30d0

and solves any potential problem with double frees.
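For completeness, here is a sketch of how the whole class might look with the fixed get_new() and a destructor added. The class name Owner, the destructor, and the accessors address() and text() are my additions for illustration; the question's class has none of them.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical variant of the question's class. The destructor and the two
// accessors are assumptions added for this sketch.
class Owner {
    std::string* str;
public:
    Owner() : str(new std::string("default constructed")) {}

    // Takes ownership of the string passed in.
    explicit Owner(std::string* s) : str(s) { assert(s != nullptr); }

    // Deep copy: allocates a new string.
    Owner(const Owner& other) : str(new std::string(*other.str)) {}

    ~Owner() { delete str; }

    friend void swap(Owner& a, Owner& b) {
        using std::swap;
        swap(a.str, b.str);
    }

    // Copy-and-swap: src owns the old string afterwards and frees it.
    Owner& operator=(Owner src) {
        swap(*this, src);
        return *this;
    }

    // Hands the new object a freshly allocated string, never this->str.
    Owner get_new() const { return Owner(new std::string(*str)); }

    const std::string* address() const { return str; }
    const std::string& text() const { return *str; }
};
```

With this version, after `Owner a, b; a = b.get_new();` the two objects hold different strings with equal contents, so each destructor frees its own allocation whether or not the copies are elided.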

bames53
  • Thanks. Instead, I tried Wikipedia's implementation for `operator=` in their description of [the rule of three](http://en.wikipedia.org/wiki/Rule_of_three_%28C%2B%2B_programming%29). This works now. – Johannes Oct 11 '13 at 19:39
  • @Johannes If you were running into a problem with double frees, that has nothing to do with any problem in your original implementation of the rule of three here. The problem is in `get_new()`, where you pass `str` into the new object you're creating. The `C(std::string*)` constructor takes ownership of the string you pass in, but since in `get_new()` you pass in a string that's already owned, you end up with a string owned by two objects. The correct implementation of `get_new()` would be like: `C get_new() { return C(new std::string(*str)); }` – bames53 Oct 11 '13 at 19:56
  • True, though this was only a simple test without any `free`s/`delete`s. In my original approach, I have a shared pointer. The `get_new` was really supposed to work like that. Probably a bad example for the rule of 3. – Johannes Oct 11 '13 at 20:11
  • @Johannes Okay, if you're using `shared_ptr` and the problem isn't a double free but a failure to get a deep copy when you want one, I think the problem is still in `get_new()`. `get_new()` is explicitly creating a shallow copy by using `C(string*)`, and you're relying on some implicit copies to do a deep copy of that shallow copy. If you want a deep copy I think it's better to explicitly make one right there in `get_new()` instead of trying to foil copy elision. – bames53 Oct 11 '13 at 20:40
  • In particular it seems perfectly reasonable to me that assignment would not always result in a copy being made; if the compiler can determine that the source of the assignment is a temporary object then stealing its guts (or just eliding copies) ought to be a good optimization. – bames53 Oct 11 '13 at 20:48
  • Yes, I agree. However, I think if we use an explicit copy ctor in `operator=` (like on Wikipedia), then I don't see anything bad about letting `get_new()` make a soft copy. – Johannes Oct 12 '13 at 04:30
  • @Johannes If you're guaranteed to always use them together then there's no functional difference, but you may want them to work independently. For example should `C &&a = b.get_new();` result in a shallow copy? Should `C a; a = C();` result in an extra allocation or should the compiler be able to optimize it? – bames53 Oct 12 '13 at 05:26

C++ is allowed to optimize away copy construction in functions that are returning a class instance.

What happens in get_new is that the object freshly constructed from the str member is returned directly and is then used as the source for the assignment. This is called "Return Value Optimization" (RVO).

Note that while the compiler is free to optimize away a copy construction, it is still required to check that the copy constructor can legally be called. If, for example, instead of a member function you have a non-friend function returning an instance and the copy constructor is private, then you get a compiler error, even though the copy could end up optimized away once the constructor is made accessible.

6502

It isn't exactly clear why you expect the copy ctor to be used. The get_new() function will not create a new copy of the C object when it returns the value. This is an optimization called Return Value Optimization, and practically every C++ compiler implements it.

Hans Passant
  • Does this mean that for the rule of three, you can never rely on the `operator=` to do hard copies in general? – Johannes Oct 11 '13 at 19:22
  • RVO is very specific to returning values from functions and has no relevance to the rule of three. – Hans Passant Oct 11 '13 at 19:24
  • Oh, indeed, Wikipedia implements `operator=` with a const reference and then calls the copy ctor explicitly. [This example here](http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom) does not seem to work with this kind of optimization. – Johannes Oct 11 '13 at 19:29
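
The variant mentioned in the last comment, an operator= that takes a const reference and calls the copy ctor explicitly, can be sketched roughly as follows. DeepAssign, make_deep() and the copies counter are hypothetical names for this illustration, and this is not a quote of Wikipedia's code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class (not the question's C) whose assignment operator takes
// a const reference and copies explicitly.
class DeepAssign {
    std::string* str;
public:
    static int copies;  // counts copy-constructor calls

    explicit DeepAssign(const std::string& s) : str(new std::string(s)) {}
    DeepAssign(const DeepAssign& other)
        : str(new std::string(*other.str)) { ++copies; }
    ~DeepAssign() { delete str; }

    DeepAssign& operator=(const DeepAssign& other) {
        DeepAssign tmp(other);    // named local copied from a reference: not elidable
        std::swap(str, tmp.str);  // tmp now owns the old string and frees it
        return *this;
    }

    const std::string& text() const { return *str; }
};
int DeepAssign::copies = 0;

// Returns a temporary, like get_new() in the question.
DeepAssign make_deep() { return DeepAssign("from temporary"); }
```

Here `target = make_deep();` always runs the copy ctor at least once, because tmp is a named object initialized from a reference. The trade-off is the one raised in the comments above: assigning from a temporary then pays for an allocation that the by-value version could avoid.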