11

In c++ when classes contains dynamically allocated data it is usually reasonable to explicitly define copy constructor, operator= and destructor. But the activity of these special methods overlaps. More specifically operator= usually first does some destruction and then it does coping similar to the one in copy constructor.

My question is how to write this the best way without repeating the same lines of code and without the need for processor to do unnecessary work (like unnecessary copying).

I usually end up with two helping methods. One for construction and one for destruction. The first is called from both copy constructor and operator=. The second is used by destructor and operator=.

Here is the example code:

    template <class T>
    class MyClass
    {
        private:
        // Data members
        int count;
        T* data; // Some of them are dynamicly allocated
        void construct(const MyClass& myClass)
        {
            // Code which does deep copy
            this->count = myClass.count;
            data = new T[count];
            try
            {
                for (int i = 0; i < count; i++)
                    data[i] = myClass.data[i];
            }
            catch (...)
            {
                delete[] data;
                throw;
            }
        }
        void destruct()
        {
            // Dealocate all dynamicly allocated data members
            delete[] data;
        }
        public: MyClass(int count) : count(count)
        {
            data = new T[count];
        }
        MyClass(const MyClass& myClass)
        {
            construct(myClass);
        }
        MyClass& operator = (const MyClass& myClass)
        {
            if (this != &myClass)
            {
                destruct();
                construct(myClass);
            }
            return *this;
        }
        ~MyClass()
        {
            destruct();
        }
    };

Is this even correct? And is it a good habit to split the code this way?

David G
  • 94,763
  • 41
  • 167
  • 253
John Bumper
  • 301
  • 1
  • 8
  • +1 because the question helped raise my awareness. It looks like something I would have written, before reading the answers. – ToolmakerSteve Jul 10 '13 at 10:26
  • Hm, I rarely have duplicated code in both, since they both do entirely different things: one intializes, one assigns.... – PlasmaHH Jul 10 '13 at 10:35
  • its the "deep copy" nature of his class design that leads to duplication. – ToolmakerSteve Jul 10 '13 at 10:37
  • @PlasmaHH It depends. Think of a simple string or vector class, using deep copy semantics. (Whether the amount of duplicated code is enough to justify additional functions is a different question. If it's just a simple `new`, it may not be worth the bother of the separate function.) – James Kanze Jul 10 '13 at 10:38
  • 1
    [This is what I would have done](http://coliru.stacked-crooked.com/view?id=279d695a390110d821bf2566ec6855fd-f674c1a6d04c632b71a62362c0ccfc51) `assign`, `clear`, and `swap` do all the work. – Mooing Duck Jul 13 '13 at 00:04
  • Mooing Duck just placed a link but comes closest to answering the actual question, which is not about the copy-swap idiom but about avoiding code duplication – Fred Schoen Apr 09 '15 at 19:00

3 Answers3

7

One initial comment: the operator= does not start by destructing, but by constructing. Otherwise, it will leave the object in an invalid state if the construction terminates via an exception. Your code is incorrect because of this. (Note that the necessity to test for self assignment is usually a sign that the assignment operator is not correct.)

The classical solution for handling this is the swap idiom: you add a member function swap:

void MyClass:swap( MyClass& other )
{
    std::swap( count, other.count );
    std::swap( data, other.data );
}

which is guaranteed not to throw. (Here, it just swaps an int and a pointer, neither of which can throw.) Then you implement the assignment operator as:

MyClass& MyClass<T>::operator=( MyClass const& other )
{
    MyClass tmp( other );
    swap( tmp );
    return *this;
}

This is simple and straight forward, but any solution in which all operations which may fail are finished before you start changing the data is acceptable. For a simple case like your code, for example:

MyClass& MyClass<T>::operator=( MyClass const& other )
{
    T* newData = cloneData( other.data, other.count );
    delete data;
    count = other.count;
    data = newData;
    return *this;
}

(where cloneData is a member function which does most of what your construct does, but returns the pointer, and doesn't modify anything in this).

EDIT:

Not directly related to your initial question, but generally, in such cases, you do not want to do a new T[count] in cloneData (or construct, or whatever). This constructs all of the T with the default constructor, and then assigns them. The idiomatic way of doing this is something like:

T*
MyClass<T>::cloneData( T const* other, int count )
{
    //  ATTENTION! the type is a lie, at least for the moment!
    T* results = static_cast<T*>( operator new( count * sizeof(T) ) );
    int i = 0;
    try {
        while ( i != count ) {
            new (results + i) T( other[i] );
            ++ i;
        }
    } catch (...) {
        while ( i != 0 ) {
            -- i;
            results[i].~T();
        }
        throw;
    }
    return results;
}

Most often, this will be done using a separate (private) manager class:

//  Inside MyClass, private:
struct Data
{
    T* data;
    int count;
    Data( int count )
        : data( static_cast<T*>( operator new( count * sizeof(T) ) )
        , count( 0 )
    {
    }
    ~Data()
    {
        while ( count != 0 ) {
            -- count;
            (data + count)->~T();
        }
    }
    void swap( Data& other )
    {
        std::swap( data, other.data );
        std::swap( count, other.count );
    }
};
Data data;

//  Copy constructor
MyClass( MyClass const& other )
    : data( other.data.count )
{
    while ( data.count != other.data.count ) {
        new (data.data + data.count) T( other.date[data.count] );
        ++ data.count;
    }
}

(and of course, the swap idiom for assignment). This allows multiple count/data pairs without any risk of loosing exception safety.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • This is something revolutionary worth more than one + - "Note that the necessity to test for self assignment is usually a sign that the assignment operator is not correct." – SChepurin Jul 10 '13 at 10:32
  • @James Kanze: One case (which a coworker ran into) is when your assignment operator has to do a memcpy on one of its resources. In that case self-assignment becomes a necessity, no? – ForeverLearning Jul 10 '13 at 19:06
  • @Dilip Or he could do `memmove`, or (maybe) `std::copy` or maybe just an assignment between POD structs. I can't think of any case where `memcpy` is "required", or even where I would use it in C++. (In practice, `memcpy` will work if both addresses passed to it are identical, but formally, it's undefined behavior.) – James Kanze Jul 11 '13 at 08:00
0

Implement the assignment by first copying the right-hand side and then swapping with that. This way you also get exception safety, which your code above doesn't provide. You could end up with a broken container when construct() fails after destruct() succeeded otherwise, because the member pointer references some deallocated data, and on destruction that will be deallocated again, causing undefined behaviour.

foo&
foo::operator=(foo const& rhs)
{
   using std::swap;
   foo tmp(rhs);
   swap(*this, tmp);
   return *this;
}
juanchopanza
  • 223,364
  • 34
  • 402
  • 480
doomster
  • 171
  • 2
  • If construct fails, why is it better to end up with the old (and no longer intended) contents, instead of with an empty container? IMHO it would be cleaner to have a failed copy result in an empty container. Specifically, an empty container is easy to detect in later code; a container that has contents IT SHOULD NO LONGER HAVE is harder to detect later. – ToolmakerSteve Jul 10 '13 at 10:06
  • Assuming you want to terminate the program in case of such failure, either way works fine, but you still need to set `data = nullptr` (which the original code fails to do). – riv Jul 10 '13 at 10:10
  • @ToolmakerSteve It depends on what you want. The swap idiom ensures full transactional integrity; you either succeed, or nothing changes. In most cases, full transactional integrity isn't needed for individual objects; all that you must guarantee is that the object can be destructed in case of failure. (Trying to guarantee any actual state other than unchanged is probably not very useful.) – James Kanze Jul 10 '13 at 10:13
  • thx, yeah I just realized I was thinking of a different situation, where the end result is a null pointer. I understand what is being said now; as written, the end result was an object that had contents, but incomplete contents. – ToolmakerSteve Jul 10 '13 at 10:13
  • @ToolmakerSteve it is to provide the strong exception safety guarantee: leave everything the way it was before the exception. Whoever is handling the exception decides what action to take after that. – juanchopanza Jul 10 '13 at 10:16
  • 2
    @juanchopanza Even when you don't actually need the strong guarantee, the swap idiom is often the simplest and cheapest way to get the minimum guarantee. – James Kanze Jul 10 '13 at 10:19
0

I don't see any inherent problem with that, as long as you make sure not to declare construct or destruct virtual.

You might be interested in chapter 2 in Effective C++ (Scott Meyers), which is completely devoted to constructors, copy operators and destructors.

As for exceptions, which your code is not handling as it should, consider items 10 & 11 in More effective C++ (Scott Meyers).

Stefano Falasca
  • 8,837
  • 2
  • 18
  • 24
  • 1
    Except that it's not exception safe. If the `new` in `construct` throws (or the copying throws), then the object is left in an incoherent state, in which destructing it will cause undefined behavior. – James Kanze Jul 10 '13 at 10:10
  • @JamesKanze Of course you are right, but the question is about a technique for avoiding code duplication, and this technique doesn't have inherent problems, I think. – Stefano Falasca Jul 10 '13 at 10:12
  • It doesn't have inherent problems, _except_ that it doesn't work. Exception safety is not an optional feature; it is essential if the program is to be correct. – James Kanze Jul 10 '13 at 10:14