Deep-copying a structure containing reference members (C++)

Question

I have a C++ class that contains objects that have reference members pointing to other objects within the class. This seemed like a good idea at the time, but now I need to implement a deep copy of the whole thing, and I can't see a way to do that that doesn't feel like a clunky hack.

A simplified version of my code looks like this. The question is about writing a copy constructor for A.

class C {
    int x, y, z; // nothing complicated stored in this class
public:
    // constructor and other methods
};

class B {
    C &c1;
    C &c2;
public:
    // constructor and other methods
};

class A {
    C *c_array;
    B *b_array; // for each item in this array, 
                // its 'c1' and 'c2' fields point to members of c_array.
public:
    // constructor and other methods
};

A few people have asked how this structure is initialised, so please let me stress that this is irrelevant for answering the question. The initialisation will always satisfy the requirement that the reference members of the items in b_array point to items in c_array, but beyond that the data could be anything. It is important that the copy constructor work for any data that satisfies this property. This is not a problem that can be solved by reusing the existing initialisation code.

The problem is that if I just copy the objects in b_array, their reference members will point to the C objects in the old instance of A. I need to make them point to the corresponding items in the new instance. The only way I can think to do that is this:

for each element of b_array, get the address that its reference member points to, and store that in a pointer
work out the index into the array that that pointer corresponds to using pointer arithmetic
use this index to initialise the reference member of the corresponding element of the new b_array.

My question is, is there a cleaner / simpler / more elegant way? If there isn't, I will just refactor my design to use array indices instead of references throughout.

Perhaps I shouldn't have used reference members - I know some people say it's always better to use pointers. If I had used pointers instead of references, would there be a better solution to this problem? (I can't see one but I don't know.)

Interesting question! All of the 'deep copy' questions I have seen have had the reference structure known at compile time and hard-coded in the copy logic. — Graham Griffiths, Mar 05 '14 at 10:25
could you replace both of b_array and c_array with a single map ? — Graham Griffiths, Mar 05 '14 at 10:31
@GrahamGriffiths my actual `B` class stores multiple references to members of `c_array`. I suppose it could be implemented as several `map`s, but the real `B` is a class hierarchy that implements other code, so it would be a bit tricky to re-factor. (I also tend to avoid STL code, mostly out of habit I suppose.) — N. Virgo, Mar 05 '14 at 10:36
Without seeing how you currently initialize those arrays, it's hard to give any concrete suggestion — Useless, Mar 05 '14 at 11:35
@Useless it doesn't matter how they're initialised - the copy constructor doesn't need to know that. — N. Virgo, Mar 05 '14 at 11:38
The copy constructor needs to initialize the copy, and you have some existing code to initialize the originals. All the important details are coming out in dribs and drabs in the comments, which makes it hard to guess what a solution needs to provide, or what existing (un-shown) code it could reuse — Useless, Mar 05 '14 at 11:41
@Useless all the relevant information is in the question. I've offered a few clarifications in the comments, but those were clarifications of things people didn't understand, not information I left out originally. — N. Virgo, Mar 05 '14 at 11:42

score 1 · Answer 1 · edited May 23 '17 at 11:57

1

A deep copy using references will have to first copy the values (c_array) and then store references to those new values. I cannot see a way of achieving this other than the algorithm you describe.

Using pointers instead of references will not change this. There are various comparisons of pointers vs references. Pointers are less tricky to initialise (create as null and assign when you are ready) and more dangerous to use (might still be null). But you will still have to copy the objects, then find + copy the links to them.

I cannot see a simpler / more elegant way than using array indices. With array indices you will just copy the arrays value by value, and the structure of which index points to which object will be taken care of for you.

edited May 23 '17 at 11:57

Community

1
1

answered Mar 05 '14 at 10:22

Graham Griffiths

2,196
1
12
15

The only trouble with array indices is that if I replace `c_array` with something more mutable like a `std::list` then the indices all have to be updated when its contents change, and that would lead to hard-to-track-down errors if they get out of sync. (Currently my code doesn't do that, but I might well want to do that in the future.) Of course, I now see that the implementation based on reference members also has ways to introduce slippery errors. Your answer is probably right, but I'll leave it open in case someone has a clever solution. – N. Virgo Mar 05 '14 at 10:43
I think pointer arithmetic is better than `std::find`, because `std::find` has to do a linear search, whereas pointer arithmetic could tell me the answer in a single operation. I already have the memory address, so I don't need to search for the element, just subtract it from the value of the `c_array` pointer. – N. Virgo Mar 05 '14 at 10:50

mockinterface · Answer 2 · 2014-03-05T12:19:58.960

1

You can provide an assignment operator (and a copy ctor) for A that deals with the change of both c_array and b_array in tandem, assuming that B can handle the assignment of C as just a reference/pointer update.

It can be along the lines of:

struct B { ... B& operator=(const C& c) { this->c = &c; return *this; } };

struct A { ...
    A& operator=(const A& a) {
        c_array = a.c_array;
        b_array = a.b_array;
        // re-assign pointers/references only using B's `operator=(const C&)` :
        std::copy_n(c_array.begin(), c_array.size(), b_array.begin());
        return *this;
    }
};

A bit messy, but: see live example

Note that if you comment out the std::copy_n line, you can of course observe in the output that the copy isn't detached, that is, the b_array of the copy points to the original's c_array instead of its own.

edited Mar 05 '14 at 12:19

answered Mar 05 '14 at 12:09

mockinterface

14,452
5
28
49

Thanks, it's a nice idea, but unfortunately it isn't possible in my case, because `B` contains more than one reference to a `C`, as well as other data. (I had tried to reduce the problem down to the simplest version, but I now realise I should have made it clear that `B` contains other data - I've updated the example code in the question. Sorry about that.) – N. Virgo Mar 06 '14 at 07:26
It's understood that `B` contains other data, but that's what the plain assignment operator and the copy ctor of `B` are for. My point was that after you've implemented both the op and the copy ctor, follow it up by a number of reference-only rewrites using the technique above - combining a simple `std::copy` with a custom `operator=` to spare the tediousness of manually re-assigning all the references, I'm arguing for plain `B::operator=(const B&)` together with `B::operator=(const C&)`. – mockinterface Mar 06 '14 at 08:53

Deep-copying a structure containing reference members (C++)

2 Answers2