Should I provide a deep clone when implementing ICloneable?

Question

It is unclear to me from the MSDN documentation if I should provide a deep or a shallow clone when implementing ICloneable. What is the preferred option?

score 15 · Accepted Answer · answered Sep 30 '08 at 02:19

Short answer: Yes.

Long Answer: Don't use ICloneable. That is because .Clone isn't defined as being a shallow or a deep clone. You should implement your own IClone interface, and describe how the clone should work.
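
A minimal sketch of what such an interface could look like. The names `IShallowCloneable<T>`, `IDeepCloneable<T>`, `Customer`, and `Order` are made up for illustration and are not part of .NET; the point is only that the method name and its documentation state which kind of copy the caller gets.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interfaces (not part of .NET): the name states the copy semantics.
public interface IShallowCloneable<T>
{
    // Returns a new object whose reference-typed members are shared with the original.
    T ShallowClone();
}

public interface IDeepCloneable<T>
{
    // Returns a new object that shares no mutable state with the original.
    T DeepClone();
}

public sealed class Customer : IDeepCloneable<Customer>
{
    public string Name { get; set; } = "";

    public Customer DeepClone() => new Customer { Name = Name };
}

public sealed class Order : IShallowCloneable<Order>, IDeepCloneable<Order>
{
    public Customer Customer { get; set; } = new Customer();
    public List<string> Items { get; set; } = new List<string>();

    // MemberwiseClone copies fields only, so Customer and Items stay shared.
    public Order ShallowClone() => (Order)MemberwiseClone();

    // Copies the nested Customer and builds a new list.
    // Strings are immutable, so copying the list itself is enough.
    public Order DeepClone() => new Order
    {
        Customer = Customer.DeepClone(),
        Items = new List<string>(Items)
    };
}
```

With this in place, a caller who changes `order.ShallowClone().Customer.Name` also changes the original order's customer, while a change made through `order.DeepClone()` leaves the original untouched.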

MagicKat

Robert Gould · Answer 2 · 2008-09-30T02:25:29.583

1

Clones are deep by default; that's the naming convention. Copy constructors can be shallow if they want, for performance reasons.

Edit: This naming convention goes beyond language boundaries; it's the same for .NET, Java, C++, JavaScript, etc. The actual source is beyond my knowledge, but it's part of the standard object-oriented lexicon, just like objects and classes. Thus MSDN doesn't specify the implementation because it's a given from the word itself. (Of course, lots of newcomers to OO languages don't know this, and MSDN SHOULD specify it, but then again their documentation is quite frugal anyway.)

edited Sep 30 '08 at 02:25

answered Sep 30 '08 at 02:18

Robert Gould

Could you provide more detail? Whose naming convention? – Jim Burger Sep 30 '08 at 02:20
2

That isn't correct. MSDN docs don't say HOW Clone should be implemented. – MagicKat Sep 30 '08 at 02:20

score 1 · Answer 3 · answered Jan 01 '14 at 20:48

Given the way an object is defined, there shouldn't be any question about "deep cloning" versus "shallow cloning". If an object encapsulates the identities of things, a clone of the object should encapsulate the identities of the same things. If an object encapsulates the values of mutable objects, a copy should encapsulate detached mutable objects holding the same values.

Unfortunately, neither .NET nor Java includes in the type system whether references are held to encapsulate identity, mutable value, both, or neither. Instead, they just use a single reference type and figure that code which owns the only copy of a reference, or owns the only reference to a container which holds the only copy of that reference, may use that reference to encapsulate either value or state. Such thinking might be tolerable for individual objects, but poses real problems when it comes to things like copying and equality testing operations.

If a class has a field Foo which encapsulates the state of a List<Bar> which is to encapsulate the identities of objects therein, and may in future encapsulate different objects' identities, then a clone of the Foo should hold a reference to a new list which identifies the same objects. If the List<Bar> is used to encapsulate the mutable states of the objects, then a clone should have a reference to a new list which identifies new objects that have the same state.
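
To make the two cases concrete, here is a rough sketch. The types `Bar`, `FooOfIdentities`, and `FooOfStates` are hypothetical and exist only for this illustration; since the type system cannot express the difference, each class simply hard-codes the behavior its list is meant to have.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable element type used by both cases.
public sealed class Bar
{
    public int Value { get; set; }

    public Bar Copy() => new Bar { Value = Value };
}

// Case 1: the list encapsulates the identities of the Bar objects.
public sealed class FooOfIdentities
{
    public List<Bar> Bars { get; } = new List<Bar>();

    public FooOfIdentities Clone()
    {
        var copy = new FooOfIdentities();
        copy.Bars.AddRange(Bars); // new list, same Bar objects
        return copy;
    }
}

// Case 2: the list encapsulates the mutable state of the Bar objects.
public sealed class FooOfStates
{
    public List<Bar> Bars { get; } = new List<Bar>();

    public FooOfStates Clone()
    {
        var copy = new FooOfStates();
        copy.Bars.AddRange(Bars.Select(b => b.Copy())); // new list, new Bar objects
        return copy;
    }
}
```

In both cases the clone gets its own list, so adding or removing items in one does not affect the other. The difference is whether a change to an individual `Bar` is visible through both objects (case 1) or only through the one it was made on (case 2).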

If objects included separate "equivalent" and "equals" methods, with hashcodes for each, and if for each heap object type there were reference types that were denoted as encapsulating identity, mutable state, both, or neither, then 99% of equality testing and cloning methods could be handled automatically. Two aggregates are equal if all components which encapsulate identity or mutable state are equivalent (not merely equal) and those that encapsulate neither are at least equal; two aggregates are equivalent only if all corresponding components are and always will be equivalent [this often implies reference equality, but not always]. Copying an aggregation requires making a detached copy of each constituent that encapsulates mutable state, copying the reference to each constituent that encapsulates identity, and doing either of the above for those which encapsulate neither; an aggregation with a constituent which encapsulates both mutable state and identity cannot be cloned simply.

There are a few tricky cases that such rules for cloning, equality, and equivalence wouldn't handle properly, but if there were a convention to distinguish a List<IdentityOfFoo> from a List<MutableStateOfFoo>, and to support both "equivalent" and "equals" tests, 99% of objects could have Clone, Equals, Equivalent, EqualityHash, and EquivalenceHash auto-generated and work correctly.
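
As a rough illustration of the "equals" versus "equivalent" distinction for a single mutable object (the `Counter` type and both method names are hypothetical, not an existing .NET API):

```csharp
// Hypothetical mutable type used to contrast the two tests.
public sealed class Counter
{
    public int Count { get; set; }

    // "Equals" in the sense above: both objects hold the same value right now.
    public bool ValueEquals(Counter other) => other != null && Count == other.Count;

    // "Equivalent" in the sense above: for a mutable object, only the very same
    // instance is guaranteed to remain equal after any future mutation.
    public bool IsEquivalentTo(Counter other) => ReferenceEquals(this, other);
}
```

Two separate `Counter` instances with the same `Count` pass `ValueEquals` but not `IsEquivalentTo`, because incrementing one of them would break the equality. This sketch uses reference identity as the equivalence test, which is the common case but, as noted above, not the only one.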

+1, good ideas. If I might add my 2 cents, switching to immutable objects (and immutable collections) easily resolves a lot of these kinds of problems. Once you get rid of mutable state, identity is no longer important. – Anton Tykhyy Jan 01 '14 at 21:22
@AntonTykhyy: It's often useful to be able to have a constant identity associated with something that isn't constant; pushing mutable state to the level of the identity is often more useful than confining all mutations to a higher level. For example, if one had a list of all the cars in a state and their whereabouts, even if one could instantaneously replace it with a list that was identical except that the car associated with ID "177-1234" was replaced with a new car whose position was different, code that wanted to continuously know the whereabouts of that car would... – supercat Jan 01 '14 at 21:30
...have to repeatedly ask the list for it. By contrast, if cars' positions are held in mutable objects, one can simply identify the car object once, and then repeatedly ask that same car for its current position. – supercat Jan 01 '14 at 21:30
What's needed IMHO is language recognition that certain aspects of what's encapsulated by a reference should be included in the type system. Such a system shouldn't strive for absolutely perfect expressiveness, but should at least handle the common cases. Much as people moan about const-correctness in C++, what it largely does is expose problems in underlying designs which are apt to manifest themselves in the form of bugs if not in the form of compiler squawks. – supercat Jan 01 '14 at 21:39
