I was reading about shallow and deep copy in Python where I ran into the following sentence in the documentation:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances)

I am confused about what is and isn't a compound object. By the above definition (objects that contain other objects), every container is a compound object: every element in a container is an object (in Python everything is an object, even integers), and the container itself is also an object. So every container with at least one element is a compound object.

If that is right, then the first part of the quote about shallow and deep copy, "The difference between shallow and deep copying is only relevant for compound objects", seems problematic, because it would imply that a shallow copy of a list of integers should not work, yet we know that it does:

import copy

pp = [1, 2, 3, 4]
qq = copy.copy(pp)  # shallow copy of the list

pp[0] = 99  # rebind the first slot of the original only

print(pp)  # [99, 2, 3, 4]
print(qq)  # [1, 2, 3, 4]

Can someone clarify the meaning of compound object?

flakes
amiref
  • Integers, strings and floats are not compound objects. – Tim Roberts Jul 12 '22 at 20:41
  • Try this. Create a list of integers (as you have done), however the *first element* is a sub-list of integers. Now, create a shallow copy and update the first element of the sub-list. Then, repeat with a deep copy. You’ll see the difference. – S3DEV Jul 12 '22 at 20:42
  • Perhaps [this post](https://stackoverflow.com/a/240205/6340496) might be of help to explain the details(?) – S3DEV Jul 12 '22 at 20:46
  • Another way to look at it: Copying is only useful for **mutable** objects. `copy` is useful for mutable objects, and `deepcopy` is useful for mutable compound objects where the children are mutable. Examples: 1) `int` doesn't need to be copied because it is **immutable**. 2) `list[int]` only needs `copy` because it is **mutable and is composed of immutable objects**. 3) `list[list[int]]` requires a `deepcopy` because it is **mutable and is composed of mutable objects**. – flakes Jul 12 '22 at 20:59
  • "every container is a compound object": yes, that is true. "would be problematic because then shallow copy for a list of integers should not work while we know that it works": that doesn't make any sense. A shallow copy "works" precisely because you have knowledge and information about the nature of the objects in your compound object; it doesn't make the original claim any less true. – juanpa.arrivillaga Jul 12 '22 at 22:41
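
The experiment S3DEV suggests in the comments above could be sketched roughly like this. It is only an illustration; the variable names (`original`, `shallow`, `deep`) are made up for the example and do not come from the question.

```python
import copy

# A list whose first element is itself a (mutable) list.
original = [[1, 2], 3, 4]

shallow = copy.copy(original)   # new outer list, same inner list
deep = copy.deepcopy(original)  # new outer list, new inner list

# Mutate the nested list through the original.
original[0][0] = 99

print(original)  # [[99, 2], 3, 4]
print(shallow)   # [[99, 2], 3, 4]  (the inner list is shared)
print(deep)      # [[1, 2], 3, 4]   (the inner list was copied)

# Rebinding a top-level slot only affects the list it is done on,
# which is why the list-of-integers example in the question "works".
original[1] = -1
print(shallow[1])  # 3
```

With a flat list of integers only the second kind of change is possible, so shallow and deep copies behave the same there.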

1 Answer

That sentence is poorly worded. What they mean is:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances) and in which any elements (at any nesting depth) are mutable.

In both shallow and deep copy, the top-level object is copied. But when you make a deep copy, all the mutable elements are also copied, and mutable elements contained within them recursively.

Also, a deep copy copies immutable compound objects (e.g. tuples) that contain nested mutable objects, e.g.

import copy

x = ([1, 2], [3, 4])
y = copy.copy(x)      # shallow copy: shares the inner lists with x
z = copy.deepcopy(x)  # deep copy: gets its own inner lists
x[0] is y[0]  # True
x[0] is z[0]  # False
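
To see why the identity check above matters in practice, here is a small sketch that mutates one of the lists inside the tuple. It reuses the names `x`, `y`, and `z` from the snippet above; the `append(99)` call is just an arbitrary mutation chosen for illustration.

```python
import copy

x = ([1, 2], [3, 4])
y = copy.copy(x)
z = copy.deepcopy(x)

# The tuple itself cannot be modified, but the lists inside it can.
x[0].append(99)

print(x)  # ([1, 2, 99], [3, 4])
print(y)  # ([1, 2, 99], [3, 4])  (shares its lists with x)
print(z)  # ([1, 2], [3, 4])      (has its own lists)
```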
Barmar
  • No, it's also relevant for example for `[bytearray(1)]` (a `bytearray` is no compound object, but it's still mutable). – Kelly Bundy Jul 12 '22 at 20:57
  • @KellyBundy What's your definition of compound here? I would have still called `bytearray` a compound object even though under the hood there is a single reference to some low-level array. – flakes Jul 12 '22 at 21:05
  • @flakes The most useful definition is probably that an object is compound if it can contain arbitrary objects. So even though strings and byte arrays allow you to index contained elements, they're considered more primitive because the container defines its contents. – Barmar Jul 12 '22 at 21:11
  • That's a good point about immutable containers with mutable components. In my mental model I would have called `([1, 2], [3, 4])` a mutable object, but it's really a mix. Is there a better word for the overall mutability of an object? E.g. to distinguish this from a hierarchy where every composed object is immutable, `((1, 2), (3, 4))`, not just the parent object. – flakes Jul 12 '22 at 21:13
  • I don't think there's common terminology for this. – Barmar Jul 12 '22 at 21:14
  • @flakes The one from the quoted documentation: *"objects that contain other objects, like lists or class instances"*. An internal "low-level array" is not an "object" in Python terms. Plus it could be integrated right there without an extra "reference" (actually *pointer*, since we're talking about C here), like for tuples. Or just imagine some other mutable object that just holds a single C int32 or so. – Kelly Bundy Jul 12 '22 at 21:18
  • @Barmar I see you removed the word "relevant" from the original quote. What difference does that make? To me it just reads less well. About your added "and any elements are mutable": That's also "poorly worded" then, since it only matters if any mutable element actually does get mutated. In other words, I think neither is poorly worded. – Kelly Bundy Jul 12 '22 at 21:24
  • @KellyBundy Just an editing mistake – Barmar Jul 12 '22 at 21:25
  • Also, the "and any elements are mutable" excludes cases like your example with tuples containing lists. Your modified quote is saying that shallow vs deep is not relevant for example for `[([1,2],3)]`. – Kelly Bundy Jul 12 '22 at 21:32
  • @flakes (correcting myself: bytearray probably can't work without internal pointer, as its size can vary, for example it has an `append` method. Never used that, I thought its size is fixed.) – Kelly Bundy Jul 12 '22 at 21:47
  • @KellyBundy I added more clarification: "at any nesting depth". – Barmar Jul 12 '22 at 22:40
  • It's really much easier to describe if you talk about the desired effect -- if there are any mutable objects reachable from the original object, deep copy means that modifying it in the original won't affect the copy. – Barmar Jul 12 '22 at 22:41
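
The two edge cases raised in these comments, `[bytearray(1)]` and `[([1,2],3)]`, can be checked with a short sketch along the following lines. The variable names are invented for the example, and it only illustrates the "reachable mutable object" description above rather than settling what counts as a compound object.

```python
import copy

# Kelly Bundy's example: a list holding one mutable object
# that does not itself contain other Python objects.
data = [bytearray(1)]
shallow = copy.copy(data)
deep = copy.deepcopy(data)

data[0][0] = 255  # mutate the bytearray in place

print(shallow[0])  # bytearray(b'\xff')  (same bytearray as in data)
print(deep[0])     # bytearray(b'\x00')  (independent bytearray)

# A mutable list reachable only through an immutable tuple.
nested = [([1, 2], 3)]
shallow_nested = copy.copy(nested)
deep_nested = copy.deepcopy(nested)

nested[0][0].append(99)

print(shallow_nested)  # [([1, 2, 99], 3)]
print(deep_nested)     # [([1, 2], 3)]
```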