My understanding of deep copies is that they replace references to objects with references to newly created copies of those objects.

Consider this:

>>> from copy import deepcopy
>>> o = [1, 2, 3]
>>> l = [o]
>>> c = deepcopy(l)
>>> c[0] is l[0]
False

Compared to this:

>>> o = (1, 2, 3)
>>> l = [o]
>>> c = deepcopy(l)
>>> c[0] is l[0]
True

Why is the behaviour different?

Ben Smyth
  • Tuples are immutable; lists are not. – Scott Hunter May 31 '22 at 18:40
  • `deepcopy` is a red herring. `(1,2,3) is (1,2,3)` returns true, and `[1,2,3] is [1,2,3]` returns false. – John Gordon May 31 '22 at 18:42
  • I see. And that's because of immutability, right? – Ben Smyth May 31 '22 at 18:47
  • _Sort of_. It's because they're not the same reference (which is what `is` compares, unlike `==`, i.e. `.__eq__`), while it's (presumably) an easy optimization in CPython (or whichever implementation you're using) to increment the reference count and refer to the same immutable tuple. – ti7 May 31 '22 at 18:49
  • "Why is the behaviour different?" Because `deepcopy` never copies immutable built-ins as an optimization. You should *never care* what `x is y` returns when `x` and `y` are immutable types. – juanpa.arrivillaga May 31 '22 at 18:50
  • `copy.deepcopy` is a no-op for tuples only if the tuple contains immutable objects. If you use `o = ([1], [2], [3])` in your second example, then `c[0] is l[0]` returns False. [Source](https://github.com/python/cpython/blob/3.7/Lib/copy.py#L220-L234). – matias May 31 '22 at 18:52
  • Thank you all, makes good sense! – Ben Smyth May 31 '22 at 18:54
  • Further, whether or not `deepcopy` makes that particular optimization shouldn't matter; don't write code that *depends* on it. The set of actual use cases for `is` is fairly small and limited to much simpler situations than the one shown here. – chepner May 31 '22 at 18:59
  • The difference between shallow and deep copying only matters if you want to modify a nested object and not have it reflect in the copy. But since you can't modify a tuple, this makes no difference, so as an optimization it doesn't bother to copy the tuple. – Barmar May 31 '22 at 19:48
  • What about frozenset? It's immutable but it returns a different object when using `deepcopy`. – Ben Smyth May 31 '22 at 20:07
  • @BenSmyth Then perhaps the optimization is simply not being made. This may even vary between versions of Python, or depending on whether you run the code in the interpreter or as a script. You should use `==` unless you are explicitly curious about whether something is the _same reference_, or you are comparing to a singleton (`is True`, `is None`, ...). – ti7 May 31 '22 at 20:26
  • Does this answer your question? [Is there a difference between "==" and "is"?](https://stackoverflow.com/questions/132988/is-there-a-difference-between-and-is) – ti7 May 31 '22 at 20:27
  • @ti7 I understand the difference. Like you said, I am curious about how the object references are being handled in these cases. – Ben Smyth May 31 '22 at 20:44

2 Answers

`deepcopy` is redundant for immutable objects, because there's no practical way to tell a copy from the original. Yes, you can use `is` or `id()`, but those only tell you about object identity, not about the object's value.

A tuple is only deeply immutable if every element it contains is also immutable. Numbers and strings are immutable, so they make good tuple members. A list is never immutable.
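As a rough illustration of the point above (a sketch that assumes CPython's standard `copy` module; the variable names are made up for this example), `deepcopy` hands back the very same tuple when every element is immutable, but builds a new tuple when an element is mutable:

```python
from copy import deepcopy

# A tuple whose elements are all immutable: there is nothing to copy.
flat = (1, 2, 3)
flat_copy = deepcopy(flat)
same_flat = flat_copy is flat          # True in CPython

# A tuple holding lists: the lists must be copied, so the tuple is rebuilt.
nested = ([1], [2], [3])
nested_copy = deepcopy(nested)
same_nested = nested_copy is nested    # False

# Mutating the copy's inner list leaves the original untouched.
nested_copy[0].append(99)
print(same_flat, same_nested, nested[0], nested_copy[0])
```

The identity result for `flat` is an optimization rather than a documented guarantee, so code should not depend on it.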

A class may implement a `__deepcopy__` method, and if it does, that method will be called to make the copy. It may return the original object or a new object, depending on the properties of the class.
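A minimal sketch of that hook follows. The classes `Constant` and `Tally` are hypothetical and exist only for illustration: one returns itself from `__deepcopy__`, the other builds a fresh object with its own state.

```python
from copy import deepcopy


class Constant:
    """Hypothetical value object that is never mutated after creation."""

    def __init__(self, value):
        self.value = value

    def __deepcopy__(self, memo):
        # Returning self tells deepcopy to reuse the original object.
        return self


class Tally:
    """Hypothetical mutable object: each copy needs its own state."""

    def __init__(self, counts):
        self.counts = counts

    def __deepcopy__(self, memo):
        # Build a new object and deep-copy the mutable state it owns.
        return Tally(deepcopy(self.counts, memo))


pi = Constant(3.14159)
tally = Tally({"a": 1})
holder = [pi, tally]
holder_copy = deepcopy(holder)

print(holder_copy[0] is pi)      # the Constant is shared
print(holder_copy[1] is tally)   # the Tally is a new object
```

Passing `memo` through to the nested `deepcopy` call keeps shared or recursive references consistent within a single copy operation.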

Mark Ransom

As John Gordon pointed out in the comments, this has nothing to do with `deepcopy`. It looks like your Python implementation reuses the same object for equal tuples of literals, no matter where they appear. For example:

a = (1, 2, 3)
b = (1, 2, 3)
a is b  # True

This is an implementation detail that you cannot rely on. In CPython, the most common implementation, that line used to evaluate to False, and nobody guarantees that it won't return False again next year. I plugged the same code into an online interpreter based on the Skulpt implementation, and it returns False there too.

Just use `==` for comparison!
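A small sketch of why `==` is the safer check (the names here are illustrative, and the tuples are built at runtime so that they are not literal constants the compiler could merge):

```python
# Two equal tuples, each constructed separately at runtime.
a = tuple([1, 2, 3])
b = tuple([1, 2, 3])

equal = a == b        # value comparison: True on any implementation
identical = a is b    # identity comparison: an implementation detail

print(equal, identical)
```

Only the `==` result is something a program should depend on; the `is` result can differ between implementations and versions.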

Why does CPython store equal tuples in the same memory location? I can only speculate, but it's probably to conserve memory.

Joooeey