2

After reading in a few places, including here: Understanding dict.copy() - shallow or deep?

It claims that dict.copy will create a shallow copy, meaning the copy holds references to the same values. However, when playing with it myself in the Python 3 REPL, I seem to only get a copy by value?

a = {'one': 1, 'two': 2, 'three': 3}
b = a.copy()

print(a is b) # False
print(a == b) # True

a['one'] = 5
print(a) # {'one': 5, 'two': 2, 'three': 3}
print(b) # {'one': 1, 'two': 2, 'three': 3}

Does this mean that shallow and deep copies do not necessarily affect immutable values?

  • Try the same thing where the values of the dict are lists, and you append to them instead of reassigning the reference. You are actually seeing the difference between mutable and immutable values. – Patrick Haugh Jun 27 '18 at 18:37
  • That is a shallow copy. A deep copy would mean that if `a['one'] = [1,2,3]` and you changed `a['one'][0] = 5`, then `b['one']` would not be affected. – chepner Jun 27 '18 at 18:38
  • @Patrick Haugh Oh, I understand that part since you would effectively be storing objects in objects, but if the values are just immutable then is it essentially just a copy by value? – Stephen Gonzalez Jun 27 '18 at 18:38
  • @chepner but the example shows that that isn't the case in this situation, which is why I am curious whether this only applies to mutable values – Stephen Gonzalez Jun 27 '18 at 18:40
  • There's no difference between a shallow and a deep copy in this case. Only when the root object contains nested dicts, lists or other data structures that can be mutated (or contain mutable objects) is there a difference between deep and shallow. – Håken Lid Jun 27 '18 at 18:41
  • @StephenGonzalez No, it doesn't. In your example, you are making `a['one']` reference a completely different object. In my example, I am mutating the current object referenced by `a['one']`. – chepner Jun 27 '18 at 18:44
  • @HåkenLid ahh gotcha. Thanks for this! – Stephen Gonzalez Jun 27 '18 at 18:44
  • @chepner Your example was something else. It assumed I was using mutable types, and I understand what happens in that scenario. I was referring to immutable types, for which others confirmed that shallow copy and deep copy don't come into play – Stephen Gonzalez Jun 27 '18 at 18:46

4 Answers

5

Integers are immutable. The problem comes when the values are references to mutable objects; check this similar example:

import copy
a = {'one': [], 'two': 2, 'three': 3}
b = a.copy()
c = copy.deepcopy(a)
print(a is b) # False
print(a == b) # True

a['one'].append(5)
print(a) # {'one': [5], 'two': 2, 'three': 3}
print(b) # {'one': [5], 'two': 2, 'three': 3}
print(c) # {'one': [], 'two': 2, 'three': 3}

Here you have it live

Netwave
4

What you are observing has nothing to do with dictionaries at all. You are getting confused by the difference between binding and mutation.

Let's forget dictionaries at first, and demonstrate the issue with simple variables. Once we understand the fundamental point, we can then go back to the dictionary example.

a = 1
b = a
a = 2
print(b)  # prints 1
  • On the first line you create a binding between the name a and the object 1.
  • On the second line you create a binding between the name b and the value of the expression a ... which is the very same object 1 which was bound to the name a on the previous line.
  • On the third line you create a binding between the name a and the object 2, in the process forgetting that there ever was a binding between a and the 1.

It is vital to note that this last step cannot in any way affect b!

The situation is completely symmetric, so if line 3 were b = 2 this would have absolutely no effect on a.

Now, people often mistakenly claim that this is somehow a result of the immutability of integers. Integers are immutable in Python, but that is completely irrelevant. If we do something similar with some mutable objects, say lists, then we get equivalent results.

a = [1]
b = a
a = [2]
print(b) # prints [1]

Once again

  • a is bound to some object
  • b is bound to the same object
  • a is now rebound to some different object

This cannot affect b or the object to which it is bound [*] in any way! No attempt has been made anywhere to mutate any object, so mutability is completely irrelevant to this situation.

[*] actually, it does change the reference count of the object (at least in CPython) but that's not really an observable property of the object.

However, if, instead of rebinding a, we

  1. Use a to access the object to which it is bound
  2. Mutate that object

then we will affect b, because the object to which b is bound will be mutated:

a = [1]
b = a
a[0] = 2
print(b)  # prints [2]

In summary, you have to understand

  1. The difference between binding and mutation. The former affects a variable (or more generally a location) while the latter affects an object. Therein lies the key difference.

  2. Rebinding a name (or location in general) cannot affect the object to which that name was previously bound (beyond changing its reference count).
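
A minimal sketch of that distinction, using `is` to check whether two names are still bound to the same object. The names `c`, `d` and the two flag variables are only for illustration:

```python
# Rebinding: `a` is pointed at a brand-new list; `b` keeps the old one.
a = [1]
b = a
a = [2]
still_shared_after_rebinding = a is b   # False: two different objects now

# Mutation: the single list that both names refer to is changed in place.
c = [1]
d = c
c[0] = 2
still_shared_after_mutation = c is d    # True: still one object

print(b)  # [1]
print(d)  # [2]
```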

Now, in your example you create something that looks (conceptually) like this:

a ---> { 'three' ----------------------> 3
         'two'   -------------> 2        ^
         'one'   ---> 1 }       ^        |
                      ^         |        |
                      |         |        |
b ---> { 'one'   -----          |        |
         'two'   ---------------         |
         'three' -------------------------

and then a['one'] = 5 simply rebinds the location a['one'] so that it is no longer bound to the 1 but to 5. In other words, that arrow coming out of the first 'one', now points somewhere else.

It is important to remember that this has absolutely nothing to do with the immutability of integers. If you make each and every integer in your example mutable (for example by wrapping it in a list, i.e. replacing every occurrence of 1 with [1], and similarly for 2 and 3), then you will still observe essentially the same behaviour: a['one'] = [5] will not affect the value of b['one'].
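
As a rough check of that claim, here is the question's example with every integer wrapped in a list. The name `shared_before` is just illustrative:

```python
a = {'one': [1], 'two': [2], 'three': [3]}
b = a.copy()

# Right after the shallow copy, both dicts refer to the very same list.
shared_before = a['one'] is b['one']   # True

# Rebind the location a['one'] to a new list; nothing is mutated.
a['one'] = [5]

print(a['one'])  # [5]
print(b['one'])  # [1]
```

Even though the values are now mutable, `b` is untouched, because the assignment only redirected one arrow inside `a`.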

Now, in this latest example, where the values stored in your dictionary are lists and therefore structured, it becomes possible to distinguish between shallow and deep copy:

  • b = a will not copy the dictionary at all: it will simply make b a new binding to the same single dictionary
  • b = copy.copy(a) will create a new dictionary with internal bindings to the same lists. The dictionary is copied but its contents (below the top level) are simply referenced by the new dictionary.
  • b = copy.deepcopy(a) will also create a new dictionary, but it will also create new objects to populate that dictionary, rather than referencing the original ones.

Consequently, if you mutate (rather than rebind) something in the shallow copy case, the other dictionary will 'see' the mutation, because the two dictionaries share objects. This does not happen with the deep copy.
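
The three cases can be sketched side by side. The names `alias`, `shallow` and `deep` are illustrative and do not appear in the question:

```python
import copy

a = {'one': [1]}
alias = a                  # no copy at all: a second name for the same dict
shallow = copy.copy(a)     # new dict, same inner list
deep = copy.deepcopy(a)    # new dict, new inner list

a['one'].append(2)         # mutate the list that a['one'] refers to
a['two'] = [3]             # add a key to the original dict itself

print(alias)    # {'one': [1, 2], 'two': [3]}
print(shallow)  # {'one': [1, 2]}
print(deep)     # {'one': [1]}
```

The alias sees everything, the shallow copy sees only the mutation of the shared list, and the deep copy sees nothing.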

jacg
2

Consider the following situation; it should make referencing and the copy() method easier to understand.

    dic = {'data1': 100, 'data2': -54, 'data3': 247}
    dict1 = dic
    dict2 = dic.copy()
    print(dict2 is dic)
    # False
    print(dict1 is dic)
    # True

The first print statement prints False because dict2 and dic are two separate dictionaries occupying separate memory, even though they have the same contents. This is what happens when we use the copy method. Assigning dic to dict1, on the other hand, does not create a separate dictionary; dict1 is simply another reference to dic, so the second print statement prints True.
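
A short sketch, continuing with the same names, of what follows from that: a write through dict1 shows up in dic, while dict2 is unaffected.

```python
dic = {'data1': 100, 'data2': -54, 'data3': 247}
dict1 = dic          # second name for the same dictionary
dict2 = dic.copy()   # separate dictionary with the same contents

dict1['data1'] = 0   # write through the alias

print(dic['data1'])    # 0
print(dict2['data1'])  # 100
```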

1

A shallow copy of some container means that a new identical object is returned, but that its values are the same objects.

This means that mutating the values of the copy will mutate the values of the original. In your example, you are not mutating a value; you are instead rebinding a key to a new value.

Here is an example of value mutation.

d = {'a': []}

d_copy = d.copy()

print(d is d_copy) # False
print(d['a'] is d_copy['a']) # True

d['a'].append(1)
print(d_copy) # {'a': [1]}

On the other side, a deepcopy of a container returns a new identical object, but where the values have been recursively copied as well.
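
For comparison, a sketch of the same experiment with `copy.deepcopy` (the name `d_deep` is just for illustration):

```python
import copy

d = {'a': []}
d_deep = copy.deepcopy(d)

print(d is d_deep)            # False
print(d['a'] is d_deep['a'])  # False: the inner list was copied too

d['a'].append(1)
print(d)       # {'a': [1]}
print(d_deep)  # {'a': []}
```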

Olivier Melançon