27

The official Python docs say that using the slicing operator and assigning in Python makes a shallow copy of the sliced list.

But when I write code for example:

o = [1, 2, 4, 5]
p = o[:]

And when I write:

id(o)
id(p)

I get different id's and also appending one one list does not reflect in the other list. Isn't it creating a deep copy or is there somewhere I am going wrong?

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
user2528042
  • 419
  • 2
  • 6
  • 12

2 Answers2

41

You are creating a shallow copy, because nested values are not copied, merely referenced. A deep copy would create copies of the values referenced by the list too.

Demo:

>>> lst = [{}]
>>> lst_copy = lst[:]
>>> lst_copy[0]['foo'] = 'bar'
>>> lst_copy.append(42)
>>> lst
[{'foo': 'bar'}]
>>> id(lst) == id(lst_copy)
False
>>> id(lst[0]) == id(lst_copy[0])
True

Here the nested dictionary is not copied; it is merely referenced by both lists. The new element 42 is not shared.

Remember that everything in Python is an object, and names and list elements are merely references to those objects. A copy of a list creates a new outer list, but the new list merely receives references to the exact same objects.

A proper deep copy creates new copies of each and every object contained in the list, recursively:

>>> from copy import deepcopy
>>> lst_deepcopy = deepcopy(lst)
>>> id(lst_deepcopy[0]) == id(lst[0])
False
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Then why is it I am getting different ids – user2528042 Sep 28 '13 at 15:51
  • 3
    @user2528042: Because the *outer list objects* are not the same. – Martijn Pieters Sep 28 '13 at 15:51
  • @user2528042 Because the original list *is copied* to a new object. Just all elements within are not copied, so if the list contains a mutable object (ints are not mutable) changing that object will change it in both the original and the copied list because both have a copy of the reference to the same object. – poke Sep 28 '13 at 15:53
  • 4
    @user2528042: that it's a shallow copy doesn't mean that `id(o) == id(p)`, it means that `[id(x) for x in o] == [id(x) for x in p]`. See the difference? – DSM Sep 28 '13 at 15:54
  • Is the effective difference only important for nested types and in-place list modifying methods like `append()` ? i.e. Instead of your example `lst_copy[0]['foo'] = 'bar'` if your type is `List[int]` and you do `>>>lst_copy[0] = 42 ... >>> list` you don't see 42 in the original list. – Davos Dec 11 '19 at 15:28
  • 1
    @Davos: `lst_copy[0]['foo'] = 'bar'` doesn't alter `list_copy`, it only alters the dictionary object referenced by `lst_copy[0]`. In Python [everything is basically a reference](https://nedbatchelder.com/text/names.html), including list indices. A shallow copy of a list creates copies of the references, and does not follow the references to make copies of the referenced objects. – Martijn Pieters Dec 12 '19 at 18:12
7

You should know that tests using is or id can be misleading of whether a true copy is being made with immutable and interned objects such as strings, integers and tuples that contain immutables.

Consider an easily understood example of interned strings:

>>> l1=['one']
>>> l2=['one']
>>> l1 is l2
False
>>> l1[0] is l2[0]
True

Now make a shallow copy of l1 and test the immutable string:

>>> l3=l1[:]
>>> l3 is l1
False
>>> l3[0] is l1[0]
True

Now make a copy of the string contained by l1[0]:

>>> s1=l1[0][:]
>>> s1
'one'
>>> s1 is l1[0] is l2[0] is l3[0]
True               # they are all the same object

Try a deepcopy where every element should be copied:

>>> from copy import deepcopy
>>> l4=deepcopy(l1)
>>> l4[0] is l1[0]
True

In each case, the string 'one' is being interned into Python's internal cache of immutable strings and is will show that they are the same (they have the same id). It is implementation and version dependent of what gets interned and when it does, so you cannot depend on it. It can be a substantial memory and performance enhancement.

You can force an example that does not get interned instantly:

>>> s2=''.join(c for c in 'one')
>>> s2==l1[0]
True
>>> s2 is l1[0]
False

And then you can use the Python intern function to cause that string to refer to the cached object if found:

>>> l1[0] is s2
False
>>> s2=intern(s2)
>>> l1[0] is s2
True

Same applies to tuples of immutables:

>>> t1=('one','two')
>>> t2=t1[:]
>>> t1 is t2
True
>>> t3=deepcopy(t1)
>>> t3 is t2 is t1
True

And mutable lists of immutables (like integers) can have the list members interred:

>>> li1=[1,2,3]
>>> li2=deepcopy(li1)
>>> li2 == li1
True
>>> li2 is li1
False
>>> li1[0] is li2[0]
True

So you may use python operations that you KNOW will copy something but the end result is another reference to an interned immutable object. The is test is only a dispositive test of a copy being made IF the items are mutable.

dawg
  • 98,345
  • 23
  • 131
  • 206
  • On Python 3, `intern` has been moved to the `sys` module, so you need to do `import sys; s=sys.intern(s)` – dawg Sep 28 '13 at 20:20