2

I was using lists in a program, but I don't understand this following behavior. I have begun to understand mutability and how it affects variable assignment, but I do not see the problem here:

class Test:
    def __init__(self, list_n):
        list_a = list_n[:]
        list_b = list_n[:]
        print(list_a is list_b) # Prints False
        print(list_a is list_n) # Prints False
        print(list_b is list_n) # Prints False

        list_a[0][0] = 1
        print(list_a) # Both of these print [[1,0,0][0,0,0][0,0,0]]
        print(list_b)

def main():
    list_n = [[0,0,0],[0,0,0],[0,0,0]]
    test = Test(list_n)      

if __name__ == '__main__': main()

Both list_a and list_b seem to still point to the same list, even though I thought I took the necessary measures to prevent that from happening.

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437

1 Answers1

11

Why didn't using slice notation to copy my list work?

The reason your example did not work is because you only made a shallow copy of list_n. A shallow copy of a list only copies the "top-level" list. Shallow copying a list does not copy sublists. Shallow copies of lists are made by calling copy() on the list(list.copy()) or using slice notation(list[:]).

At the C level, when a shallow copy is done on elements which are lists, pointers to the lists - called PyObjects - are being copied from one list to another. The actual pointer for each sublist however, is not copied, and thus list_a and list_b both contain pointers to the exact same sublists.

To put it plainly, you never made a copy of each sublist in list_n, and thus list_a and list_b still both contain pointers to the same sublists. This can be fixed by creating a "deepcopy" of list_n - a copy of each sublist in the original list regardless of the nesting level - using copy.deepcopy():

>>> from copy import deepcopy
>>> 
>>> list_n = [[0,0,0],[0,0,0],[0,0,0]]
>>> list_a = deepcopy(list_n)
>>> list_b = deepcopy(list_n)
>>> 
>>> list_a[0][0] = 1
>>> list_a
[[1, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> list_b
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> 

When should I use deepcopy()?

One of the biggest drawbacks to using deepcopy() is that it takes a substantial amount of time to "deepcopy" shallowly nested lists.

If your list is only shallowly nested(two to three levels deep), one should simply use a nested list comprehension instead of deepcopy(). Using this method would be substantially more efficient(thanks to as @jonrsharpe for pointing this out):

>>> list_n = [[0,0,0],[0,0,0],[0,0,0]]
>>> list_a = [x[:] for x in list_n]
>>> list_b = [x[:] for x in list_n]

The efficency gained by using this method can be observed using the timeit module in the standard library:

>>> import timeit
>>> 
>>> setup="from copy import deepcopy; list_n = [[0,0,0],[0,0,0],[0,0,0]]"
>>> timeit.timeit(setup=setup, stmt="deepcopy(list_n)")
24.223977088928223
>>> timeit.timeit(setup=setup, stmt="[x[:] for x in list_n]")
1.2281990051269531
>>>  

However, if your list is any deeper, one should opt to use deepcopy() instead, even if it seems somewhat bulky. One usually never needs to sacrifice readability over efficiency. Furthermore, as the list comprehension becomes increasingly complex, the efficiency over deepcopy() begins to grow smaller.

Why is deepcopy() so slow?

The reason deepcopy() is so much slower than most other methods(thanks to @Felix for asking), is because deepcopy() does much more work than a simple list comprehension. unlike the list comprehension, deecopy() must work on arbitrarily nested list, possible with many levels of nesting. Thus, it is extreme overkill to use it on shallowly nested list, and will result in much slower execution time.

To get a better idea of what deepcopy() does behind the scenes you can take a look at the source code for the function, as it is open source and available to the public for viewing.

user2357112
  • 260,549
  • 28
  • 431
  • 505
Christian Dean
  • 22,138
  • 7
  • 54
  • 87
  • 3
    As it's only a single layer deep and homogeneous, I suspect `[l[:] for l in list_n]` would be more efficient than `deepcopy`. – jonrsharpe Jan 13 '17 at 23:25
  • 2
    Nested list comps would get ugly fast for deeper structures, it's true! It seems two orders of magnitude faster, though, so I thought I'd mention it. – jonrsharpe Jan 13 '17 at 23:29
  • 3
    @jonrsharpe I don't mind at all! In fact, you were completely right. Using the `timeit` module, timing `deepcopy()` took around 30.5 seconds, while your method took only about 1 second! Thanks for the update. – Christian Dean Jan 13 '17 at 23:38
  • 1
    Why is `deepcopy` so slow? It doesn't actually do more work than the list comprehension, does it? – Felix Dombek Jan 14 '17 at 05:02
  • @FelixDombek Thanks for asking Felix! I updated my answer to answer your question. The edit also included a link to the source for `deepcopy()` in case you wanted to take a look for yourself. – Christian Dean Jan 14 '17 at 05:47
  • 1
    If you pay for sth you don't use, it's badly designed. They should have implemented this in C. At this speed, cPickle/json are even faster than deepcopy ... – Felix Dombek Jan 14 '17 at 12:56