1

I have a global list (of lists) variable. I send a shallow copy of the global list to another function as a parameter.

Surprisingly, the original list gets changed when I remove some elements from the parameter within the invoked function.

Can someone please tell me why it's happening and how to prevent this from happening?

Here is the simplified code example:

def saveCandidateRoutes(candidateRoutes):

        for route in candidateRoutes:
            if route:       # only process 'route' if it is not empty
                tweetId = route.pop(0)
                stanox = route.pop(-1)
                ....

def main():

        global allCandidatePaths

        copyOfAllCandidatePaths= list(allCandidatePaths) # making a deep copy 
        result = saveCandidateRoutes(copyOfAllCandidatePaths)
Newbie
  • 6
    Well, because as you said, it is a shallow copy. Why did you expect otherwise? Look into `copy.deepcopy`. – timgeb Jan 02 '16 at 23:31
  • Isn't a shallow copy like a value-type parameter? – Newbie Jan 02 '16 at 23:33
  • 2
    I don't understand that sentence. – timgeb Jan 02 '16 at 23:34
  • 2
    You didn't remove elements from `candidateRoutes`. You removed elements from lists *inside* `candidateRoutes` which is exactly what isn't copied in a shallow copy. – BrenBarn Jan 02 '16 at 23:40
  • 1
    @timgeb Is that what it was supposed to be? – KSFT Jan 02 '16 at 23:44
  • @BrenBarn, @KSFT Yes, that was the case. BrenBarn is right to point out the mistake. In fact, I wasn't making a shallow copy of the list of lists, `allCandidatePaths`. I was making a deep copy of it, as you can see from my code: `copyOfAllCandidatePaths= list(allCandidatePaths)`. But I was confused by the terms and thought I was making a shallow copy, as you can see from my comment in the original post (question). However, the problem was that I made a deep copy of the outer list only. Hence, the cloned list, `copyOfAllCandidatePaths`, had a shallow copy of the inner lists. – Newbie Jan 03 '16 at 10:55
  • (Continued from my prev. post...) As a result, when I removed elements of an inner list (`route`), the sub-elements of the original list (`allCandidatePaths`) also changed. It's noteworthy that if I remove any element (not sub-element) from the cloned list `copyOfAllCandidatePaths`, the elements of the original list remain unchanged. This is because `copyOfAllCandidatePaths` is a 'partial deep copy' of the original list. I called it a 'partial deep copy' because `copyOfAllCandidatePaths` consists of a deep copy of the outer list and a shallow copy of the inner lists. – Newbie Jan 03 '16 at 10:58
  • 1
    @Newbie: If you made a copy of the outer list only, that wasn't a deep copy, it was a shallow copy. You should read up on what the terms "shallow copy" and "deep copy" mean. – BrenBarn Jan 03 '16 at 18:36
  • @BrenBarn, Thank you for your useful feedback and, of course, for pointing out the problem first. – Newbie Jan 03 '16 at 20:36

4 Answers

2

Python passes references to objects. `list(allCandidatePaths)` creates a new outer list, but its elements still refer to the same inner lists as `allCandidatePaths`, so calling `route.pop()` inside `saveCandidateRoutes` changes the inner lists seen through both.

To prevent this, copy the inner lists as well. For example, at the beginning of your function `saveCandidateRoutes`, add the instruction `candidateRoutes = [list(route) for route in candidateRoutes]`. This creates a new outer list that holds a new copy of each inner list and assigns its reference to `candidateRoutes`.

So when you use `candidateRoutes`, you will not be working on the lists that belong to `allCandidatePaths`, but on separate copies of them.
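As a minimal sketch of copying the inner lists before popping from them: the function name follows the question, but the sample data and the returned tuples are made up for illustration, and the elided part of the original loop is not reproduced.

```python
def saveCandidateRoutes(candidateRoutes):
    # Copy every inner list so that pop() cannot touch the caller's lists.
    candidateRoutes = [list(route) for route in candidateRoutes]
    saved = []
    for route in candidateRoutes:
        if route:  # skip empty routes
            tweetId = route.pop(0)
            stanox = route.pop(-1)
            saved.append((tweetId, stanox, route))  # hypothetical result format
    return saved


# Hypothetical sample data: each route is [tweet id, ..., stanox].
allCandidatePaths = [[101, "a", "b", 5001], [102, "c", 5002], []]

# list() alone copies only the outer list; the inner lists are shared.
outerCopy = list(allCandidatePaths)
print(outerCopy[0] is allCandidatePaths[0])  # True

result = saveCandidateRoutes(allCandidatePaths)
print(allCandidatePaths)  # unchanged
print(result)
```

With this sample data the original `allCandidatePaths` keeps all of its elements, because the `pop()` calls only ever see the per-route copies.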

Martin Serrano
Sidahmed
2

I think you need a quick reminder on shallow and deep copies, and how to make a deep copy.

>>> a = [[1,2], [3,4]] # a list of mutable elements
>>> b = a[:]
>>> c = list(a)

Both b and c are shallow copies of a. You can check that a, b and c are different objects because they do not share the same id.

>>> id(a)
140714873892592
>>> id(b)
140714810215672
>>> id(c)
140714873954744

However, each element of a, b and c is still a reference to the lists [1,2] and [3,4] we created when defining a. That becomes clear when we mutate an item inside the lists:

>>> c[1][1] = 42
>>> a
[[1, 2], [3, 42]]
>>> b
[[1, 2], [3, 42]]
>>> c
[[1, 2], [3, 42]]

As you can see, the second element of the second list changed in a, b and c. Now, to make a deep copy of a, you have several options. For a list of flat lists like this one, one option is a list comprehension where you copy each of the sublists:

>>> d = [sublist[:] for sublist in a]
>>> d
[[1, 2], [3, 42]]
>>> d[1][1] = 23
>>> d
[[1, 2], [3, 23]]
>>> a
[[1, 2], [3, 42]]

As you can see, the 42 in a did not change to 23, because the second lists in a and d are different objects:

>>> id(a[1])
140714873800968
>>> id(d[1])
140714810230904

Another way to create a deep copy is with copy.deepcopy:

>>> from copy import deepcopy
>>> e = deepcopy(a)
>>> e
[[1, 2], [3, 42]]
>>> e[1][1] = 777
>>> e
[[1, 2], [3, 777]]
>>> a
[[1, 2], [3, 42]]
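The two approaches differ once the nesting goes deeper than two levels. The sketch below uses made-up three-level data to show that the list comprehension copies only one level below the outer list, while `deepcopy` copies every level.

```python
from copy import deepcopy

# Made-up data with three levels of lists.
nested = [[[1, 2], [3]], [[4]]]

one_level = [sublist[:] for sublist in nested]  # copies the outer list and each sublist
full = deepcopy(nested)                         # recursively copies every level

nested[0][0].append(99)  # mutate one of the innermost lists

print(one_level[0][0])  # [1, 2, 99]: the innermost lists are still shared
print(full[0][0])       # [1, 2]: fully independent of nested
```

For a list of flat lists, as in the question, both approaches give an independent copy.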
timgeb
  • Thanks for your help. Could you modify your answer to make the following point too? Then I will mark your post as an answer. As @BrenBarn pointed out, the problem was that I made a deep copy of the outer list only. Hence, the cloned list, `copyOfAllCandidatePaths`, was having a shallow copy of the inner lists. As a result, when I removed elements of an inner list (`route`) , the sub-elements of the original list (`allCandidatePaths`) also changed. – Newbie Jan 03 '16 at 11:16
  • @Newbie no problem. I don't feel comfortable editing BrenBarn's answer into my answer because originally I did not make his observation, I just wanted to give you a general overview/reminder. I think your comment on my answer should be enough. – timgeb Jan 03 '16 at 11:25
1

I think you are confused about the terms.

A shallow copy is a copy in which the elements of the new list still refer to the same objects in memory as the elements of the original list.

What you are looking for is `deepcopy`. Here is a good source to find out.

And also: Wikipedia
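A small sketch of the difference, using the standard `copy` module and the `is` operator to check whether two names refer to the same object (the variable names are only illustrative):

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)   # new outer list, same inner lists
deep = copy.deepcopy(original)  # new outer list and new inner lists

print(shallow is original)        # False: the outer lists are distinct
print(shallow[0] is original[0])  # True: the inner lists are shared
print(deep[0] is original[0])     # False: the inner lists were copied too
print(deep == original)           # True: the contents are still equal
```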

Rockybilly
0

A shallow copy copies the object but not any of its attributes. For a list, that means that its elements are assigned to the elements of the new list. If the elements are ints, you get a completely new list, because int is a primitive type*, so it doesn't need to be copied. If the elements are lists, as in your two-dimensional list, they get assigned to the copy, so you have two references to each element, one in each list. If you want to copy the elements of the inner lists, you'll need a deep copy, which recursively copies the attributes (elements, in this case) of each object.

*There isn't actually a distinction between types that are primitive and those that aren't, but this is equivalent to the way it works in the scope of this explanation.
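One way to see this with a shallow copy is to compare assigning to a slot of the copy with mutating an object that both lists share. This is only a sketch with made-up variable names:

```python
numbers = [1, 2, 3]
numbers_copy = list(numbers)
numbers_copy[0] = 100   # rebinds slot 0 of the copy only
print(numbers)          # [1, 2, 3]

grid = [[1, 2], [3, 4]]
grid_copy = list(grid)
grid_copy[0] = [9, 9]   # rebinding a slot of the copy: grid is unaffected
grid_copy[1].append(5)  # mutating a shared inner list: visible through grid too
print(grid)             # [[1, 2], [3, 4, 5]]
print(grid_copy)        # [[9, 9], [3, 4, 5]]
```

Assigning to a slot behaves the same way for ints and for lists. The original only changes when a shared object is mutated in place, which is possible for lists but not for ints.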

KSFT
  • 1
    There is no difference between "primitive types" or non-primitive types with respect to copying. – DSM Jan 02 '16 at 23:50
  • @DSM: Maybe there isn't internally, but `a=1;b=a;a=2;print b` has different output than `a=[1];b=a;a[0]=2;print b[0]`, so I think it's an okay way of explaining it, but I have added a note. – KSFT Jan 03 '16 at 00:43
  • 1
    That's not because of any difference between primitive and non-primitive types; it's because `a = 2` and `a[0] = 2` are different kinds of statement: the first simply means "the name `a` now points to the integer object 2", and the second "set the 0th element of the object named by `a` to 2". That the object named by `a` is also named by `b` doesn't matter. – DSM Jan 03 '16 at 02:31