
code A:

lst = [1, 2, 3]
for i in range(10):
    lst += ["42"]

code B:

lst = [1, 2, 3]
for i in range(10):
    lst = lst + ["42"]

I know the output is the same, but is there a difference in the way the two lists are built? What is actually happening behind the scenes?

Daniel Gagnon

1 Answer


When you do

lst += ["42"]

you are mutating lst in place: "42" is appended to the end of the existing list object. But when you write

lst = lst + ["42"]

you are creating a new list containing the items of lst followed by "42", and then rebinding the name lst to that new list. Try this program to see the difference.

lst = ["1"]
print(id(lst))
lst += ["2"]
print(id(lst))
lst = lst + ["3"]
print(id(lst))

The first two ids will be the same, but the last one will be different, because a new list is created and lst now points to that new list.
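The same effect is visible without id when a second name refers to the list. Here is a minimal sketch; the names original and alias are made up for illustration and are not part of the question's code:

```python
original = [1, 2, 3]
alias = original                # a second name bound to the same list object

original += ["42"]              # in place: the shared list grows
print(alias)                    # [1, 2, 3, '42']
print(alias is original)        # True

original = original + ["42"]    # builds a new list and rebinds `original` only
print(original)                 # [1, 2, 3, '42', '42']
print(alias)                    # [1, 2, 3, '42']
print(alias is original)        # False
```

After the second statement, alias still refers to the old list, which is why it does not see the last "42".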

Not knowing this difference causes problems when you pass a list to a function and try to append an item to it inside the function, like this:

def mutate(myList):
    # WRONG way to mutate: this only rebinds the local name myList
    myList = myList + ["2"]

tList = ["1"]
mutate(tList)
print(tList)

You will still get ['1'], because the assignment only rebinds the local name myList and leaves the caller's list untouched. If you really want to mutate the caller's list, do it like this:

def mutate(myList):
    # In-place mutation; myList.append("2") would work as well
    myList += ["2"]

tList = ["1"]
mutate(tList)
print(tList)

This will print ['1', '2'].
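For completeness, here is a sketch of the append and extend alternatives mentioned in the comment above, next to a version that deliberately returns a new list instead of mutating. The function names are hypothetical and chosen only for this illustration:

```python
def append_in_place(my_list):
    my_list.append("2")        # mutates the caller's list

def extend_in_place(my_list):
    my_list.extend(["3"])      # also in place, like my_list += ["3"]

def with_item_added(my_list):
    return my_list + ["4"]     # new list; the caller's list is untouched

t_list = ["1"]
append_in_place(t_list)
extend_in_place(t_list)
new_list = with_item_added(t_list)

print(t_list)                  # ['1', '2', '3']
print(new_list)                # ['1', '2', '3', '4']
```

If the caller should not see any change, returning a new list as in with_item_added makes that intent explicit.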

thefourtheye
  • Do you happen to know whether the Python language definition *guarantees* that the id will change when you do `lst = lst + ["3"]`, or if that's left to the implementation? I ask because in principle a reference-counting implementation of Python might notice that `lst` has a refcount of 1 at the start of that operation and is being assigned back to its only referrer, and therefore as an optimization want to re-use the object. Of course it can re-use the underlying memory used for the array, but I wonder whether or not the language permits it to re-use the `list` itself too. – Steve Jessop Nov 22 '13 at 09:44
  • @SteveJessop As far as I know, implementation of `id` is python implementation dependent. But, in CPython, I know for sure that `id` will change when we do that operation. Because, in CPython, `id` returns the memory address of the object. – thefourtheye Nov 22 '13 at 09:46
  • With the hypothetical optimization I describe above, the address of the new list would be the same as the old. So the reason the id changes on CPython isn't that CPython uses the address as the id, it's that it doesn't do the optimization. `id` must return different values for different objects, my question is whether a Python implementation is permitted to in effect give the new object the same address as the old object that is destroyed in the same statement (because its refcount hits 0). – Steve Jessop Nov 22 '13 at 10:32