If I have a list x = [1, 2, 3] and pass it as f(x[:]) to a function f that uses the += operator, a shallow copy is made and the contents of the original list are unchanged:

def f(x):
    print("f, x = ", x, ", id(x) = ", id(x))
    x += [1]  # extends the list that was passed in, in place
    print("f, x = ", x, ", id(x) = ", id(x))

x = [1,2,3]
print("x = ", x, ", id(x) = ", id(x))
f(x[:])  # x[:] builds a new list (a shallow copy)
print("x = ", x, ", id(x) = ", id(x))

Output:

x =  [1, 2, 3] , id(x) =  139701418688384
f, x =  [1, 2, 3] , id(x) =  139701418790136
f, x =  [1, 2, 3, 1] , id(x) =  139701418790136
x =  [1, 2, 3] , id(x) =  139701418688384

However, I expected the same behavior for an ndarray x = np.array([1, 2, 3]) and was surprised that the contents changed, even though a copy was apparently made (the id differs):

import numpy as np

def f(x):
    print("f, x = ", x, ", id(x) = ", id(x))
    x += [1]  # adds 1 to every element, in place
    print("f, x = ", x, ", id(x) = ", id(x))

x = np.array([1,2,3])
print("x = ", x, ", id(x) = ", id(x))
f(x[:])  # x[:] is a new ndarray object
print("x = ", x, ", id(x) = ", id(x))

Output:

x =  [1 2 3] , id(x) =  139701418284416
f, x =  [1 2 3] , id(x) =  139701418325856
f, x =  [2 3 4] , id(x) =  139701418325856
x =  [2 3 4] , id(x) =  139701418284416

(I know that += [1] acts differently for an ndarray than for a list.) How can I pass an ndarray the way I pass the list and avoid this behavior?

Bonus question: Why is the problem resolved by using x = x + [1] in the function f?

bcf

1 Answer

If you want a copy, you can use the copy method of the NumPy array:

f(x.copy())

Note that even though the ids of x and x[:] differ, x[:] is a view that shares memory with x, so changes to one will propagate to the other and vice versa:

x = np.array([1,2,3])
y = x[:]
np.may_share_memory(x, y)   # True

z = x.copy()
np.may_share_memory(x, z)   # False
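
To tie this back to the question, here is a minimal sketch of what happens when an in-place += is applied to a view versus a copy. The helper name add_one_in_place is made up for illustration and is not part of NumPy:

```python
import numpy as np

def add_one_in_place(arr):
    # Hypothetical helper: += writes into whatever buffer arr uses.
    arr += [1]
    return arr

x = np.array([1, 2, 3])

view = x[:]                      # a new ndarray object on the same buffer
view_is_backed_by_x = view.base is x

add_one_in_place(x.copy())       # operates on an independent buffer
after_copy_call = x.tolist()     # [1, 2, 3]: x is untouched

add_one_in_place(x[:])           # operates on x's own buffer via a view
after_view_call = x.tolist()     # [2, 3, 4]: x has changed
```

The base attribute of a view points at the array that owns the data, which is a quick way to check whether you are holding a view or an independent array.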

However, you normally don't pass copies to a function. Instead, you create the copy inside the function:

def give_me_a_list(lst):
    lst = list(lst)  # makes a shallow copy
    # ...


def give_me_an_array(arr):
    arr = np.array(arr)  # makes a copy unless you pass in "copy=False"
    # ...
MSeifert
  • Thanks. The introspection from IPython says `id` is the memory address of the variable, though... am I missing something? Also, I added a bonus question. Could you help with that? – bcf Feb 27 '17 at 23:03
  • @bcf For the bonus question you might want to have a look at my answer to this other question: [What is the difference between i = i + 1 and i += 1](http://stackoverflow.com/a/41446882/5393381). – MSeifert Feb 27 '17 at 23:05
  • @bcf The `id` is the memory address of the instance. But the actual data of a NumPy array is stored as an attribute (it's a bit more complicated because ndarray is implemented as a C class). What you've got is something like a view, which is basically another instance that shares some or all of its memory with the original. – MSeifert Feb 27 '17 at 23:07
  • @bcf well, it's complicated like this because `numpy` essentially wraps C arrays and makes them object-oriented. – juanpa.arrivillaga Feb 27 '17 at 23:21
  • @bcf Well, Python is just not comparable to C++ (or Java, ...) and then NumPy arrays are just not comparable to lists. As for the `x = x + 1` and `x += 1` that's similar in almost all programming languages. – MSeifert Feb 27 '17 at 23:24
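
For the bonus question, a minimal sketch of the difference may help. The function names inplace and rebind are made up for illustration. With +=, the array is asked to update itself in place, so a view writes through to the caller's data. With x = x + [1], a new array is built and only the local name is rebound, so the caller's array is never touched:

```python
import numpy as np

def inplace(arr):
    # Hypothetical example function: updates arr's buffer in place.
    arr += [1]
    return arr

def rebind(arr):
    # Hypothetical example function: arr + [1] allocates a new array,
    # and the assignment only rebinds the local name arr.
    arr = arr + [1]
    return arr

x = np.array([1, 2, 3])

r = rebind(x[:])
after_rebind = x.tolist()            # [1, 2, 3]: x is unchanged
r_shares = np.shares_memory(r, x)    # False: r is a fresh array

i = inplace(x[:])
after_inplace = x.tolist()           # [2, 3, 4]: x changed through the view
i_shares = np.shares_memory(i, x)    # True: i is still a view of x
```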