a=b seems to be the identity, not a copy. Why?

Question

Hello Python/iPython users.

I have found a weird behavior of Python when using NumPy arrays. I found a solution to the problem myself, but I'd love to get an explanation. Thanks in advance.

Here's the problem: Using IPython, I create a NumPy array a and a copy of a, called b:

import numpy as np
a=np.zeros(5)
b=a

However, b seems to be identical to a rather than a copy of it, since changing b changes a as well.

b[0]=1
a
array([ 1.,  0.,  0.,  0.,  0.])

The solution is to use b=a.copy() rather than b=a, but I'd like to understand why this is the case in Python. I'm quite familiar with Matlab, R and Fortran and never ran into a problem like this before. Why would someone want to have a second name for the same data instead of a copy of this vector? Is this just some Python-specific syntax thing, or is there more to understand?

If you want to really understand what is happening here, read [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html) by Ned Batchelder. — Tim Pietzcker, Jan 09 '14 at 12:22

score 4 · Answer 1 · answered Jan 09 '14 at 12:17

It's simply a convention of Python. An assignment never does anything but create a new handle (a name) to an existing object; it never copies the object. That's a pretty sensible rule, because it keeps the semantics simple and transparent; in other languages, you may often be left wondering whether you are modifying an existing object or creating a handle to a new one. If you want to do something other than slapping a new name on an existing object, such as making a copy, Python forces you to make that explicit, as with a.copy(). And as to why you would want to do that: try and find any piece of Python code, and see how many assignment statements it contains. Apparently there is a use for it ;).
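
To make the rule concrete, here is a minimal sketch built on the arrays from the question. The extra names (c, same_object, copy_shares_memory) are just illustrative and not from the original post, and the np.shares_memory check assumes a reasonably recent NumPy.

```python
import numpy as np

a = np.zeros(5)

b = a          # binds a second name to the same array object; nothing is copied
c = a.copy()   # explicitly allocates a new array with the same contents

b[0] = 1       # in-place change to the one array that both a and b refer to

same_object = b is a                          # one object, two names
copy_shares_memory = np.shares_memory(a, c)   # the copy has its own buffer

print(a)       # the change made through b is visible here
print(c)       # the copy still holds the original zeros

# Plain assignment to b rebinds the name b; it does not touch the array a.
b = np.ones(5)
still_same_object = b is a
```

After the last line, b refers to a brand new array and a is left as it was, which is the same rule again: assignment only ever changes what a name points to, while indexing assignment such as b[0] = 1 modifies the object itself.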
