In numpy, q1 = p[:] instead of q1 = p, yet p is modified when I modify q1?

Question

I am baffled by how copying a NumPy array works in Python. I start with the following:

import numpy as np
p = np.array([1.0, 0.0, 1.0, 0.3])

Then I try to make "copies" of p using the following three methods:

q = p
q1 = p[:]
q2 = p.copy()

Now I execute q1[2] = 0.2, and then check the values of q, q1, and q2. I was surprised to find that p, q, and q1 all changed to array([1.0, 0.0, 0.2, 0.3]), while only q2 remains invariant. I have also used id() to check the address of all four variables (p, q, q1, q2), and have confirmed that id(p) = id(q), but id(q1) != id(p).

My question is: if id(q1) != id(p), how can a modification of q1 alter p and q? Thanks!

@byxor Thanks for your quick reply! Then what does this ID refer to? I thought id(x) checks the memory location of x, no? — Xiao, Mar 26 '20 at 13:27
`[...][:]` makes a (shallow) copy of the list because that's how `list.__getitem__` is defined. `np.array.__getitem__` is defined differently. — chepner, Mar 26 '20 at 13:44
@Xiao Correct. Although there's nothing to stop 2 objects with separate memory addresses from modifying 1 piece of shared memory. E.g. if 2 instances of a class contain a reference to a list, the instances will have separate IDs but they will both manipulate the same underlying memory (of the list). — byxor, Mar 26 '20 at 13:50
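
The difference described in these comments can be seen side by side. The following is a minimal sketch, with variable names (lst, arr and their slices) chosen only for illustration: slicing a list produces an independent list, while slicing an ndarray produces a second Python object that reads and writes the same memory.

```python
import numpy as np

# A list slice is a new list (shallow copy), so writes do not reach the original.
lst = [1.0, 0.0, 1.0, 0.3]
lst_slice = lst[:]
lst_slice[2] = 0.2
print(lst[2])   # 1.0

# An ndarray slice is a new ndarray object over the same data buffer.
arr = np.array([1.0, 0.0, 1.0, 0.3])
arr_slice = arr[:]
arr_slice[2] = 0.2
print(arr[2])   # 0.2

# Two distinct Python objects (different id), one shared block of memory.
print(id(arr_slice) != id(arr))           # True
print(np.shares_memory(arr, arr_slice))   # True
```

So id() only says whether two names refer to the same Python object; it says nothing about whether two objects share the data they wrap.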

score 8 · Answer 1 · answered Mar 26 '20 at 13:27


The documentation of Numpy states:

All arrays generated by basic slicing are always views of the original array.

Therefore q1 in your case is a view of p: both share the same data, so a change made through q1 shows up in p (and in q), and vice versa.
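
One way to check this, sketched here with the arrays from the question, is the ndarray.base attribute: for a view it refers to the array that owns the data, and for an array that owns its own data it is None.

```python
import numpy as np

p = np.array([1.0, 0.0, 1.0, 0.3])
q1 = p[:]       # basic slice: a view of p
q2 = p.copy()   # independent copy with its own buffer

print(q1 is p)          # False: distinct objects, hence different id()
print(q1.base is p)     # True: q1 borrows its data from p
print(q2.base is None)  # True: q2 owns its data

q1[2] = 0.2
print(p[2], q2[2])      # 0.2 1.0
```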

answered Mar 26 '20 at 13:27

Jacques Gaudin


score 5 · Answer 2 · answered Mar 26 '20 at 13:27


Because you are using a basic slicing operation, numpy returns a view that shares memory with the original array; in this case the slice covers the entire array. p and q1 are different Python objects, which is why their id() values differ, but the underlying data buffer is the same. q1 is just a view into the same array that p is referencing.

You can check this using np.shares_memory.

import numpy as np
p = np.array([1.0, 0.0, 1.0, 0.3])

q1 = p[:]

np.shares_memory(p, q1)
# returns:
True

This is even true when the slice is not of the entire array. Such as:

p = np.array([1.0, 0.0, 1.0, 0.3])

q2 = p[1::2]
print(q2)
#prints:
[0.  0.3]

# setting a value of q2 changes p
q2[0] = 10.0
p
# returns:
array([ 1. , 10. ,  1. ,  0.3])

answered Mar 26 '20 at 13:27

James


Thanks for your reply! So in order to create a truly independent copy of p, I should use p.copy() then? – Xiao Mar 26 '20 at 13:33
Yes. That is the best method. – James Mar 26 '20 at 13:35
Thanks for your reply! A follow-up question: In terms of making an independent copy of p (so that changes to q2 will not affect p), is q2 = p.copy() enough? Do I ever need q2 = p.deepcopy() to make sure q2 is independent of p? – Xiao Mar 26 '20 at 13:41
Copy should be enough unless it is a numpy array of nested python objects. – James Mar 27 '20 at 14:16
Thanks for your reply! It's very helpful for me:) – Xiao Mar 27 '20 at 16:20
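
To illustrate the copy versus deep copy point from the comments above, here is a small sketch using a hypothetical object-dtype array named a. Note that ndarray has no deepcopy() method; the standard library's copy.deepcopy is used instead.

```python
import copy
import numpy as np

# An object array stores references to Python objects, here a single list.
a = np.empty(1, dtype=object)
a[0] = [1, 2]

b = a.copy()            # copies the references, not the list they point to
c = copy.deepcopy(a)    # also copies the nested list

a[0].append(3)
print(b[0])   # [1, 2, 3]  b still shares the inner list with a
print(c[0])   # [1, 2]     c has its own inner list
```

For arrays of plain numeric dtype, such as p in the question, there are no nested objects, so p.copy() is already fully independent.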
