Numpy array indexing: view or copy - depends on scope?

Question

Consider the following array manipulations:

import numpy as np
def f(x):
    x += 1
x = np.zeros(1)
f(x)       # changes `x`
f(x[0])    # doesn't change `x`
x[0] += 1  # changes `x`

Why does x[0] behave differently depending on whether += 1 happens inside or outside the function f?

Can I pass a part of the array to the function, such that the function modifies the original array?

Edit: If we considered = instead of +=, we would probably maintain the core of the question while getting rid of some irrelevant complexity.

Because when you pass x[0], you are just passing that value (as an object), not the whole x object — eshirvana, Dec 21 '21 at 21:37
@eshirvana If the *passing* makes a copy, why does `f(x)` modify `x`? If the *indexing* makes a copy, why does `x[0] += 1` modify the original? If *passing* makes a copy only if there was *indexing*, how does passing know whether there was indexing or not, and why was this dependency implemented? — root, Dec 21 '21 at 22:01
`x` is mutable, `x[0]` is not. It means `__iadd__` can modify x but not `x[0]`. This is not really about indexing. If `x[0]` were mutable and had an `__iadd__` method defined to modify it in place, that would change too. Try it with a two dimensional array and you will see that passing `x[0]` will change its first row. — ayhan, Dec 21 '21 at 22:15
@ayhan If `x[0]` is not mutable and `__iadd__` (same as `+=`?) cannot modify it, then why does `x[0]+=1` modify it? Also, why does Python consider `x[0]` not mutable even though it is possible to modify it directly (by modifying the first entry of `x`)? — root, Dec 21 '21 at 22:19
Yes, `__iadd__` is `+=`. When you do `x[0] = 1` or `x[0] = x[0] + 1` or `x[0] += 1`, and x[0] is an integer, you are not really modifying the integer there but instead pointing it to a different integer. Ned Batchelder has a great [article](https://nedbatchelder.com/text/names1.html) about this. — ayhan, Dec 21 '21 at 22:27
@root *passing never makes a copy*. When you *index into a numpy array*, an *entirely new object* is created. Check what happens if you do `x[0] is x[0]`. If `x[0]` is a scalar value, modifying it won't modify the original at all — juanpa.arrivillaga, Dec 21 '21 at 23:57
@juanpa.arrivillaga Why does `x[0] += 1` change the array, even though it accesses the array `x` only by indexing, and you write "When you *index into a numpy array*, a *entirely new object* is created"? Probably because there are two different kinds of indexing: `__getitem__` and `__setitem__` (the latter is when the indexing happens on the left of an equals sign, so to speak), as per [this answer](https://stackoverflow.com/a/70444042/5231110). — root, Dec 25 '21 at 23:34
@root yes, that's it. `__getitem__` returns a new object, `__setitem__` mutates the underlying buffer... You have to understand these are basically just sugar for calls to methods. They can, in principle, do *anything* and they don't have to be consistent with each other — juanpa.arrivillaga, Dec 25 '21 at 23:38
@juanpa.arrivillaga The fact that they don't have to be consistent with each other is very interesting and not necessarily expected/intuitive. Are there articles with nice examples? — root, Dec 26 '21 at 00:02
@root just *implement a class* with both and see for yourself. — juanpa.arrivillaga, Dec 26 '21 at 00:28
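
The two-dimensional case suggested in the comments above can be tried with a short sketch like the following. The array names `x1` and `x2` are made up for illustration; `f` is the function from the question.

```python
import numpy as np

def f(x):
    x += 1  # in-place add on whatever object was passed in

# 1-D array: x1[0] is an immutable scalar, so f cannot reach the array.
x1 = np.zeros(1)
f(x1[0])

# 2-D array: x2[0] is a 1-D ndarray that is a view of the first row,
# so the in-place add inside f writes into x2's memory.
x2 = np.zeros((2, 3))
f(x2[0])

print(x1)  # unchanged
print(x2)  # first row incremented
```

The only difference between the two calls is what `__getitem__` hands back: a scalar in the first case, a mutable view in the second.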

hpaulj · Answer 1 · 2021-12-25T23:38:35.620

You don't even need the function call to see this difference.

x is an array:

In [138]: type(x)
Out[138]: numpy.ndarray

Indexing an element of the array returns a np.float64 object. It in effect "takes" the value out of the array; it is not a reference to the element of the array.

In [140]: y=x[0]
In [141]: type(y)
Out[141]: numpy.float64

This y is a lot like a python float; you can += the same way:

In [142]: y += 1
In [143]: y
Out[143]: 1.0

but this does not change x:

In [144]: x
Out[144]: array([0.])

But this does change x:

In [145]: x[0] += 1
In [146]: x
Out[146]: array([1.])

y=x[0] does an x.__getitem__ call. x[0]=3 does an x.__setitem__ call. x[0] += 1 combines the two: it fetches the element with __getitem__, adds 1 to it, and stores the result back with __setitem__.

Another example:

Changing x:

In [149]: x[0] = 3
In [150]: x
Out[150]: array([3.])

but attempting to do the same to y fails:

In [151]: y[()] = 3
Traceback (most recent call last):
  File "<ipython-input-151-153d89268cbc>", line 1, in <module>
    y[()] = 3
TypeError: 'numpy.float64' object does not support item assignment

Reading with y[()] is allowed, though.

Basic indexing of an array with a slice does produce a view that can be modified:

In [154]: x = np.zeros(5)
In [155]: x
Out[155]: array([0., 0., 0., 0., 0.])
In [156]: y= x[0:2]
In [157]: type(y)
Out[157]: numpy.ndarray
In [158]: y += 1
In [159]: y
Out[159]: array([1., 1.])
In [160]: x
Out[160]: array([1., 1., 0., 0., 0.])

===

Python list and dict examples of the x[0]+=1 kind of action:

In [405]: alist = [1,2,3]
In [406]: alist[1]+=12
In [407]: alist
Out[407]: [1, 14, 3]
In [408]: adict = {'a':32}
In [409]: adict['a'] += 12
In [410]: adict
Out[410]: {'a': 44}

x[0] += 1 can be thought of as a __getitem__ followed by a __setitem__ with the same index, with the addition applied to the fetched value in between.
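
One way to watch that get-then-set sequence is a small container that records the calls made on it. This is only an illustrative sketch; the class `Traced` and its `calls` list are hypothetical names, not part of Python or numpy.

```python
class Traced:
    """List-like wrapper that records every item access (hypothetical helper)."""

    def __init__(self, data):
        self.data = list(data)
        self.calls = []

    def __getitem__(self, index):
        self.calls.append(("get", index))
        return self.data[index]

    def __setitem__(self, index, value):
        self.calls.append(("set", index, value))
        self.data[index] = value


t = Traced([1, 2, 3])
t[1] += 12  # one statement, two method calls on t

print(t.calls)
print(t.data)
```

The log shows a read of index 1 followed by a write of the summed value to the same index, which is the same pattern as the list and dict examples above.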

Why does `x[0] += 1` change the array, even though it accesses the array `x` only by indexing, and you write that indexing "'takes' the value out of the array; it is not a reference to the element of the array"? Probably because there are two different kinds of indexing `__getitem__` and `__setitem__` (the latter is when the indexing happens on the left of an equals sign, so to speak), as per [this answer](https://stackoverflow.com/a/70444042/5231110). — root, Dec 25 '21 at 23:35
This `+=` action isn't unique to arrays. I added examples with list and dict. — hpaulj, Dec 25 '21 at 23:41

Mad Physicist · Accepted Answer · 2021-12-26T02:22:40.660

The issue is not scope, since the only thing that depends on scope is the available names. All objects can be accessed in any scope that has a name for them. The issue is one of mutability vs immutability and understanding what operators do.

x is a mutable numpy array. f runs x += 1 directly on it. += is the operator that invokes in-place addition. In other words, it does x = x.__iadd__(1)^*. Notice the reassignment to x, which happens in the function. That is a feature of the in-place operators that allows them to operate on immutable objects. In this case, ndarray.__iadd__ is a true in-place operator which just returns x, and everything works as expected.
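
A rough way to see the difference between a true in-place operator and the rebinding fallback is to compare object identity before and after `+=`. The variable names below are illustrative only.

```python
import numpy as np

# Mutable ndarray: += mutates the object and the name keeps pointing at it.
a = np.zeros(1)
a_before = a
a += 1
array_is_same_object = a is a_before

# Immutable int: += builds a new object and rebinds the name to it.
n = 5
n_before = n
n += 1
int_is_same_object = n is n_before

print(array_is_same_object, a_before)  # the old reference sees the change
print(int_is_same_object, n_before)    # the old reference still holds 5
```

In both cases the name is reassigned; with the array, it just happens to be reassigned to the same object that was modified.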

Now let's analyze f(x[0]) the same way. x[0] calls x.__getitem__(0)^*. When you pass in a scalar int index, numpy extracts that single element and returns it as a scalar object. Here the result is a np.float64, which behaves much like a python float (other dtypes yield other scalar types). Either way, the object is immutable. Once it's been extracted by __getitem__, the += operator in f replaces the name x in f with the new object, but the change is not seen outside the function, much less in the array. In this scenario, f has no reference to the array x, so no change is to be expected.

The example of x[0] += 1 is not the same as calling f(x[0]). It is equivalent to calling x.__setitem__(0, x.__getitem__(0).__iadd__(1))^*. The call to f was only the part with x.__getitem__(0).__iadd__(1), which returns a new object, but never reassigns as __setitem__ does. The key is that [] = (__setitem__) in python is an entirely different operator from [] (__getitem__) and = (assignment) separately.
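
The expansion can be spelled out step by step. This is an approximation in the same spirit as the footnote: `operator.iadd` stands in for the `+=` on the temporary, and `tmp` and `before_store` are names invented for this sketch.

```python
import operator
import numpy as np

x = np.zeros(1)

tmp = x.__getitem__(0)         # step 1: fetch a new scalar object
tmp = operator.iadd(tmp, 1)    # step 2: same as `tmp += 1`, yields a new scalar
before_store = x.copy()        # the array has not changed yet

x.__setitem__(0, tmp)          # step 3: write the result into the array

print(before_store)  # state after steps 1 and 2 only, which is all f(x[0]) does
print(x)             # state after the __setitem__ call
```

Calling f(x[0]) stops after step 2, which is why the array is left untouched.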

To make the second example (f(x[0])) work, you would have to pass in a mutable object. An integer index extracts a single scalar object, and an array index makes a copy. However, a slice index returns a view that is mutable and tied to the original array memory. Therefore, you can do

f(x[0:1])  # changes `x`

In this case f does the following: x.__getitem__(slice(0, 1, None)).__iadd__(1). The key is that __getitem__ returns a mutable view into the original array, not an immutable int.

To see why it is important not only that the object is mutable but that it is a view into the original array, try f(x[[0]]). Indexing with a list produces an array, but a copy. x[[0]].__iadd__ will modify that copy in place, but the copy is not written back into the original, so the change will not propagate.
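
A quick check of the view-versus-copy distinction, sketched with `np.shares_memory`. The names `view` and `copy` are illustrative; `f` is the function from the question.

```python
import numpy as np

def f(x):
    x += 1

x = np.zeros(3)

view = x[0:1]    # slice index: basic indexing, returns a view
copy = x[[0]]    # list index: advanced indexing, returns a copy

view_shares = np.shares_memory(x, view)
copy_shares = np.shares_memory(x, copy)

f(copy)                 # modifies the copy only
after_copy = x.copy()   # snapshot of x after modifying the copy

f(view)                 # modifies x through the view

print(view_shares, copy_shares)
print(after_copy, x)
```

Both `view` and `copy` are mutable ndarrays, so mutability alone is not enough. Only the object that shares memory with x can change it.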

^* This is an approximation. When invoked by an operator, dunder methods are actually called as type(x).__operator__(x, ...), not x.__operator__(...).

You write "The key is that `[] =` (`__setitem__`) in python is an entirely different operator from `[]` (`__getitem__`) and `=` (assignment) separately." Do you recall a nice explanation of this, for example in the Python documentation? — root, Dec 25 '21 at 23:27
This sentence is important, please make it bold in the answer: "The key is that `[] =` (`__setitem__`) in python is an entirely different operator from `[]` (`__getitem__`) and `=` (assignment) separately." (I can't suggest an edit because the suggested-edit queue is full.) Thanks! :) — root, Dec 25 '21 at 23:28
@root. That's just the definition of the method and my personal interpretation. This insight was very helpful to me when I was first learning, so I'm happy to bold it for you. — Mad Physicist, Dec 26 '21 at 02:17

score 0 · Answer 3 · answered Dec 26 '21 at 00:13

As per this comment and this answer:

The x[0] inside of f(x[0]) performs __getitem__ on x. In this particular case (as opposed to indexing a slice of the array, for example), the value returned by this operation doesn't allow modifying the original array.
x[0] = 1 performs __setitem__ on x.

__getitem__ and __setitem__ can be defined/overloaded to do anything. They don't even have to be consistent with each other.
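
As a deliberately contrived sketch of that last point, here is a class whose two methods have nothing to do with each other. The class name `Inconsistent` and its `log` attribute are invented for this example.

```python
class Inconsistent:
    """Reads always return 42; writes are only logged (hypothetical example)."""

    def __init__(self):
        self.log = []

    def __getitem__(self, key):
        return 42  # ignores the key and anything previously "stored"

    def __setitem__(self, key, value):
        self.log.append((key, value))  # records the write, stores nothing


obj = Inconsistent()
obj[0] = 1        # calls __setitem__(0, 1)
value = obj[0]    # calls __getitem__(0), which ignores the earlier write
obj[0] += 1       # __getitem__(0) gives 42, then __setitem__(0, 43)

print(value)
print(obj.log)
```

Nothing in the language ties the value written with `obj[0] = ...` to the value read with `obj[0]`; any consistency is up to the class author.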
