I am reading numpy's documentation and in the section "basic indexing" there is an example that I am struggling to understand.

Documentation link: https://numpy.org/doc/stable/user/basics.indexing.html#basics-indexing

The code:

import numpy as np

x = np.arange(0, 50, 10)  # array([ 0, 10, 20, 30, 40])
x[np.array([1, 1, 3, 1])] += 1
x

Returns:

array([ 0, 11, 20, 31, 40])

I am scratching my head because I imagined that the left-hand side created a new array, not a view. And indeed it does:

np.shares_memory(x, x[np.array([1, 1, 3, 1])])

Returns:

False

Thus, I imagined that iadd (+=) was being called on a new array, but that is not the case because somehow this operation is modifying the original array.

If I understood the documentation correctly, x[np.array([1, 1, 3, 1])] triggers "advanced indexing", which "always returns a copy of the data".

The "question": can someone help me understand what is going on here? A step-by-step description would be appreciated.

edit: more specifically, what I imagined would happen:

  1. create a temporary new array with [10, 10, 30, 10]
  2. try to assign [10, 10, 30, 10] + 1 to this new array
  3. broadcast 1 to allow the summation
  4. assign the result of the sum, [11, 11, 31, 11], to the temporary array

Since there is no reference to this temporary array, I imagined this would effectively be a no-op.

Trauer
  • How would you expect 'x[0]+=1' to behave? Copy scalar from original array? – tstanisl Sep 09 '21 at 21:57
  • My expectation is that "scalar indexing" returns a reference/view and "sequence of integers indexing" returns a copy. Thus, I expect x[0] += 1 to increment the first element. – Trauer Sep 09 '21 at 21:59
  • the np.array in line 3 only select index 1 and 3 so only these elements are incremented. – Oliver Prislan Sep 09 '21 at 22:01
  • @Trauer " I imagined that the left-hand side created a new array, not a view." **it doesn't create anything**. it is part of an item-assignment statement. – juanpa.arrivillaga Sep 10 '21 at 04:19
  • This should probably be closed as a duplicate for: https://stackoverflow.com/questions/10623302/how-does-assignment-work-with-list-slices – juanpa.arrivillaga Sep 10 '21 at 04:21
  • All your 4 steps occur on the RHS. The confusion is over what happens when `[11,11,31,11]` is assigned to `x[np.array([1, 1, 3, 1])]`. `x` itself is modified, with values from the temporary array. – hpaulj Sep 10 '21 at 16:05

2 Answers

It's not made entirely clear but left-of-assignment syntax x[...] = ... does something different (x.__setitem__) from right-of-assignment/normal expression syntax x[...] (x.__getitem__). Generally, __setitem__ modifies specified elements in the original object, while __getitem__ returns a new object accessing the specified elements.
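To make this concrete for the question's case, here is a minimal sketch of how Python evaluates x[idx] += 1 as three separate calls. The names idx, tmp and x_untouched are introduced only for this illustration, and calling the special methods directly is just a way to make each step visible.

```python
import numpy as np

x = np.arange(0, 50, 10)
idx = np.array([1, 1, 3, 1])

# Step 1: __getitem__ with an index array returns a copy.
tmp = x.__getitem__(idx)        # array([10, 10, 30, 10])

# Step 2: the in-place add runs on that temporary copy, not on x.
tmp = tmp.__iadd__(1)           # array([11, 11, 31, 11])
x_untouched = x.copy()          # snapshot: x has not changed yet

# Step 3: __setitem__ writes the temporary's values back into x itself.
x.__setitem__(idx, tmp)

print(x_untouched)  # [ 0 10 20 30 40]
print(x)            # [ 0 11 20 31 40]
```

The temporary array is indeed a copy, but the final __setitem__ call is what modifies x.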

It's not just NumPy arrays. Think of a basic Python list. x[0] extracts an element, but x[0] = 5 changes the list.
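As a rough illustration with plain Python, the sketch below uses a hypothetical list subclass, Traced (not part of any library), that records which special method each statement triggers.

```python
class Traced(list):
    """Hypothetical list subclass that logs item access, for illustration only."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append("__getitem__")
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append("__setitem__")
        super().__setitem__(key, value)


t = Traced([10, 20, 30])

t[0] = 5     # plain assignment: only __setitem__ runs
t[1] += 1    # augmented assignment: __getitem__, then __setitem__

calls = list(t.calls)
values = list(t)

print(calls)   # ['__setitem__', '__getitem__', '__setitem__']
print(values)  # [5, 21, 30]
```

The augmented assignment reads the element, computes the new value, and then stores it back through __setitem__.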

A smaller nitpick: you're not talking about basic slicing; x[np.array([1, 1, 3, 1])] is advanced indexing. Basic slicing makes a view that shares data with the original, while advanced indexing copies the data. In both cases a new NumPy array object is created.
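The view-versus-copy difference can be checked directly. This is a small sketch; the variable names view and picked are chosen here for illustration.

```python
import numpy as np

x = np.arange(0, 50, 10)

view = x[1:3]                   # basic slicing: a view onto x's data
picked = x[np.array([1, 2])]    # advanced indexing: an independent copy

shares_view = np.shares_memory(x, view)
shares_picked = np.shares_memory(x, picked)

view += 1      # modifies x, because view shares x's memory
picked += 1    # modifies only the copy

print(shares_view, shares_picked)  # True False
print(x)                           # [ 0 11 21 30 40]
```

Both view and picked are new array objects; only view is backed by the same memory as x.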

BatWannaBe
  • Indeed. That was a typo! The title of the question was correct but the first paragraph incorrect. I will fix it. – Trauer Sep 09 '21 at 22:01
  • Yet list[slice] =5, which is the closest thing to a numpy integer sequence indexing, does not. – Trauer Sep 09 '21 at 22:10
  • But here is the thing... "advanced indexing" should "always return a copy": https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing – Trauer Sep 09 '21 at 22:22
  • You're not returning something when `x[...]` is an l-value though are you, you're assigning to it? It would seem that the reference you posted only applies to r-values? I would be interested in the mechanism myself. – David Waterworth Sep 09 '21 at 23:32
  • @Trauer 1) you can't do `list[slice] = 5` because lists don't implement broadcasting. To be clear, `x[...] = ...` is implemented by the `__setitem__` method while `x[...]` is implemented by `__getitem__`. It's up to those methods to implement features. 2) The NumPy docs are exactly what I was talking about when I said "it's not made entirely clear". They talk about the `__getitem__` part, not so much the `__setitem__` part. – BatWannaBe Sep 09 '21 at 23:51
  • @Trauer try `list[slice] = some_iterable_with_len_slice_elements`, it works just like with numpy.ndarrays – juanpa.arrivillaga Sep 10 '21 at 04:20
  • The question was about '+=' operator, not '=' – tstanisl Sep 10 '21 at 06:08

The talk about view versus copy applies to __getitem__ indexing, not to __setitem__ (or __iadd__). You are using += with a duplicate index, which complicates the whole process.

In [44]: x = np.arange(0, 50, 10)
In [45]: x
Out[45]: array([ 0, 10, 20, 30, 40])

This does make a copy, a new array with values taken from the x:

In [46]: x[np.array([1, 1, 3, 1])]
Out[46]: array([10, 10, 30, 10])

But this assigns values to elements of x, specifically to 2 elements, indexed with 1 and 3:

In [47]: x[np.array([1, 1, 3, 1])] += 1
In [48]: x
Out[48]: array([ 0, 11, 20, 31, 40])   # only 10 and 30 have incremented

If instead we do the += iteratively, the 10 is incremented 3 times:

In [49]: x = np.arange(0, 50, 10)
In [50]: for i in [1, 1, 3, 1]: x[i] += 1
In [51]: x
Out[51]: array([ 0, 13, 20, 31, 40])

Or we can use np.add.at:

In [57]: x = np.arange(0, 50, 10)
In [58]: np.add.at(x, np.array([1,1,3,1]),1)
In [59]: x
Out[59]: array([ 0, 13, 20, 31, 40])

The np.add.at documentation talks about += using buffering, which explains why 10 only becomes 11.
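For a side-by-side comparison, here is a short sketch of the buffered and unbuffered forms. The third variant, which counts occurrences with np.bincount, is an alternative I am adding for illustration and is not something the answer above relies on; it assumes non-negative integer indices.

```python
import numpy as np

idx = np.array([1, 1, 3, 1])

# Buffered: each selected element ends up incremented once.
a = np.arange(0, 50, 10)
a[idx] += 1

# Unbuffered: the increment is accumulated for every occurrence of an index.
b = np.arange(0, 50, 10)
np.add.at(b, idx, 1)

# Illustrative alternative: count how often each index occurs, then add the counts.
c = np.arange(0, 50, 10)
c += np.bincount(idx, minlength=c.size)

print(a)  # [ 0 11 20 31 40]
print(b)  # [ 0 13 20 31 40]
print(c)  # [ 0 13 20 31 40]
```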

Another example, this time a simple assignment:

In [60]: x = np.arange(0, 50, 10)
In [61]: x[np.array([1, 1, 3, 1])] = [100,101,102,103]
In [62]: x
Out[62]: array([  0, 103,  20, 102,  40])    # only 103 remains

In fact, the assignment explains the += case:

In [63]: x = np.arange(0, 50, 10)
In [64]: x[np.array([1, 1, 3, 1])]+1
Out[64]: array([11, 11, 31, 11])
In [65]: x[np.array([1, 1, 3, 1])] = _  
In [66]: x
Out[66]: array([ 0, 11, 20, 31, 40])

31 is assigned to x[3], and 11 is assigned to x[1] (three times, each time with the same value).

hpaulj