
I need a good explanation (or a reference) of how NumPy slicing and assignment behave inside for loops. I have three cases.

def example1(array):
    for row in array:
        row = row + 1
    return array

def example2(array):
    for row in array:
        row += 1
    return array

def example3(array):
    for row in array:
        row[:] = row + 1
    return array

A simple case:

import numpy as np

ex1 = np.arange(9).reshape(3, 3)
ex2 = ex1.copy()
ex3 = ex1.copy()

returns:

>>> example1(ex1)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

>>> example2(ex2)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

>>> example3(ex3)
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

It can be seen that the first result differs from the second and third.

blaz
    Related question, hope this helps http://stackoverflow.com/questions/15376509/when-is-i-x-different-from-i-i-x-in-python/15376520#15376520 – Thiru Mar 17 '16 at 09:31

2 Answers


First example:

row + 1 creates a new array, and the assignment rebinds the name row to it. Only the name changes; the data stored in the original array is never written to, so the array is unaffected.
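A minimal sketch of how the rebinding can be checked, assuming NumPy is imported as np. The names a, before and view are illustrative and not part of the question's code:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
before = a.copy()

for row in a:
    view = row                 # second name for the view into `a`
    row = row + 1              # new array; the name `row` is rebound to it
    assert row is not view
    assert not np.shares_memory(row, a)   # the new array is detached from `a`
    assert np.shares_memory(view, a)      # the view still points into `a`

unchanged = np.array_equal(a, before)
print(unchanged)  # True
```

np.shares_memory is used here only as a convenient way to see which objects still refer to the data of a.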

Second example:

row += 1 is an in-place operation. Each row produced by iterating over a 2-D array is a view into that array, so modifying the row in place modifies the original array. This only works because row is itself an np.ndarray.

With a double loop it no longer works:

def example4(array):
    for row in array:
        for column in row:
            column += 1
    return array

example4(np.arange(9).reshape(3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

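This doesn't work because column is not an np.ndarray: iterating over a 1-D row yields NumPy scalars, which are immutable. column += 1 therefore never reaches np.ndarray's __iadd__ (which would modify the data the array points to); it computes a new scalar and rebinds the name column. Example 2 only works because its rows are NumPy arrays.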
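If element-wise modification in a nested loop is really needed, two possible approaches are sketched below. This is only an illustration: the names b, c and first are hypothetical, and a plain a += 1 is normally preferable to either loop.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Iterating over a 1-D row yields NumPy scalars, not arrays.
first = next(iter(a[0]))
is_array = isinstance(first, np.ndarray)
is_scalar = isinstance(first, np.generic)
print(is_array, is_scalar)  # False True

# Option 1: assign through indices, so the write goes into the array itself.
b = a.copy()
for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        b[i, j] += 1

# Option 2: np.nditer with writable operands; each x is a 0-d array view.
c = a.copy()
with np.nditer(c, op_flags=["readwrite"]) as it:
    for x in it:
        x[...] = x + 1

print(np.array_equal(b, a + 1), np.array_equal(c, a + 1))  # True True
```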

Third example:

row[:] = row + 1 computes a new array and then copies its values into the existing row, roughly like row[0] = row[0] + 1, row[1] = row[1] + 1, ... The write goes into the memory that row views, so this also affects the original array.
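The two steps of the slice assignment can be spelled out as in the sketch below. The names view and new_values are illustrative only:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

for row in a:
    view = row
    new_values = row + 1       # temporary array, separate from `a`
    row[:] = new_values        # copy the values into the existing view
    assert row is view         # no rebinding: same object as before
    assert np.shares_memory(row, a)

print(a.tolist())  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```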

Bottom Line

If you are operating on mutable objects such as lists or np.ndarray, you need to be careful about what you change. A name only refers to an object that holds the actual data, so rebinding the name (example1) doesn't affect the stored data. To change the stored data you have to write through the reference, either explicitly with [:] (example3) or implicitly with np.ndarray.__iadd__ (example2).
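The same three patterns can be tried on plain Python lists, as in the sketch below. Note that + on lists concatenates rather than adding element-wise, so [9] is appended here instead of adding 1; all names are illustrative.

```python
base = [[0, 1], [2, 3]]

rebound = [r.copy() for r in base]
for r in rebound:
    r = r + [9]        # new list; only the name `r` changes

in_place = [r.copy() for r in base]
for r in in_place:
    r += [9]           # list.__iadd__ extends the existing list

sliced = [r.copy() for r in base]
for r in sliced:
    r[:] = r + [9]     # slice assignment replaces the contents in place

print(rebound)   # [[0, 1], [2, 3]]
print(in_place)  # [[0, 1, 9], [2, 3, 9]]
print(sliced)    # [[0, 1, 9], [2, 3, 9]]
```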

MSeifert

In the first example, you don't do anything with the newly computed row: you rebind the name row, and it no longer has any connection to the array.

In the second and third examples, you don't rebind; you assign values into the existing object. With +=, a special method is called (__iadd__, where the type defines it) whose behavior depends on the type of the object it acts on. See the further reading below.

Writing row + 1 on the right-hand side computes a new array. In the first case, you tell Python to give that new array the name row (and drop the reference to the object previously called row). In the third, the new array's values are written into the slice of the old row.

For further reading, follow the link in @Thiru's comment on the question above, or read about assignment and rebinding in general.

Ilja