
I know that tuple assignment in Python such as the following

a, b = b, a

works by first packing b and a into a tuple (b, a) and then unpacking it to a and b, so that the two names are swapped in one statement. But I found that if a and b are replaced by slices of a NumPy array, the result is not a swap:

# intended to swap the two halves of a Numpy array
>>> import numpy as np
>>> a = np.random.randn(4)
>>> a
array([-0.58566624,  1.42857044,  0.53284964, -0.67801528])
>>> a[:2], a[2:] = a[2:], a[:2]
>>> a
array([ 0.53284964, -0.67801528,  0.53284964, -0.67801528])

My guess is that the packed tuple (a[2:], a[:2]) is actually a tuple of "pointers" to the memory location of a[2] and a[0]. When unpacking, first a[:2] is overwritten by values starting from the memory at a[2]. Then a[2:] is overwritten by values starting from a[0]. Is this the correct understanding of the mechanism?

ihdv
  • "My guess is that the packed tuple (a[2:], a[:2]) is actually a tuple of "pointers" to the memory location of a[2] and a[0]" No. First of all, *Python doesn't have pointers*. The expression `a[2:], a[:2]` creates a tuple of the results of *slicing the `np.ndarray` object*. This creates *new array objects*, but they are *views* over the underlying buffer of the original array. Basically, you are sharing the same underlying mutable, primitive buffer. – juanpa.arrivillaga Aug 30 '21 at 02:32
  • But basically, your understanding of what is going on is correct. First the tuple on the right-hand side is evaluated, and then the assignment targets on the left-hand side are assigned in order from left to right – juanpa.arrivillaga Aug 30 '21 at 02:32
  • The internal mechanism of the assignment `a, b = b, a` in CPython is explained [here](https://stackoverflow.com/q/21047524/15187728). For slices it works essentially the same way. As @juanpa.arrivillaga wrote, for numpy arrays the difference is that slices are views of the original array. – bb1 Aug 30 '21 at 02:40
  • @juanpa.arrivillaga I suppose the key is that in expressions of the type `a[:] = ...` the lvalue is specified to be an `ndarray` object, and the assignment is the one defined specifically for the `ndarray` class, i.e., copy into its underlying buffer if possible, instead of the normal name-object binding in Python. Right? – ihdv Aug 30 '21 at 02:46
  • @ihdv this isn't simple assignment. Name-object binding applies to *simple* assignment, i.e. `a = b`. If you do `a[ix] = b`, then the exact behavior is deferred to the type, i.e., it is equivalent to `type(a).__setitem__(a, ix, b)`. Numpy arrays are basically object-oriented wrappers over primitive, C-like arrays, implementing a "true" multidimensional array interface. In this context, the key thing to understand is that different `numpy` array objects can *share* underlying buffers. – juanpa.arrivillaga Aug 30 '21 at 03:01
  • Now, *simple slicing* always returns a view in numpy, but numpy extends slice notation to accept and handle various other cases that mostly *don't* return views but rather independent, fresh array objects that don't share the buffer – juanpa.arrivillaga Aug 30 '21 at 03:02
  • @juanpa.arrivillaga Thanks! Your explanation is very clear. – ihdv Aug 30 '21 at 03:11
  • @juanpa.arrivillaga I think it's worth writing up as an answer, if there isn't a clear duplicate. It doesn't work this way for `list`s, after all (since the right-hand side actually creates new objects rather than views in that case). – Karl Knechtel Aug 30 '21 at 03:16
  • @KarlKnechtel moved some stuff over to an answer. – juanpa.arrivillaga Aug 30 '21 at 03:26

1 Answer


So, this isn't simple assignment. Name-object binding semantics apply to simple assignment, i.e. a = b.

If you do:

a[ix] = b

Then the exact behavior is deferred to the type, i.e., it is equivalent to

type(a).__setitem__(a, ix, b)
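As a rough illustration of that equivalence, here is a minimal sketch using a hypothetical `LoggingList` subclass (not part of the question or of any library) that records each `__setitem__` call. It suggests that both an index target and a slice target go through the type's `__setitem__`, rather than rebinding a name:

```python
class LoggingList(list):
    """Hypothetical list subclass that records each __setitem__ call."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __setitem__(self, index, value):
        # Record the call, then defer to the normal list behavior.
        self.calls.append((index, value))
        super().__setitem__(index, value)


a = LoggingList([0, 1, 2, 3])
a[1] = 10                       # subscript assignment
type(a).__setitem__(a, 2, 20)   # the explicit equivalent
a[:2] = [7, 8]                  # a slice target passes a slice object

print(list(a))   # [7, 8, 20, 3]
print(a.calls)   # [(1, 10), (2, 20), (slice(None, 2, None), [7, 8])]
```

The name `a` is never rebound here; each statement mutates the existing object through its type's method.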

Numpy arrays are basically object-oriented wrappers over primitive, C-like arrays, implementing a "true" multidimensional array interface. In this context, the key thing to understand is that different numpy array objects can share underlying buffers. Simple slicing always creates a numpy.ndarray object that is a view over the original array.

So in this case, the b above is itself the result of a call to numpy.ndarray.__getitem__, which returns a view.
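One way to check the view relationship is sketched below. The variable names `left` and `right` are just for illustration; `ndarray.base` and `np.shares_memory` are existing NumPy features:

```python
import numpy as np

a = np.arange(4.0)           # values 0.0, 1.0, 2.0, 3.0
left, right = a[:2], a[2:]   # basic slicing: two new ndarray objects

print(left is a, right is a)             # False False: distinct objects
print(left.base is a, right.base is a)   # True True: both are views of a
print(np.shares_memory(left, a))         # True: same underlying buffer

right[0] = 99.0   # writing through the view...
print(a[2])       # 99.0: ...changes the original array
```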

Now consider the simpler case of Python lists. The right-hand side:

(a[2:], a[:2]) 

creates a tuple of two independent list objects (each a shallow copy of part of the original list).

When they are assigned to the sequence of assignment targets on the left-hand side, mutating the original list has no effect on them. There are three independent buffers for the three list objects (slicing a list never creates a view).
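A small sketch of the list case, with arbitrary values chosen for illustration, shows the swap working because the right-hand side holds copies:

```python
a = [1, 2, 3, 4]

rhs = (a[2:], a[:2])   # two new lists: [3, 4] and [1, 2]
a[0] = 100             # mutating a does not touch the copies
print(rhs)             # ([3, 4], [1, 2])

a = [1, 2, 3, 4]
a[:2], a[2:] = a[2:], a[:2]
print(a)               # [3, 4, 1, 2]
```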

On the other hand, with a NumPy array the expression a[2:], a[:2] creates a tuple with the results of slicing the original numpy.ndarray object, controlled by numpy.ndarray.__getitem__. This creates two new array objects, but they are views over the underlying buffer of the original array. Basically, you are sharing the same underlying mutable, primitive buffer between three different array objects. So when the first target, a[:2], is assigned, it overwrites the very data that the second view, a[:2] on the right-hand side, will later read.
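The sequence of events can be sketched by spelling out the tuple assignment by hand. The explicit `rhs` tuple and the `.copy()` workaround are illustrative additions rather than something from the question; copying the right-hand side is one possible way to get an actual swap:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

rhs = (a[2:], a[:2])   # two views into a's buffer, no data copied
a[:2] = rhs[0]         # buffer is now 3, 4, 3, 4; rhs[1] sees 3, 4
a[2:] = rhs[1]         # writes 3, 4 over the second half
print(a.tolist())      # [3.0, 4.0, 3.0, 4.0]

# Assumed workaround: copy the slices so the right-hand side is independent.
b = np.array([1.0, 2.0, 3.0, 4.0])
b[:2], b[2:] = b[2:].copy(), b[:2].copy()
print(b.tolist())      # [3.0, 4.0, 1.0, 2.0]
```

Strictly, only the slice that is read after its source has been overwritten (here `b[:2]`) needs the copy; copying both just keeps the statement symmetric.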

juanpa.arrivillaga
  • Basic slicing with `__getitem__` (right hand side) returns a view, but do you know if `__setitem__` (left hand side) actually creates a view? It certainly doesn't return one, and it seems possible to me that they're just iterating through indices of the existing array. – BatWannaBe Aug 30 '21 at 03:35
  • @BatWannaBe I added some clarification. Yes, `__getitem__` is also involved here, and it is what creates the actual view. That is how the slicing is handled on the right-hand side – juanpa.arrivillaga Aug 30 '21 at 03:38
  • `__setitem__` does not create or return a view. It sets values. – hpaulj Aug 30 '21 at 04:34