
In Python I can exchange two variables by means of multiple assignment; it also works with lists:

l1, l2 = [1, 2, 3], [4, 5, 6]
l1, l2 = l2, l1
print(l1, l2)
>>> [4, 5, 6] [1, 2, 3]

But when I want to exchange two rows of a NumPy array (for example in Gaussian elimination), it fails:

import numpy as np
a3 = np.array([[1, 2, 3], [4, 5, 6]])
print(a3)
a3[0, :], a3[1, :] = a3[1, :], a3[0, :]
print(a3)
>>> [[1 2 3]
     [4 5 6]]
    [[4 5 6]
     [4 5 6]]

I thought that, for some strange reason, the two rows were now pointing to the same values; but that is not the case, since setting a3[0,0]=5 after the preceding lines changes a3[0,0] but not a3[1,0].
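
Here is that check in full, to show that the two rows do not alias each other after the failed swap:

import numpy as np
a3 = np.array([[1, 2, 3], [4, 5, 6]])
a3[0, :], a3[1, :] = a3[1, :], a3[0, :]   # the failed swap
a3[0, 0] = 5                              # modify the first row only
print(a3)
>>> [[5 5 6]
     [4 5 6]]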

I have found a way to work around this problem: for example, a3[0,:], a3[1,:] = a3[1,:].copy(), a3[0,:].copy() works. But can anyone explain why the exchange by multiple assignment fails with NumPy rows? My question concerns the underlying workings of Python and NumPy.
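
For reference, the workaround in full, with its output:

import numpy as np
a3 = np.array([[1, 2, 3], [4, 5, 6]])
a3[0, :], a3[1, :] = a3[1, :].copy(), a3[0, :].copy()   # copies made before any assignment
print(a3)
>>> [[4 5 6]
     [1 2 3]]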

JPG

1 Answer


This works the way you intend it to:

a3[[0, 1]] = a3[[1, 0]]

The two separate assignments in the tuple assignment are not buffered with respect to each other; one happens after the other, leading to the overwriting you observe.
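
To make the ordering concrete, here is a sketch replaying the tuple assignment one step at a time (the tmp names are mine, introduced only to expose the intermediate state); it relies on the fact that a basic slice like a3[1, :] is a view into the array, while advanced indexing like a3[[1, 0]] returns a copy:

import numpy as np

a3 = np.array([[1, 2, 3], [4, 5, 6]])

# The right-hand side builds a tuple of *views*, not copies
tmp0, tmp1 = a3[1, :], a3[0, :]

a3[0, :] = tmp0   # row 0 becomes [4 5 6]
# tmp1 is a view of row 0, so it now reads [4 5 6] as well
a3[1, :] = tmp1   # row 1 is overwritten with its own data

print(a3)
>>> [[4 5 6]
     [4 5 6]]

# Advanced indexing returns a copy, so the single assignment
# still sees the original rows:
a3 = np.array([[1, 2, 3], [4, 5, 6]])
a3[[0, 1]] = a3[[1, 0]]
print(a3)
>>> [[4 5 6]
     [1 2 3]]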

Eelco Hoogendoorn
  • +1 Nice that what is a nuisance when you try to do `a[[0, 1, 1]] += 1` and the item at position `1` only gets incremented once can be used to your advantage to swap rows. – Jaime Jan 22 '14 at 16:44
  • Yeah; it took me a while to come to appreciate the logic behind that. So happy with the new np.add.at; in fact I'm writing a piece of code with it right now that used to be a horribly slow Python loop. – Eelco Hoogendoorn Jan 22 '14 at 17:43
  • For sums, `np.bincount` is typically much faster than `np.add.at`, 50x in the example I just tried: `a = np.zeros((1000,), dtype=np.intp); b = np.random.randint(1000, size=100000); c = np.random.randint(1000000, size=100000); In [10]: %timeit np.add.at(a, b, c); 100 loops, best of 3: 19.3 ms per loop; In [11]: %timeit a + np.bincount(b, weights=c, minlength=1000); 1000 loops, best of 3: 451 µs per loop`. – Jaime Jan 22 '14 at 19:03 (see the sketch after these comments)
  • Of course, readability suffers badly, and it is hard to think of situations where this operation would be a bottleneck. But if you want raw performance, then that seems the way to go. – Jaime Jan 22 '14 at 19:06
  • That highly surprises me. You'd think the implementation of np.add.at would be the most basic loop imaginable. That said, it also generalizes to scattering to nd-arrays, so maybe that adds some overhead? But I can't at all imagine where the factor of 40 is coming from. – Eelco Hoogendoorn Jan 22 '14 at 19:35
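
For reference, a minimal sketch of the equivalence the comments above rely on (array sizes are arbitrary, chosen to mirror Jaime's timing):

import numpy as np

n = 1000
idx = np.random.randint(n, size=100000)         # scatter indices, repeats allowed
vals = np.random.randint(1000000, size=100000)  # values to accumulate

# Unbuffered scatter-add: repeated indices all contribute
out_at = np.zeros(n, dtype=vals.dtype)
np.add.at(out_at, idx, vals)

# The same per-index sums via bincount (which returns float64 when weights are given)
out_bc = np.bincount(idx, weights=vals, minlength=n).astype(vals.dtype)

print(np.array_equal(out_at, out_bc))
>>> True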