1

I cannot figure out a bug in a very simple transition from a for-loop to a vectorized numpy operation. The code is the following

 for null_pos in null_positions:
      np.add(singletree[null_pos, parent.x, :, :],
             posteriors[parent.u, null_pos, :, :],
             out=singletree[null_pos, parent.x, :, :])

Since it is a simple addition between 2D matrices, I generalise into a 3D addition

 np.add(singletree[null_positions, parent.x, :, :],
             posteriors[parent.u, null_positions, :, :],
             out=singletree[null_positions, parent.x, :, :])

The thing is, it appears the result is different! Can you see why?

Thanks!

Update: It seems that

singletree[null_positions, parent.x, :, :] = \
             posteriors[parent.u, null_positions, :, :] +
             singletree[null_positions, parent.x, :, :]

solves the problem. In what does this differ with respect to the add operation? (apart from allocating a new matrix, I'm interested in the semantic aspects)

diningphil
  • 416
  • 5
  • 18
  • What is the shape of the input arrays? – Daniel F Sep 04 '17 at 15:23
  • You can assume: singletree is a L x M x C x (C+1) matrix, posteriors are N x L x C x (C+1). Look at my update (which I'll post in a minute) because it does not depend on dimensions, I believe – diningphil Sep 04 '17 at 15:25

1 Answers1

1

The problem is that passing out=singletree[null_positions, parent.x, :, :] is making a copy of the portion of singletree, since you are using advanced indexing (as opposed to basic indexing, which returns views). Hence, the result will be written to an entirely different array and the original one will remain unmodified.

However, you can use advanced indexing to assign values. In you case, the most recommendable syntax would be:

singletree[null_positions, parent.x, :, :] += \
    posteriors[parent.u, null_positions, :, :]

Which would minimize the use of intermediate arrays.

jdehesa
  • 58,456
  • 7
  • 77
  • 121
  • @diningphil Doesn't seem silly at all to me, your code was perfectly reasonable, but it just "happens to work" or not depending on what exactly you use to index the arrays... I mean there's a reason to it but this kind of pitfalls are more frequent than one would like with NumPy. – jdehesa Sep 04 '17 at 16:57