I ran into some unexpected behavior of np.save(). Suppose you want to save two numpy arrays into one .npy file (as an object). As long as the arrays differ in their leading dimension this works without any problem, but when the leading dimension is the same and the remaining dimensions differ, an error occurs. The problem is caused by np.asanyarray(), which np.save() calls prior to saving. Clearly one could work around this, e.g. by saving into different files, but I am not looking for another solution; I want to understand this behavior of np.save().

Here is the code:

import numpy as np
a = np.zeros((10, 5))
b = np.zeros((10, 2))
np.save('test', [a, b])

This causes the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/python3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 509, in save
    arr = np.asanyarray(arr)
  File "/python3/lib/python3.6/site-packages/numpy/core/numeric.py", line 544, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
ValueError: could not broadcast input array from shape (10,5) into shape (10)

When the leading dimension is different there is no problem:

a = np.zeros((9, 5))
b = np.zeros((10, 2))
np.save('test', [a, b])

To me, this behavior of np.save seems inconsistent and looks like a bug.

Stark
  • Looks like `save` method tries to combine the ndarrays if they have the same first dimension. Interesting. – unholy_me Jun 26 '18 at 11:44

2 Answers

After looking at the source of the asanyarray method (the save method calls it internally), I see that it tries to build an ndarray from the list that is passed, using the array method. If the arrays have different leading dimensions, it is able to make an ndarray with 2 separate elements in it. However, if they have the same leading dimension, it tries to broadcast them together into a single ndarray. This is because by default it tries to produce an output with as many dimensions as possible. To get around this, you can first use the empty method to create an object array of the right length, then assign the values into it, like:

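a = np.zeros((10, 5))
b = np.zeros((10, 2))
c = [a, b]
# Pre-allocate a 1-D object array so numpy does not try to stack a and b
finalc = np.empty(len(c), dtype=object)
finalc[:] = c
# Save the object array, not the original list
np.save("file", finalc)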
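
The stacking behaviour described above can be observed without np.save at all. The following is a minimal sketch, not taken from the numpy source; the helper name describe_conversion is made up for illustration. It only covers the two cases whose outcome should not depend on the numpy version: identical shapes, and an identical leading dimension with different trailing dimensions. The case with different leading dimensions is left out on purpose, since newer numpy releases are stricter about ragged input than the version in the question.

```python
import numpy as np


def describe_conversion(arrays):
    """Report what np.asanyarray makes of a list of arrays (hypothetical helper)."""
    try:
        result = np.asanyarray(arrays)
    except ValueError as exc:
        return "ValueError: {}".format(exc)
    return "dtype={}, shape={}".format(result.dtype, result.shape)


same_shape = [np.zeros((10, 5)), np.zeros((10, 5))]
same_leading = [np.zeros((10, 5)), np.zeros((10, 2))]

# Identical shapes are stacked into one 3-D float array.
print(describe_conversion(same_shape))    # dtype=float64, shape=(2, 10, 5)

# Same leading dimension, different trailing dimension: the failing case
# from the question. The exact wording of the message varies by version.
print(describe_conversion(same_leading))
```

So the error comes from the list-to-array conversion itself; np.save only triggers it.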
unholy_me
  • Thanks for your suggestion. However, my problem is not that I lack a workaround; it is that what `np.save` did is inconsistent with what I expected it to do. It basically implies that I would need to check before every call to `np.save` whether the leading dimensions are the same. Moreover, `np.array` itself fails with an error when merging the two arrays. – Stark Jun 27 '18 at 11:52
  • Basically the same question was already asked here [link](https://stackoverflow.com/questions/35133317/numpy-save-some-arrays-at-once). The conclusion for me is that `np.save` **should** actually only be used for saving one array of fixed dimensions and not for structures of `dtype=object`, although it might work sometimes. – Stark Jun 27 '18 at 17:08
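
If one follows that conclusion and keeps every array as its own fixed-shape entry, one option (not taken from this thread) is np.savez, which writes several named arrays into a single .npz archive. A minimal sketch, where the file name and the keys a and b are arbitrary choices for illustration:

```python
import os
import tempfile

import numpy as np

a = np.zeros((10, 5))
b = np.zeros((10, 2))

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "arrays.npz")

# Each keyword becomes the name of one array inside the archive,
# so the arrays are never merged into a single ndarray.
np.savez(path, a=a, b=b)

# Read the arrays back by name
with np.load(path) as data:
    restored = {name: data[name] for name in data.files}

print(restored["a"].shape, restored["b"].shape)  # (10, 5) (10, 2)
```

Because no list-to-array conversion is involved, the shapes of the individual arrays do not need to be checked beforehand.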

Am I doing something wrong? I saved "finalc" and not "c". When I tried to save "c", I got the same error.

a=np.zeros((10,5))
b=np.zeros((10,2))
c=[a,b]
finalc = np.empty(len(c),dtype=object)
finalc[:]=c
np.save("file", finalc)
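
For completeness, a sketch of the full round trip, assuming a temporary file with a made-up name. It fills the object array one slot at a time instead of using slice assignment, and it passes allow_pickle=True when loading, which recent numpy versions require for object arrays:

```python
import os
import tempfile

import numpy as np

a = np.zeros((10, 5))
b = np.ones((10, 2))

# Fill a pre-allocated 1-D object array one slot at a time
container = np.empty(2, dtype=object)
container[0] = a
container[1] = b

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "pair.npy")
np.save(path, container)

# Object arrays are stored via pickle, so loading has to opt in explicitly
loaded = np.load(path, allow_pickle=True)
print(loaded.shape, loaded[0].shape, loaded[1].shape)  # (2,) (10, 5) (10, 2)
```

Only load pickled files from sources you trust, since unpickling can execute arbitrary code.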

  • This does not really answer the question. – Jurakin Feb 04 '23 at 10:45