This question is about np.save and np.load best practices. Since numpy version 1.16.3, np.load defaults to allow_pickle=False.

After saving a list, loading it back works fine with the default allow_pickle=False:

>> x = [0, 1, 2]                            
>> np.save('my_x_list.npy', x)                
>> loaded_x = np.load('my_x_list.npy') 
>> loaded_x                        
Out: array([0, 1, 2])

The same holds for a numpy array:

>> y = np.arange(10)                            
>> np.save('my_y_numpy_array.npy', y)                
>> loaded_y = np.load('my_y_numpy_array.npy') 
>> loaded_y                          
Out: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

However, a dictionary yields this error:

>> mydict = {'a': 4, 'b': 5}
>> np.save('my_dict.npy', mydict)
>> loaded_z = np.load('my_dict.npy')
ValueError: Object arrays cannot be loaded when allow_pickle=False

As far as I understand, dictionaries, lists and numpy arrays are all Object arrays. Hence, one would expect numpy arrays or lists to raise this error as well. Why is this error raised for dictionaries but not for numpy arrays or lists?

DavidC.
  • The default `allow_pickle` was changed for safety reasons. It reduces the chances of loading something bad from the file. You have to explicitly allow a pickle load for a file from a trusted source. Without pickle, the `load` can only be a safe, numeric (or string) array. – hpaulj May 04 '20 at 00:41

1 Answer

As far as I understand, dictionaries, lists and numpy arrays are all Object arrays

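No. Whether you get an object array depends on the data type of the values, not on the container they come in. np.save first converts its argument to a numpy.array, and a dict always converts to an “Object Array”, that is to say a numpy.array with dtype=object (here a zero-dimensional array wrapping the dict). Such an array can only be stored and loaded through pickle, which is why np.load raises the error with allow_pickle=False. See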

>>> import numpy as np
>>> np.array({'a': 4, 'b': 5})
array({'a': 4, 'b': 5}, dtype=object)

In contrast, a numpy.array created from a list of numbers (integers, floats, complex numbers, etc.) has a numeric dtype, which does not need to be pickled.

>>> np.array([1, 2, 3]).dtype
dtype('int64')
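
To make the distinction concrete, here is a minimal sketch that saves a few objects and checks which ones can be loaded with allow_pickle=False. The helper name `loads_without_pickle`, the sample values, and the temporary file name are illustrative assumptions, not part of the question.

```python
import os
import tempfile

import numpy as np


def loads_without_pickle(obj):
    """Hypothetical helper: save obj, then try np.load with allow_pickle=False."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.npy")
        np.save(path, obj)  # np.save converts obj to an array before writing
        try:
            np.load(path, allow_pickle=False)
            return True
        except ValueError:
            # Raised when the stored array has dtype=object
            return False


samples = {
    "list of ints": [0, 1, 2],
    "numeric array": np.arange(10),
    "dict": {"a": 4, "b": 5},
    "list holding a dict": [1, {"a": 4}],
}

results = {}
for name, obj in samples.items():
    is_object = np.asarray(obj).dtype == object
    results[name] = loads_without_pickle(obj)
    print(f"{name}: object dtype={is_object}, loads without pickle={results[name]}")
```

Under these assumptions, only the inputs that convert to dtype=object (the dict and the list holding a dict) fail to load. A list is therefore not safe by itself: it loads without pickle only when its elements map to a numeric or string dtype.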

You can load a dictionary (or even other objects) from a file into a numpy.array, using the allow_pickle parameter, e.g.

np.load('dictionary.npy', allow_pickle=True)
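
Note that the loaded value is still a zero-dimensional object array rather than a dict. A small round-trip sketch, assuming a file you wrote yourself (the temporary path here is only for illustration), shows how `.item()` recovers the original dictionary:

```python
import os
import tempfile

import numpy as np

mydict = {"a": 4, "b": 5}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_dict.npy")
    np.save(path, mydict)
    # Only pass allow_pickle=True for files from a trusted source
    loaded = np.load(path, allow_pickle=True)

print(loaded.dtype)  # object
print(loaded.shape)  # ()

# .item() extracts the single Python object held by the 0-d array
restored = loaded.item()
print(restored == mydict)  # True
```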
michaeldel
  • Thank you for your answer and your comment. I understand the default `allow_pickle=False` was set for security reasons. In this example, the `npy` file is generated by my script every time it runs; is there a risk of loading bad data in this scenario? – DavidC. May 23 '20 at 21:43
  • Using `pickle` seems fine to me in a local experimentation scenario. Using it in production might be fine for transferring Python objects between your services (with precautions). See https://stackoverflow.com/questions/21752259/python-why-pickle for more insights on that matter. – michaeldel May 24 '20 at 06:38