102

I have managed to load the images in a folder using load_sample_images() (imported from sklearn_theano.datasets in the code below).

I would now like to convert the result to a numpy.ndarray with float32 datatype.

I was able to convert it to an np.ndarray using np.array(X). However, np.array(X, dtype=np.float32) and np.asarray(X).astype('float32') both give me the error:

ValueError: setting an array element with a sequence.

Is there a way to work around this?

from sklearn_theano.datasets import load_sample_images
import numpy as np  

kinect_images = load_sample_images()
X = kinect_images.images

X_new = np.array(X)  # works
X_new = np.array(X[1], dtype=np.float32)  # works

X_new = np.array(X, dtype=np.float32)  # does not work
cottontail
Priya Narayanan

2 Answers

141

If you have a list of lists, you only need to use:

import numpy as np
...
npa = np.asarray(someListOfLists, dtype=np.float32)

per this LINK in the scipy / numpy documentation. You just need to specify dtype inside the call to asarray.
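As a minimal sketch of when this works and when it does not: the example below uses made-up nested lists and synthetic arrays (not the actual sample images), and it assumes the error in the question comes from items of differing shapes, which the question does not confirm. When the sublists match in shape, passing dtype is enough; when they don't, NumPy raises a ValueError, and one workaround is to convert each item separately.

```python
import numpy as np

# Sublists of equal length convert cleanly to a 2-D float32 array.
same_shape = [[1, 2, 3], [4, 5, 6]]
arr = np.asarray(same_shape, dtype=np.float32)
print(arr.shape, arr.dtype)

# Sublists of different lengths cannot form a rectangular float array.
ragged = [[1, 2, 3], [4, 5]]
try:
    np.asarray(ragged, dtype=np.float32)
    ragged_ok = True
except ValueError as exc:
    ragged_ok = False
    print("ValueError:", exc)

# Hypothetical stand-ins for images of different sizes (assumption, not the real data).
images = [np.zeros((4, 5, 3), dtype=np.uint8), np.zeros((2, 6, 3), dtype=np.uint8)]

# Convert each item on its own and keep the results in a plain Python list.
images_f32 = [np.asarray(img, dtype=np.float32) for img in images]
print([im.shape for im in images_f32])
```

Keeping the converted items in a list avoids forcing differently shaped arrays into a single ndarray.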

Thom Ives
1

If you're converting a list into an array, you'll need to make a new copy anyway, so

arr = np.array(my_list, dtype='float32')

also works.

One use case is when my_list is not actually a list but some other list-like object, in which case explicitly casting it to a list beforehand might help.

arr = np.array(list(my_list), dtype='float32')

For example,

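import numpy as np
import pandas as pd

my_list = pd.Series([[1], [2], [3]]).values   # 1-D object array whose elements are lists

np.array(my_list)             # still an object array of lists, shape (3,); not OK
np.array(list(my_list))       # OK, shape (3, 1)
np.array(my_list.tolist())    # OK, shape (3, 1)


my_list = {'a': [1], 'b': [2]}.values()   # dict_values view, not a list
np.array(my_list)             # 0-d object array wrapping the view; not OK
np.array(list(my_list))       # OK, shape (2, 1)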

To cast a nested list into an array, the shapes of the sublists must match. If they don't, you may want to concatenate the sublists along some axis instead. Try np.concatenate, np.r_, np.c_, etc.
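A small sketch of the concatenation route, using made-up data: sublists of different lengths are joined end to end into one flat float32 array, and 2-D blocks with the same number of columns are stacked along axis 0. Whether concatenation is the right operation depends on what the data means, so treat this as one option rather than the fix.

```python
import numpy as np

# Sublists of different lengths: join them into a single 1-D array.
sublists = [[1, 2, 3], [4, 5], [6]]
flat = np.concatenate([np.asarray(s, dtype=np.float32) for s in sublists])
print(flat.shape, flat.dtype)

# 2-D blocks with matching column counts: stack them along axis 0.
a = np.ones((2, 3), dtype=np.float32)
b = np.zeros((1, 3), dtype=np.float32)
stacked = np.concatenate([a, b], axis=0)
print(stacked.shape)
```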

cottontail