20

I am getting the error TypeError: pic should be PIL Image or ndarray. Got <class 'numpy.ndarray'> when I try to load a non-image dataset through the DataLoader. The versions of torch and torchvision are 1.0.1, and 0.2.2.post3, respectively. Python's version is 3.7.1 on a Windows 10 machine.

Here is the code:

class AndroDataset(Dataset):
    def __init__(self, csv_path):
        self.transform = transforms.Compose([transforms.ToTensor()])

        csv_data = pd.read_csv(csv_path)

        self.csv_path = csv_path
        self.features = []
        self.classes = []

        self.features.append(csv_data.iloc[:, :-1].values)
        self.classes.append(csv_data.iloc[:, -1].values)

    def __getitem__(self, index):
        # the error occurs here
        return self.transform(self.features[index]), self.transform(self.classes[index]) 

    def __len__(self):
        return len(self.features)

And I set the loader:

training_data = AndroDataset('android.csv')
train_loader = DataLoader(dataset=training_data, batch_size=batch_size, shuffle=True)

Here is the full error stack trace:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1758, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1752, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1147, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2018.1.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/talha/Documents/PyCharmProjects/DeepAndroid/deep_test_conv1d.py", line 231, in <module>
    main()
  File "C:/Users/talha/Documents/PyCharmProjects/DeepAndroid/deep_test_conv1d.py", line 149, in main
    for i, (images, labels) in enumerate(train_loader):
  File "C:\Users\talha\Documents\PyCharmProjects\DeepAndroid\venv\lib\site-packages\torch\utils\data\dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "C:\Users\talha\Documents\PyCharmProjects\DeepAndroid\venv\lib\site-packages\torch\utils\data\dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "C:/Users/talha/Documents/PyCharmProjects/DeepAndroid/deep_test_conv1d.py", line 102, in __getitem__
    return self.transform(self.features[index]), self.transform(self.classes[index])
  File "C:\Users\talha\Documents\PyCharmProjects\DeepAndroid\venv\lib\site-packages\torchvision\transforms\transforms.py", line 60, in __call__
    img = t(img)
  File "C:\Users\talha\Documents\PyCharmProjects\DeepAndroid\venv\lib\site-packages\torchvision\transforms\transforms.py", line 91, in __call__
    return F.to_tensor(pic)
  File "C:\Users\talha\Documents\PyCharmProjects\DeepAndroid\venv\lib\site-packages\torchvision\transforms\functional.py", line 50, in to_tensor
    raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
TypeError: pic should be PIL Image or ndarray. Got <class 'numpy.ndarray'>
talha06
  • 6,206
  • 21
  • 92
  • 147

4 Answers4

15

This happens because of the transformation you use:

self.transform = transforms.Compose([transforms.ToTensor()])

As you can see in the documentation, torchvision.transforms.ToTensor converts a PIL Image or numpy.ndarray to tensor. So if you want to use this transformation, your data has to be of one of the above types.

Miriam Farber
  • 18,986
  • 14
  • 61
  • 76
  • I see as the error description confirms what you say. But my data is already an instance of `ndarray` as the error description says as well. – talha06 Jun 24 '19 at 20:07
  • @talha06 what is the shape and type of your data? It should be three dimansional, and based on the documentation, numpy.ndarray should have dtype = np.uint8 – Miriam Farber Jun 24 '19 at 20:18
  • It is two dimensional and its shape is `(5289, 38)`. The `dtype` is `int64` as I have not explicitly cast it to be. Just read the `csv` data into a `dataframe`, then took its value as `ndarray`. – talha06 Jun 24 '19 at 21:13
  • Ok so looks like you'll need to change the type and the shape to meet the requirements from the documentation (so add a channel dimension, can be just one channel based on your shape) – Miriam Farber Jun 24 '19 at 21:31
10

Expanding on @MiriamFarber's answer, you cannot use transforms.ToTensor() on numpy.ndarray objects. You can convert numpy arrays to torch tensors using torch.from_numpy() and then cast your tensor to the required datatype.


Eg:

>>> import numpy as np
>>> import torch
>>> np_arr = np.ones((5289, 38))
>>> torch_tensor = torch.from_numpy(np_arr).long()
>>> type(np_arr)
<class 'numpy.ndarray'>
>>> type(torch_tensor)
<class 'torch.Tensor'>
Vishnu Dasu
  • 533
  • 5
  • 18
10
tf=transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((512,640)),
    transforms.ToTensor()
])

it works for me.

Suraj Rao
  • 29,388
  • 11
  • 94
  • 103
Felix
  • 101
  • 1
  • 3
8

If you want to use torchvision.transforms on a numpy array, first convert the numpy array to a PIL Image object using transforms.ToPILImage()

Yash Bhalgat
  • 151
  • 1
  • 3