I'm using the Omniglot dataset, which is a set of 19,280 grayscale images, each 105 x 105.
I defined a custom Dataset class, along with the following transform:
import torch
from torch.utils.data import Dataset
from torchvision import transforms

class OmniglotDataset(Dataset):
    def __init__(self, X, transform=None):
        self.X = X                      # raw images, shape (N, 105, 105)
        self.transform = transform

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        img = self.X[idx]               # look up the raw image(s)
        if self.transform:
            img = self.transform(img)
        return img
img_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
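As a sanity check, the transform behaves as I'd expect on a single raw image. Here's a quick standalone sketch; it assumes X_train holds uint8 NumPy data (the fake sample below just mirrors the shape of one entry):

import numpy as np
from torchvision import transforms

img_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# fake stand-in for a single X_train entry (assumption: uint8 NumPy array)
sample = np.random.randint(0, 256, size=(105, 105), dtype=np.uint8)

out = img_transform(sample)
print(out.shape)              # torch.Size([1, 105, 105]) -- channel dim added by ToTensor
print(out.min(), out.max())   # values roughly in [-1, 1] after Normalize((0.5,), (0.5,))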
X_train.shape
(19280, 105, 105)
train_dataset = OmniglotDataset(X_train, transform=img_transform)
When I index a single image, it returns the right dimensions:
train_dataset[0].shape
torch.Size([1, 105, 105])
But when I index several images with a list of indices, the dimensions come back in the wrong order (I expect 3 x 105 x 105):
train_dataset[[1,2,3]].shape
torch.Size([105, 3, 105])
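To isolate where the reordering happens, here's a minimal reproduction outside the Dataset class (again assuming X_train is a uint8 NumPy array; the zeros array below is just fake data with the same shape as X_train[[1, 2, 3]]):

import numpy as np
from torchvision import transforms

to_tensor = transforms.ToTensor()

# fake stand-in for X_train[[1, 2, 3]]: three 105 x 105 grayscale images
batch = np.zeros((3, 105, 105), dtype=np.uint8)

print(to_tensor(batch).shape)       # torch.Size([105, 3, 105]) -- same unexpected order as above
print(to_tensor(batch[0]).shape)    # torch.Size([1, 105, 105]) -- a single image is fine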