
I'm following this tutorial on towardsdatascience.com because I wanted to try the MNIST dataset using PyTorch, since I've already done it using Keras.

So in Step 2, "Knowing the dataset better", they print the trainloader's shape and it returns torch.Size([64, 1, 28, 28]). I understand that 64 is the number of images in that loader and that each one is a 28x28 image, but what does the 1 mean exactly?

– tildawn

4 Answers

4

It simply means that an image of size 28x28 has 1 channel, i.e. it's a grayscale image. If it were a color image, there would be a 3 instead of a 1, since a color image has 3 channels (RGB).
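To make the channel count concrete, here is a small sketch using randomly generated stand-in tensors (the names `gray_batch` and `rgb_batch` are illustrative, not from the tutorial):

```python
import torch

# A hypothetical batch shaped like the one in the question:
# (batch, channels, height, width) -- channels=1 for grayscale
gray_batch = torch.randn(64, 1, 28, 28)

# The same idea for a color image batch, which carries 3 channels (RGB)
rgb_batch = torch.randn(64, 3, 28, 28)

print(gray_batch.shape)  # torch.Size([64, 1, 28, 28])
print(rgb_batch.shape)   # torch.Size([64, 3, 28, 28])
```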

– AMI
2

It's the number of channels in the input. In the MNIST dataset the images are grayscale, so each image has a single channel; in PyTorch the shape of one image is [1, 28, 28]. Notice that PyTorch puts the channel dimension first (channels-first, NCHW), unlike Keras/TensorFlow, which default to channels-last.

Of course, once loaded as batches, the total input shape is the one you are getting: [64, 1, 28, 28].

Refer to the MNIST dataset link, where it states:

The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
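The channels-first vs. channels-last distinction mentioned above can be sketched with `permute` (the tensor here is a random stand-in for a real MNIST image):

```python
import torch

# PyTorch stores a single image channels-first: (C, H, W)
img_chw = torch.randn(1, 28, 28)

# Converting to channels-last (H, W, C), the layout Keras/TensorFlow
# default to and the one matplotlib expects for multi-channel images
img_hwc = img_chw.permute(1, 2, 0)

print(img_chw.shape)  # torch.Size([1, 28, 28])
print(img_hwc.shape)  # torch.Size([28, 28, 1])
```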

– David
  • Sorry, I didn't really understand what you meant, because I tried displaying an image using `plt.imshow(images[0].numpy().squeeze())` and it showed a picture in colors – tildawn Mar 31 '21 at 10:35
  • `matplotlib.pyplot.imshow()` interprets a 2-D array as a heatmap, which is why you get colors. When you do `images[0].numpy().squeeze()` you "remove" the channel dimension, so you get a 2-D matrix of values, which imshow renders with its default colormap, like a heatmap – David Mar 31 '21 at 10:36
  • @tildawn you can also refer to https://stackoverflow.com/questions/41265576/plt-imshow-shows-color-images-for-grayscale-images-in-ipython – David Mar 31 '21 at 10:46
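The comment exchange above can be illustrated with a small sketch; the array here is random noise standing in for an actual MNIST image, and the matplotlib call is shown only as a comment:

```python
import numpy as np

# A hypothetical single image from the loader, shaped (1, 28, 28)
img = np.random.rand(1, 28, 28).astype(np.float32)

# squeeze() drops the size-1 channel axis, leaving a plain 2-D array --
# which is exactly what plt.imshow() receives in the comment above
img_2d = img.squeeze()
print(img_2d.shape)  # (28, 28)

# With a 2-D array, imshow applies its default colormap (viridis), so the
# digit appears colored; to force grayscale rendering you would write:
#   plt.imshow(img_2d, cmap='gray')
```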
2

In short, it's just the number of channels your 28x28 image has.

– Prajot Kuvalekar
  • Let's make it simpler: `torch.Size([64, 1, 28, 28])` → 64 = batch size; 1 = color channel (since images in the MNIST dataset are grayscale, there's just one channel); 28 = rows; 28 = columns. Hope this helps. – ZKS Dec 24 '21 at 14:43
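The dimension-by-dimension breakdown in the comment above can be unpacked directly in code (the `batch` tensor is a random stand-in for `images`):

```python
import torch

# Hypothetical stand-in for one batch from the MNIST trainloader
batch = torch.randn(64, 1, 28, 28)

# Unpack the four dimensions named in the comment above
n, c, h, w = batch.shape
print(n, c, h, w)  # 64 1 28 28  (batch size, channels, rows, columns)
```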
-2

This would suggest the number of batches present in the dataset. Think of it as groups: here we have 1 batch of 64 images, but you could change that and have, say, 2 batches of 32 images each. The batch size can influence the computational cost of training the model. And, depending on the library used (especially in the training/testing loop), the code looks slightly different for 1 batch versus X batches.

For example (with number of epochs = 50): imagine you are training on a dataset with batch size = 1; in the training loop you would just train the model once per epoch. However, for batch size = x, you would have to loop over each batch/group within each epoch as well.
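The nested epoch/batch loop described above can be sketched as follows; the dataset is a small random stand-in for MNIST, and all names here are illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset standing in for MNIST: 128 grayscale 28x28 images
data = TensorDataset(torch.randn(128, 1, 28, 28),
                     torch.randint(0, 10, (128,)))
loader = DataLoader(data, batch_size=64, shuffle=True)

epochs = 2  # the answer above uses 50; kept small for illustration
for epoch in range(epochs):
    # Inner loop: one iteration per batch within each epoch
    for images, labels in loader:
        pass  # forward/backward pass would go here

print(len(loader))  # batches per epoch: 128 / 64 = 2
```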