
I am new to PyTorch and not very experienced with CNNs. I have built a successful classifier by following the PyTorch tutorial, but I don't really understand what I am doing when loading the data.

The tutorial does some data augmentation and normalization for training, but when I try to modify the parameters, the code stops working.

# Data augmentation and normalization for training
# Just normalization for validation
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Am I extending my training dataset? I don't see where the data augmentation happens.

Why does the data loading stop working if I modify the value of transforms.RandomResizedCrop(224)?

Do I need to transform the test dataset as well?

I am a bit confused by this data transformation that they do.

carioka88
  • You did not include the error you are getting. I suspect that if you change the output size of `RandomResizedCrop`, your model will crash when flattening the features between the convolutional and the fully-connected part. – Manuel Lagunas Apr 24 '18 at 15:15
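The crash the comment describes can be reproduced with a minimal sketch (this tiny model is hypothetical, not the one from the tutorial): the first `Linear` layer's input size is hard-coded against the spatial size the crop produces, so feeding a different size breaks the matrix multiplication.

```python
import torch
import torch.nn as nn

# Hypothetical toy model whose Linear layer assumes a 224x224 input.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),  # 224 -> 112 spatially
    nn.Flatten(),
    nn.Linear(8 * 112 * 112, 10),  # hard-coded to the 224-input feature size
)

out = model(torch.randn(1, 3, 224, 224))  # works: flattened size matches
print(out.shape)  # torch.Size([1, 10])

try:
    # Simulates changing RandomResizedCrop(224) to RandomResizedCrop(256):
    model(torch.randn(1, 3, 256, 256))
except RuntimeError as e:
    print("shape mismatch:", e)
```

This is why changing the crop size alone fails: the fully-connected layer would also need to be resized to match the new flattened feature count.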

2 Answers


transforms.Compose just chains together all the transforms provided to it. So, all the transforms in the transforms.Compose are applied to the input one by one, in order.

Train transforms

  1. transforms.RandomResizedCrop(224): This extracts a patch of size (224, 224) from your input image at a random location and scale, so it might come from the top left, the bottom right, or anywhere in between. This is where part of the data augmentation happens. Also, changing this value won't play nicely with the fully-connected layers in your model, so changing it is not advised.
  2. transforms.RandomHorizontalFlip(): Once we have our image of size (224, 224), we can choose to flip it horizontally (with probability 0.5 by default). This is another part of the data augmentation.
  3. transforms.ToTensor(): This just converts your input image to a PyTorch tensor (scaling pixel values to the range [0, 1]).
  4. transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]): This is just input data scaling with a precomputed per-channel mean and std; these particular values are the ImageNet statistics, commonly used with models pretrained on ImageNet. Changing these values is also not advised.

Validation transforms

  1. transforms.Resize(256): First your input image is resized so that its smaller edge is 256 pixels, preserving the aspect ratio (passing a single int resizes the smaller edge, not both dimensions).
  2. transforms.CenterCrop(224): Crops the center part of the image to shape (224, 224).

The rest are the same as in train.

P.S.: You can read more about these transformations in the official docs

layog
  • Could you please elaborate on why we don't resize directly to (224, 224), but instead resize to 256 and then center crop? – Jjang May 06 '20 at 13:56
  • I am not quite sure why it is being done this way in this example. This can be due to particularities of the data. From the network's perspective, it expects input of size `(224, 224)`, no matter how the input is transformed. – layog May 07 '20 at 15:41
  • So are all the transformations in the compose applied to every sample during one iteration (or one batch)? Or is it that only one (or a few) of the transformations are applied during one iteration? – Ajinkya Ambatwar May 10 '21 at 11:02
  • All the transformations in compose are applied to all the inputs – layog May 10 '21 at 11:57

For ambiguities about data augmentation, I would refer you to this answer:

Data Augmentation in PyTorch

But in short: assume you only have a random horizontal flip transform. When you iterate through a dataset of images, some are returned as the original and some are returned flipped (the originals of the flipped ones are not also returned). In other words, the number of images returned in one iteration is the same as the original size of the dataset; the dataset is not enlarged.

Ashkan372