
I tried to follow the CNN architecture from the paper ImageNet Classification with Deep Convolutional Neural Networks (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). In that paper they classify 1000 classes, whereas I am only trying to classify 2 classes.

But my test accuracy is stuck at 50%, and the model is not learning.

I am training with 23K images of cats and dogs, and testing with 2,500 images.

Here is the URL to my notebook: https://github.com/jinglescode/workspace/blob/master/my-journey-computer-vision/codes/Cats_and_Dogs.ipynb

Could anyone advise what's wrong? What have I missed? Willing to learn.

Jingles
  • Normalisation will help a lot to get a better result. I think you should also reconsider your data augmentation: I'm not sure doing horizontal flips or 30° rotations helps. – akhetos Aug 27 '19 at 08:43
  • @akhetos thanks for your input; Shai and you both mentioned normalisation, and I did it. See my updated notebook [https://github.com/jinglescode/workspace/blob/master/my-journey-computer-vision/codes/Cats_and_Dogs.ipynb]. I have also removed the augmentation to keep things simpler. Any more ideas? – Jingles Aug 27 '19 at 10:39
  • I don't know PyTorch syntax and have no time to dig into it, but the first thing is to check your model's shape. In Keras this can be done with `model.summary()`. If you still have 50% accuracy after normalisation, the main problem is probably coming from your model architecture, not the learning rate. – akhetos Aug 27 '19 at 10:51

1 Answer


Normalize your data!!

For image data, you can apply the recommended transforms to your train and test datasets:

import torchvision.transforms as transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       normalize])

test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      normalize])
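The mean/std values above are the standard ImageNet channel statistics; `Normalize` subtracts the mean and divides by the std per channel, so pixels near the dataset mean map to roughly zero. A minimal plain-Python sketch of the per-pixel arithmetic (the transform applies this channel-wise to the whole tensor):

```python
# Per-channel normalization as done by transforms.Normalize:
# out = (value - mean) / std
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

def normalize_pixel(value, channel):
    """Normalize one pixel value (already scaled to [0, 1] by ToTensor)."""
    return (value - imagenet_mean[channel]) / imagenet_std[channel]

# A pixel exactly at the red-channel mean maps to 0.
print(normalize_pixel(0.485, 0))                 # 0.0
# A fully saturated red pixel maps to about 2.25 standard deviations.
print(round(normalize_pixel(1.0, 0), 2))         # 2.25
```

Without this step the raw [0, 1] inputs are all positive with a shifted mean, which can make early training much harder.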

Additional comments:

  1. If you want to "peek" into a dataloader, calling next(iter(dataloader)) is not a good idea. Instead, access the dataset stored inside the dataloader and use its __getitem__:

     images, labels = dataloader.dataset[0]
    
  2. If your training is "stuck", the usual first reaction is to change the learning rate.
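The AlexNet paper's schedule (divide the learning rate by 10 whenever validation performance stops improving) can be sketched framework-free; PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` implements the same idea. Below is a minimal plain-Python version, where `patience` is a hypothetical knob for how many stagnant epochs to tolerate:

```python
class PlateauLR:
    """Divide the learning rate by `factor` when the monitored
    validation loss has not improved for `patience` epochs."""

    def __init__(self, lr=0.01, factor=10.0, patience=2):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")   # best (lowest) validation loss so far
        self.stagnant = 0          # epochs since last improvement

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.stagnant = 0
        else:
            self.stagnant += 1
            if self.stagnant >= self.patience:
                self.lr /= self.factor
                self.stagnant = 0
        return self.lr

sched = PlateauLR(lr=0.01, patience=2)
# Validation loss improves, then plateaus for two epochs -> LR is cut by 10x.
for loss in [0.70, 0.65, 0.66, 0.66]:
    lr = sched.step(loss)
print(lr)   # 0.001
```

Also note that with Adam, a starting learning rate of 0.01 is commonly considered high; 1e-3 or lower is a more usual default.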

Shai
  • thanks @Shai, I have normalised the data as you said, and used `for images, labels in trainloader` to iterate through the dataset. Updated notebook: [https://github.com/jinglescode/workspace/blob/master/my-journey-computer-vision/codes/Cats_and_Dogs.ipynb]. In the paper, they mention that the learning rate was divided by 10 when the validation error stopped improving, whereas I'm using Adam (starting from 0.01). Any thoughts on this? – Jingles Aug 27 '19 at 10:43