
I am trying to train a model for a binary classification problem where the images are one-channel infrared (temperature) images. After converting them to three channels (by replicating the 3rd channel), I tried two CNN architectures, VGG-11 and VGG-16, but did not manage to get stable training: accuracy is low, and after 2-10 epochs (depending on the learning-rate adjustment) the loss freezes at some value.
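For concreteness, the channel conversion mentioned above might look like the following minimal sketch. It assumes PyTorch, assumes the conversion means copying the single infrared channel into all three channels, and uses a random tensor in place of real data; the names `ir_batch` and `rgb_like` are hypothetical.

```python
import torch

# Hypothetical batch of single-channel infrared images: (N, 1, H, W).
ir_batch = torch.rand(4, 1, 340, 340)

# Replicate the single channel three times along the channel dimension.
# expand() returns a view without copying memory; repeat(1, 3, 1, 1)
# would give the same values as an independent copy.
rgb_like = ir_batch.expand(-1, 3, -1, -1)

print(rgb_like.shape)
```

All three channels then carry identical values, so the replication adds no information; it only makes the tensor shape compatible with a network that expects three input channels.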

The standard VGG architecture is used, except for AdaptiveAvgPool2d(), which is used first in order to handle inputs of arbitrary size. The input size of the images is 340x340.
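To illustrate how an adaptive pooling layer lets a VGG-style network accept arbitrary input sizes, here is a heavily reduced sketch. It is not the actual VGG-11 or VGG-16 used here: `TinyVGG`, its layer widths, and the placement of `AdaptiveAvgPool2d` between the convolutional features and the classifier are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TinyVGG(nn.Module):
    """Reduced VGG-style network (hypothetical; far narrower than VGG-11/16)."""

    def __init__(self, num_classes=2):
        super().__init__()
        # Two conv/ReLU/max-pool stages in the VGG style.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # Pools any spatial size down to 7x7, so the classifier
        # always receives a fixed-length vector.
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),  # raw logits, no softmax
        )

    def forward(self, x):
        return self.classifier(self.avgpool(self.features(x)))


model = TinyVGG().eval()
with torch.no_grad():
    out_340 = model(torch.rand(2, 3, 340, 340))
    out_224 = model(torch.rand(2, 3, 224, 224))

print(out_340.shape, out_224.shape)
```

Both input sizes produce an output of shape `(batch, 2)`, because the adaptive pooling fixes the feature-map size before the fully connected layers.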

CrossEntropyLoss() is used with labels [0,1] and the output of the aforementioned network. Also, the model is trained from scratch (because of the nature of the data).
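As a point of reference, this is a minimal sketch of how `CrossEntropyLoss` is typically called for two classes in PyTorch. The tensors `logits` and `labels` are hypothetical stand-ins for the network output and the targets, not values from the actual training run.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.CrossEntropyLoss()

# Hypothetical raw network outputs (logits) for 4 images and 2 classes.
logits = torch.randn(4, 2)
# Class-index targets: an integer (long) tensor of shape (N,), values 0 or 1.
labels = torch.tensor([0, 1, 1, 0])

loss = criterion(logits, labels)

# CrossEntropyLoss applies log-softmax internally, so it matches
# NLL loss computed on explicit log-probabilities.
reference = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(loss.item(), reference.item())
```

Because the log-softmax is applied inside the loss, the network should output raw logits; adding a softmax layer before `CrossEntropyLoss` applies the normalization twice.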
Any ideas for improving my architecture to fit the needs of my problem? I haven't found many works on infrared-image classification, so any help would be highly appreciated.

Shai
singa1994
  • "I try to train a model for a binary classification problem with the images being infrared (temperatures) with one-channel. After converting them to three channels (by replicate the 3rd channel)" - what do you mean? You wrote that you have one channel image, replicate the 3rd channel in this one channel image, and get the three channel image. – user31264 Jan 13 '20 at 09:31
  • Why are you training from scratch? I don't see why this might be a good idea. Even if the images you have are too different from ImageNet, using pretrained weights will in general be better than random initialization. What exactly is the task you are doing, and are you sure that there is a relation between the images and the labels? – hola Jan 13 '20 at 09:38
  • you can check this : https://stackoverflow.com/questions/55894132/how-to-correct-unstable-loss-and-accuracy-during-training-binary-classificatio – Mehdi Hamzeloee Jan 13 '20 at 13:17

0 Answers