9

I am working on a very sparse dataset with the goal of predicting 6 classes. I have tried a lot of models and architectures, but the problem remains the same.

When I start training, the training accuracy slowly increases and the training loss decreases, whereas the validation metrics do the exact opposite.

I have really tried to deal with overfitting, and I still cannot believe that this is what is causing the issue.

What I have tried

Transfer learning on VGG16:

  • exclude the top layer and add a dense layer with 256 units and a 6-unit softmax output layer
  • fine-tune the top CNN block
  • fine-tune the top 3-4 CNN blocks

To deal with overfitting I use heavy augmentation in Keras and dropout with p=0.5 after the 256-unit dense layer; a sketch of this setup is shown below.
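
Roughly, the VGG16 setup looks like this (a simplified sketch; the frozen layers, optimizer and learning rate shown here are illustrative choices rather than my exact settings):

    from keras.applications import VGG16
    from keras.models import Model
    from keras.layers import Dense, Dropout, Flatten
    from keras import optimizers

    # VGG16 base without its classifier head
    base = VGG16(weights='imagenet', include_top=False,
                 input_shape=(img_width, img_height, 3))

    # freeze everything except the top convolutional block (block5)
    for layer in base.layers:
        layer.trainable = layer.name.startswith('block5')

    x = Flatten()(base.output)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)  # dropout after the 256-unit dense layer
    predictions = Dense(6, activation='softmax')(x)

    model = Model(inputs=base.input, outputs=predictions)
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
                  metrics=['accuracy'])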

Creating own CNN with VGG16-ish architecture:

  • including batch normalization wherever possible
  • L2 regularization on each CNN+dense layer
  • Dropout from anywhere between 0.5-0.8 after each CNN+dense+pooling layer
  • Heavy data augmentation on the fly in Keras

Realising that perhaps I have too many free parameters:

  • decreasing the network to only contain 2 CNN blocks + dense + output.
  • dealing with overfitting in the same manner as above.

Without exception, all training sessions look like this: Training & Validation loss+accuracy

The last mentioned architecture looks like this:

    from keras.models import Sequential
    from keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                              Activation, Dropout, Flatten, Dense)
    from keras import regularizers

    reg = 0.0001  # L2 regularization factor

    model = Sequential()

    # first CNN block
    model.add(Conv2D(8, (3, 3), input_shape=input_shape, padding='same',
                     kernel_regularizer=regularizers.l2(reg)))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.7))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))

    # second CNN block
    model.add(Conv2D(16, (3, 3), padding='same',
                     kernel_regularizer=regularizers.l2(reg)))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.7))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))

    # dense classifier head
    model.add(Flatten())
    model.add(Dense(16, kernel_regularizer=regularizers.l2(reg)))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.5))

    # 6-class softmax output
    model.add(Dense(6))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])

The data is augmented on the fly by Keras' ImageDataGenerator and loaded with flow_from_directory:

    from keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(rotation_range=10,
                                       width_shift_range=0.05,
                                       height_shift_range=0.05,
                                       shear_range=0.05,
                                       zoom_range=0.05,
                                       rescale=1/255.,
                                       fill_mode='nearest',
                                       channel_shift_range=0.2*255)
    train_generator = train_datagen.flow_from_directory(
                train_data_dir,
                target_size=(img_width, img_height),
                batch_size=batch_size,
                shuffle=True,
                class_mode='categorical')

    # validation data is only rescaled, never augmented
    validation_datagen = ImageDataGenerator(rescale=1/255.)
    validation_generator = validation_datagen.flow_from_directory(
                validation_data_dir,
                target_size=(img_width, img_height),
                batch_size=1,
                shuffle=True,
                class_mode='categorical')
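
The model is then fitted on these generators, roughly like this (the epoch count here is a placeholder, not my actual value):

    history = model.fit_generator(
            train_generator,
            steps_per_epoch=train_generator.samples // batch_size,
            epochs=100,                                     # placeholder value
            validation_data=validation_generator,
            validation_steps=validation_generator.samples)  # val batch_size is 1
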
Jesper
  • Can you show the outputs of your metrics when fitting your model? So we can see the behavior you describe. – DarkCygnus Nov 13 '17 at 21:05
  • @DarkCygnus Should be an image available here: https://i.stack.imgur.com/Vnwhi.png (also present in the post) – Jesper Nov 13 '17 at 21:15
  • I see, working on an answer. What is your input shape? (Your picture size) – DarkCygnus Nov 13 '17 at 21:19
  • Can you show the definition of the validation dataset? – Alex Ott Nov 13 '17 at 21:21
  • @DarkCygnus the input shape at this moment is (512,512,3). However, it has been almost anything from 128 to 512 when training previous models. – Jesper Nov 13 '17 at 21:25
  • @AlexOtt this is an example of a training image: https://imgur.com/a/1hxQd Here it has been cropped to fit only the wrenches. The purpose is then to train a classifier to tell at which position the wrench of size 19 is. On this image it should then output (0,0,1,0,0,0), since wrench 19 is at position 3. Here: https://imgur.com/a/om0m4 is an example of the "raw" image (not cropped to fit only the wrenches) that would be fed into the model in the end (used for validation). – Jesper Nov 13 '17 at 21:27
  • No, I mean do you use ImageDataGenerator + flow_from_directory? If yes, show code for ImageDataGenerator? – Alex Ott Nov 13 '17 at 21:28
  • @AlexOtt look at very end of my post :-) – Jesper Nov 13 '17 at 21:30
  • @Jesper for validation data you must not use data augmentation! There is a separate note about it in the "DL in Python" book by fchollet... Just define `valid_datagen = ImageDataGenerator(rescale=1./255)` – Alex Ott Nov 13 '17 at 21:33
  • @AlexOtt it is. :-) – Jesper Nov 13 '17 at 21:34
  • Sorry, that wasn't shown - only train definition – Alex Ott Nov 13 '17 at 21:35

4 Answers

5

What I can think of by analyzing your metric outputs (from the link you provided):

[Training and validation loss/accuracy plot]

It seems to me that around epoch 30 your model starts to overfit. Therefore you can try stopping your training at that iteration, or simply train it for ~30 epochs (or the exact number). The Keras Callbacks may be useful here, especially ModelCheckpoint, which enables you to stop your training whenever desired (Ctrl+C) or when a certain criterion is met. Here is an example of basic ModelCheckpoint use:

    from keras.callbacks import ModelCheckpoint

    # save_best_only=True saves the weights only when the monitored metric improves
    chk = ModelCheckpoint("myModel.h5", monitor='val_loss', save_best_only=True)
    callbacks_list = [chk]
    # pass the callbacks on fit
    history = model.fit(X, Y, ... , callbacks=callbacks_list)

(Edit:) As suggested in the comments, another option you have available is the EarlyStopping callback, where you can specify the minimum change tolerated and the 'patience', i.e. the number of epochs without such improvement before stopping the training. If using this, you have to pass it to the callbacks argument as explained before.
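
A minimal sketch of that option (the `min_delta` and `patience` values here are just illustrative):

    from keras.callbacks import EarlyStopping

    # stop when val_loss has not improved by at least min_delta for `patience` epochs
    early_stop = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=5)
    callbacks_list = [chk, early_stop]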

With the setup your model currently has (and with the modifications you have tried), that point seems to be the optimal training time for your case; training further will bring no benefit to your model (in fact, it will make it generalize worse).

Given that you have tried several modifications, one thing you can do is try increasing your network depth, to give it more capacity. Try adding more layers, one at a time, and check for improvements. Also, you usually want to start with simpler models first, before attempting a multi-layer solution.

If a simple model doesn't work, add one layer and test again, repeating until you are satisfied or run out of options. And by simple I mean really simple: have you tried a non-convolutional approach? Although CNNs are great for images, maybe you are overkilling it here.
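
For instance, a really simple non-convolutional baseline could look something like this (the hidden layer size is arbitrary, and `input_shape` refers to your existing image shape):

    from keras.models import Sequential
    from keras.layers import Flatten, Dense

    baseline = Sequential()
    baseline.add(Flatten(input_shape=input_shape))  # flatten the raw pixels
    baseline.add(Dense(64, activation='relu'))
    baseline.add(Dense(6, activation='softmax'))
    baseline.compile(loss='categorical_crossentropy',
                     optimizer='sgd', metrics=['accuracy'])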

If nothing seems to work, maybe it is time to get more data, or to generate more data from what you have by sampling or other techniques. For that last suggestion, try checking this Keras blog post, which I have found really useful. Deep learning algorithms usually require a substantial amount of training data, especially for complex inputs like images, so be aware this may not be an easy task. Hope this helps.

DarkCygnus
  • @AlexOtt thanks for the suggestion, editing the answer to include such option :) – DarkCygnus Nov 13 '17 at 21:56
  • I'll mark this as answered – thank you both, you and @AlexOtt, for your good advice. I have tried what you suggested and the trend does not change: the train loss decreases and the val loss increases. I get a max accuracy on the val set of around 45%. – Jesper Nov 15 '17 at 21:20
  • @Jesper Did you try all that I suggested (more data, depth, ...)? Another thing that could be acting strangely is your data augmentation. It could be saturating your performance up to a point where augmentation brings no further benefit (what if you do it without augmentation? That's what I meant when suggesting to get more data: organic samples, not artificial ones). You can ping me for any further discussion if you like. Cheers – DarkCygnus Nov 15 '17 at 21:25
  • Yes, I did try to vary the network size, both very simple and deeper models. More details about the project follow: for starters, the training data is images like this one, obtained in different scenery with different lighting conditions, etc.: https://imgur.com/mmlNqEi After training here, an attention map revealed that almost all attention was given to the background. To overcome this, all training images were cropped to fit only the wrenches, like this: https://imgur.com/upp51pA Now it's better, but the attention map still reveals some problems w.r.t. focusing on the wrenches themselves – Jesper Nov 15 '17 at 21:30
  • The idea is to predict the location of the wrench of size 19, i.e. when it is located at the second position from the left, the model should output [0,1,0,0,0,0]. – Jesper Nov 15 '17 at 21:34
  • @Jesper maybe you should try playing with `adam` and/or `sgd` with `lr` > 0.01 (which is the default). Or if you did already, how was the performance? – bit_scientist Jan 06 '20 at 07:43
2

IMHO, this is just a normal situation for DL. In Keras you can set up a callback that will save the best model (depending on the evaluation metric that you provide), and a callback that will stop training if the model isn't improving.

See the ModelCheckpoint & EarlyStopping callbacks, respectively.
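
For example (the filename and patience value here are placeholders), both are passed to the `callbacks` argument of `fit`/`fit_generator`:

    from keras.callbacks import ModelCheckpoint, EarlyStopping

    callbacks = [
        # keep only the weights that achieved the best validation loss so far
        ModelCheckpoint('best_model.h5', monitor='val_loss', save_best_only=True),
        # stop once validation loss has not improved for 5 epochs
        EarlyStopping(monitor='val_loss', patience=5),
    ]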

P.S. Sorry, maybe I misunderstood the question: is your validation loss decreasing from the first step?

Alex Ott
  • As shown in the plot (the link should be available in the post), the loss slightly decreases at the very beginning and then starts increasing. The accuracy on the validation set does not change much overall. Using the weights from the first few epochs would not make much sense here, as the network would not have learned enough. – Jesper Nov 13 '17 at 21:17
2

Validation loss is increasing. This means you need more data or more regularization. This is a standard situation, and nothing to be worried about. By the way, more parameters (a bigger model) will just worsen this problem unless you fix it.

So you can now profitably investigate introducing more examples, L2 or L1 regularization, or dropout.
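
In Keras, for example, each of those is specified per layer; a minimal sketch (the layer sizes, input shape and rates below are arbitrary starting points):

    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras import regularizers

    net = Sequential([
        # L2 weight penalty on the hidden layer (use regularizers.l1(1e-4) for L1)
        Dense(16, activation='relu', input_shape=(64,),
              kernel_regularizer=regularizers.l2(1e-4)),
        # dropout between the hidden and output layers
        Dropout(0.5),
        Dense(6, activation='softmax'),
    ])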

Geoffrey Anderson
1

I faced a similar problem and managed to fix it by removing the BatchNormalization layer that is just before the output dense layer. This made a ton of difference. Also, one of the suggestions I was given was to remove the Dropout layer, as it might be causing a variance shift. Check this paper.

I got part of the solution from this thread.
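
Roughly, the change amounts to a classifier head like this (a sketch with an illustrative input shape, not the exact model from the question):

    from keras.models import Sequential
    from keras.layers import Flatten, Dense

    head = Sequential()
    head.add(Flatten(input_shape=(8, 8, 16)))  # illustrative feature-map shape
    head.add(Dense(16, activation='relu'))
    # no BatchNormalization (and no Dropout) directly before the output layer
    head.add(Dense(6, activation='softmax'))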

Mahmoud Gabr