Possible reasons for overfitting the dataset

Question

The dataset I used contains 33k images. The training set contains 27k images and the validation set contains 6k images.
I used the following CNN code for the model:

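# Keras 1.x imports; row, col, ch are the image dimensions, defined elsewhere
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Activation, Dense
from keras.optimizers import Adam

model = Sequential()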

model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same", input_shape=(row, col, ch)))
model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, activation='relu', border_mode="same"))
model.add(Convolution2D(128, 3, 3, activation='relu', border_mode="same"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Activation('relu'))
model.add(Dense(1024))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(1))
adam = Adam(lr=0.0001)
model.compile(optimizer=adam, loss="mse", metrics=["mae"])

The output shows a decreasing training loss but an increasing validation loss, which suggests overfitting. But I have included dropout layers, which should have helped prevent overfitting. The following is a snippet of the output when training for 10 epochs:

Epoch 1/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0629 - mean_absolute_error: 0.1428 Epoch 00000: val_loss improved from inf to 0.07595, saving model to dataset/-00-val_loss_with_mymodel-0.08.hdf5
27040/27040 [==============================] - 4666s - loss: 0.0629 - mean_absolute_error: 0.1428 - val_loss: 0.0759 - val_mean_absolute_error: 0.1925
Epoch 2/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0495 - mean_absolute_error: 0.1287 Epoch 00001: val_loss did not improve
27040/27040 [==============================] - 4605s - loss: 0.0494 - mean_absolute_error: 0.1287 - val_loss: 0.0946 - val_mean_absolute_error: 0.2289
Epoch 3/10
27008/27040 [============================>.] - ETA: 5s - loss: 0.0382 - mean_absolute_error: 0.1119 Epoch 00002: val_loss did not improve
27040/27040 [==============================] - 4610s - loss: 0.0382 - mean_absolute_error: 0.1119 - val_loss: 0.1081 - val_mean_absolute_error: 0.2463

So, what is wrong? Are there any other methods to prevent overfitting?
Does shuffling of data help?

You could try increasing the `dropout` rate and adding `BatchNormalization` after the `dropout` layers. – Marcin Możejko, Mar 29 '17 at 19:05
@MarcinMożejko, I added BatchNormalization before Activation as suggested in this [link](http://stackoverflow.com/questions/34716454/where-do-i-call-the-batchnormalization-function-in-keras). The val_loss got worse: it was 0.06 without BatchNormalization and is 0.09 with it. – SupposeXYZ, Apr 01 '17 at 05:24

score 1 · Answer 1 · answered Mar 28 '17 at 18:21

I would try adding weight decay of 1E-4. This can be done layer by layer, like this: `model.add(Convolution2D(32, 3, 3, activation='relu', border_mode="same", input_shape=(row, col, ch), W_regularizer=l2(1E-4), b_regularizer=l2(1E-4)))`. `l2` can be found in `keras.regularizers` (https://keras.io/regularizers/#example). Weight regularization is very good at combating overfitting.
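To make concrete what an L2 weight penalty adds to the loss, here is a minimal sketch in plain Python rather than Keras code. It assumes the usual definition of the penalty (the coefficient times the sum of squared weights); the names `l2_penalty` and `regularized_loss` and the toy weights are made up for illustration.

```python
def l2_penalty(weights, coeff=1e-4):
    """Return coeff * sum(w**2) over a flat list of weights."""
    return coeff * sum(w * w for w in weights)


def regularized_loss(data_loss, layers, coeff=1e-4):
    """Add the L2 penalty of every layer's weights to the data loss (e.g. MSE)."""
    return data_loss + sum(l2_penalty(w, coeff) for w in layers)


# Toy weights for two hypothetical layers (not taken from the model above).
layers = [[3.0, -4.0], [10.0]]
total = regularized_loss(0.05, layers)
print(total)
```

Because the penalty grows with the square of each weight, the optimizer is pushed toward smaller weights, which is why the coefficient is kept small relative to the data loss.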

However, the overfitting might not only be a result of your model, but also of your data. If the validation data is somehow "harder" than your training data, then it might just be that you cannot fit it as well.
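One rough way to check whether the validation targets differ from the training targets is to compare simple statistics of the two label sets, with and without shuffling before the split. The sketch below is plain Python on made-up regression targets; `shuffled_split` and `summarize` are hypothetical helpers, not part of Keras.

```python
import random
import statistics


def shuffled_split(items, val_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, validation)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * val_fraction)
    return items[n_val:], items[:n_val]


def summarize(labels):
    """Return (mean, population standard deviation) of a list of targets."""
    return statistics.mean(labels), statistics.pstdev(labels)


# Made-up targets in sorted order, so an unshuffled split is skewed.
labels = [i / 100.0 for i in range(100)]

# Unshuffled split: the last 20% of a sorted list holds only the largest targets.
train_raw, val_raw = labels[:80], labels[80:]

# Shuffled split: both parts are drawn from the whole range.
train_shuf, val_shuf = shuffled_split(labels)

print("unshuffled:", summarize(train_raw), summarize(val_raw))
print("shuffled:  ", summarize(train_shuf), summarize(val_shuf))
```

If the two summaries differ a lot, the validation set is not drawn from the same distribution as the training set, and a gap between training and validation loss is expected regardless of regularization.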

answered Mar 28 '17 at 18:21

Thomas Pinetz

My dataset has 33K images, which I shuffled and split into 80% training and 20% validation. Is the `regularizer` added to every layer, or only when compiling the model, i.e. in `model.compile()`? – SupposeXYZ Mar 29 '17 at 13:20
Added for every layer. – Thomas Pinetz Mar 29 '17 at 15:02