As the title describes, my CNN model is still overfitting despite employing Dropout, MaxPooling, EarlyStopping, and L1/L2 regularizers. I have also experimented with various values of learning_rate, dropout_rate, and the L1/L2 regularization weight decay. How can I further prevent overfitting?

Here is the model (using Keras on the TensorFlow backend):

from keras.models import Sequential
from keras.layers import Embedding, Dropout, Conv1D, MaxPooling1D, GlobalMaxPooling1D, Dense
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping
from keras import regularizers

# Hyperparameters
batch_size = 128
num_epochs = 200
weight_decay = 1e-3
num_filters = 32 * 2
n_kernel_size = 5
num_classes = 3
activation_fn = 'relu'
nb_units = 128
last_dense_units = 128
n_lr = 0.001
n_momentum = 0.99
n_dr = 0.00001
dropout_rate = 0.8

model = Sequential()
model.add(Embedding(nb_words, EMBEDDING_DIM, input_length=max_seq_len))
model.add(Dropout(dropout_rate))
model.add(Conv1D(num_filters, n_kernel_size, padding='same', activation=activation_fn,
                 kernel_regularizer=regularizers.l2(weight_decay)))
model.add(MaxPooling1D())
model.add(GlobalMaxPooling1D())
model.add(Dense(last_dense_units, activation=activation_fn,
                kernel_regularizer=regularizers.l2(weight_decay)))
model.add(Dropout(dropout_rate))
model.add(Dense(num_classes, activation='softmax'))

adam = Adam(lr=n_lr, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=n_dr)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['acc'])

early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=3,
    mode='min',
    verbose=1,
    restore_best_weights=True
)

model.fit(...)
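
For completeness, a minimal sketch of how the callback might be wired into training (X_train, y_train, X_val, and y_val are placeholders for my actual arrays, not variables defined above):

# Hypothetical data names; the EarlyStopping callback only takes effect
# when it is passed to fit() via the callbacks argument.
history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=num_epochs,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stopping])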

Here are the accuracy plots for training and validation: [plot: training accuracy runs roughly 10% above validation accuracy]

  • This is not overfitting. – desertnaut Jan 27 '21 at 00:20
  • So, what's the definition of this situation? And how can I make the validation accuracy closer to the training accuracy? – talha06 Jan 27 '21 at 00:39
  • This is called "generalization gap" - see (own) answers [here](https://stackoverflow.com/a/61043883/4685471) and [here](https://stackoverflow.com/a/58468274/4685471). As for how we close it, well, this is exactly the billion dollar question...! – desertnaut Jan 27 '21 at 01:22

1 Answer

There are still some methods to combat overfitting that you could try.

Your model does seem to be overfitting by about 10%. But how much overfitting is too much overfitting? I would look at this post and the related conversation so you can best evaluate your specific situation.
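
For instance, a minimal sketch of a slimmed-down variant of your model (reusing nb_words, EMBEDDING_DIM, max_seq_len, and num_classes from your code): it reduces capacity and swaps the first Dropout for SpatialDropout1D, which drops whole embedding channels and often regularizes text models better than element-wise Dropout. The specific values here are illustrative, not tuned:

from keras.models import Sequential
from keras.layers import (Embedding, SpatialDropout1D, Conv1D,
                          GlobalMaxPooling1D, Dense, Dropout)
from keras import regularizers

model = Sequential()
model.add(Embedding(nb_words, EMBEDDING_DIM, input_length=max_seq_len))
model.add(SpatialDropout1D(0.3))                       # channel-wise dropout
model.add(Conv1D(32, 5, padding='same', activation='relu',  # fewer filters
                 kernel_regularizer=regularizers.l2(1e-3)))
model.add(GlobalMaxPooling1D())
model.add(Dense(64, activation='relu',                 # smaller dense layer
                kernel_regularizer=regularizers.l2(1e-3)))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))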

  • Yes, exactly, this 10% is critical for my research. #2 was already done, as my test set is separate from the training and validation sets. #3 & #4 have not been applied, since the input is text. – talha06 Jan 26 '21 at 20:48
  • If your input is text, data augmentation is still an option; see https://arxiv.org/pdf/1901.11196.pdf – raceee Jan 26 '21 at 20:53
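
A minimal sketch of two of the EDA operations from that paper, random swap and random deletion, which need no external resources (the helper names here are made up for illustration):

import random

def random_swap(words, n_swaps=1):
    """Randomly swap the positions of two words, n_swaps times (EDA 'RS')."""
    words = words[:]
    for _ in range(n_swaps):
        if len(words) < 2:
            break
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    """Drop each word independently with probability p (EDA 'RD')."""
    kept = [w for w in words if random.random() > p]
    return kept if kept else [random.choice(words)]  # never return an empty text

def augment(sentence, n_copies=4):
    """Produce n_copies augmented variants of one training sentence."""
    words = sentence.split()
    out = []
    for _ in range(n_copies):
        variant = random_deletion(random_swap(words, n_swaps=1), p=0.1)
        out.append(' '.join(variant))
    return out

print(augment("my cnn model is still overfitting despite dropout"))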