
My question is a follow-up to one asked by another user: What is the difference between train, validation and test set in neural networks?

Learning is terminated when the minimum MSE is reached, judged from the validation- and training-set performance (easy to do using nntool in MATLAB). With the trained network, if the performance on the unseen test set is slightly worse than on the training set, we have an over-fitting problem. I keep encountering this case even though I select the model whose validation and training performance are nearly the same during learning. How, then, can the test-set performance be worse than the training-set performance?

Sm1

1 Answer


Training data = the data we use to train our model.

Validation data = the data we use to evaluate the model at every epoch (at run time), so that we can stop training early when it starts to over-fit (or for any other reason). Suppose I am running 1000 epochs, and at epoch 500 I see that my model gives 90% accuracy on the training data but only 70% on the validation data. Now I can see that my model is over-fitting, so I can stop training manually before the 1000 epochs complete, tune the model further, and then observe its behavior again.
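A minimal sketch of this early-stopping idea in Python/NumPy (the linear model, learning rate, and patience value are my own illustrative choices, not part of the answer):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data: y = 3x + noise
    X = rng.normal(size=(1000, 1))
    y = 3 * X[:, 0] + rng.normal(scale=0.5, size=1000)

    # 70/10 train/validation portions of the data
    X_train, y_train = X[:700], y[:700]
    X_val, y_val = X[700:800], y[700:800]

    w, b = 0.0, 0.0          # model parameters
    lr, patience = 0.01, 20  # illustrative hyper-parameters
    best_val, best_params, wait = np.inf, (w, b), 0

    for epoch in range(1000):
        # One gradient-descent step on the training MSE
        pred = w * X_train[:, 0] + b
        err = pred - y_train
        w -= lr * 2 * np.mean(err * X_train[:, 0])
        b -= lr * 2 * np.mean(err)

        # Monitor validation MSE every epoch
        val_mse = np.mean((w * X_val[:, 0] + b - y_val) ** 2)
        if val_mse < best_val:
            best_val, best_params, wait = val_mse, (w, b), 0
        else:
            wait += 1
            if wait >= patience:  # stop once validation stops improving
                print(f"early stop at epoch {epoch}, val MSE {best_val:.3f}")
                break

    w, b = best_params  # keep the weights with the best validation score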

Testing data = the data I use after training completes (all 1000 epochs). I run the trained model on the test data and check the accuracy there; say it gives 86%.

My training accuracy is 90%, validation accuracy is 87%, and testing accuracy is 86%. These may differ because the data in the training, validation and testing sets are completely different. We have 70% of the samples in the training set, 10% in the validation set and 20% in the testing set. Out of 100 total samples, the model might then predict 8 of the 10 validation images correctly and 18 of the 20 test images correctly. This is normal in real-life projects, because the pixels of every image vary from one image to the next, which is why small differences in accuracy can happen.
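For concreteness, a hedged sketch of that 70/10/20 split using plain NumPy (the total of 100 samples is just the example above):

    import numpy as np

    rng = np.random.default_rng(42)
    n = 100                   # total number of samples, as in the example
    idx = rng.permutation(n)  # shuffle before splitting

    train_idx = idx[:70]      # 70% training
    val_idx = idx[70:80]      # 10% validation
    test_idx = idx[80:]       # 20% testing

    print(len(train_idx), len(val_idx), len(test_idx))  # 70 10 20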

Another reason may be that the testing set contains more images than the validation set: the more images you evaluate on, the more chances there are for wrong predictions, so the measured accuracy shifts with the sample. For example, at 90% accuracy my model predicts 90 out of 100 images correctly, but if I increase the sample to 1000 images it may predict 850, 800 or 900 images correctly out of 1000.
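To illustrate that point, a small hedged simulation: a classifier whose true accuracy is fixed at 90% is scored on evaluation sets of different sizes, and the number of correct predictions fluctuates from sample to sample, so the measured accuracy varies too:

    import numpy as np

    rng = np.random.default_rng(1)
    true_acc = 0.90  # the classifier's "real" accuracy, as in the example

    for n in (10, 20, 100, 1000):
        # Each of the n predictions is correct with probability true_acc
        correct = rng.binomial(n, true_acc, size=5)  # 5 repeated trials
        print(f"n={n:5d}: correct counts over 5 trials -> {correct}")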

Sohaib Anwaar
  • Thank you very much for explaining this nicely. If I have followed you correctly, you are saying that it is quite normal for testing accuracy to be below training accuracy. I followed the same logic you explained in the paragraph defining validation data and stopped training once the validation and training accuracy were quite close to each other. In my case, the MSE (for a regression problem) is 0.29 on training with 800 samples and 0.26 on testing with 200 samples. So even though the model is over-fitting, should I accept this model and its result? How do I prevent this? Can you please clarify whether I understood you. – Sm1 Jun 26 '19 at 12:51
  • No, your model is fine if it gives 0.26 MSE on testing; there is very little over-fitting, so we can ignore it. But if you try to overcome even that small over-fitting, you may end up with higher error: techniques against over-fitting make the model need more time to converge its weights, and applying too many of them can make the model start to under-fit the data. So we usually ignore very small over-fitting. – Sohaib Anwaar Jun 26 '19 at 13:01
  • Thanks once again. I have posted a new question (https://stackoverflow.com/questions/56778505/analyzing-nn-performance-in-matlab-how-is-the-test-set-used-in-training-differ) which is related to this one. Since you answered this one, I thought it might be easier for you to relate to the new question, as it is a continuation of the one you answered. Can you please take a look? I will be grateful for your help again. – Sm1 Jun 26 '19 at 17:49