
I was working through a Keras implementation of the age and gender detection model described in the research paper 'Age and Gender Classification using Convolutional Neural Networks'. It was originally a Caffe model, and I decided to convert it to Keras. While training, however, the model's accuracy got stuck around 49–52%, which means the model is not learning at all. The loss can also be seen increasing exponentially, and at times it becomes nan. I was training on Google Colab with the GPU hardware accelerator.

My input was a folder of images whose labels are encoded in their file names. I loaded all the images into a NumPy array, and each label was a collection of 10 elements (2 for gender and 8 for the 8 different age groups described in the paper).
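For reference, here is one way such a 10-element target could be assembled from a filename. The filename convention, the constant names, and the `make_label` helper below are assumptions for illustration, not the original code:

```python
import numpy as np

GENDERS = ['Male', 'Female']
AGE_GROUPS = ['0-2', '4-6', '8-12', '15-20', '25-32', '38-43', '48-53', '60-100']

def make_label(gender, age_group):
    # 2 gender slots + 8 age-group slots -> one 10-element multi-hot vector
    label = np.zeros(10, dtype='float32')
    label[GENDERS.index(gender)] = 1.0
    label[2 + AGE_GROUPS.index(age_group)] = 1.0
    return label

# e.g. for a file named something like "Male_25-32_0001.jpg"
lbl = make_label('Male', '25-32')
```

Note that a vector built this way contains two 1s (one for gender, one for age group), while a single softmax head trained with categorical crossentropy assumes exactly one positive class per sample; that mismatch alone can keep a single-head model from learning.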

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Flatten, Dense, Dropout)
from tensorflow.keras.initializers import RandomNormal
from tensorflow.keras.optimizers import SGD

model = Sequential()
model.add(Conv2D(96, (7, 7),
                 activation='relu',
                 strides=4,
                 use_bias=True,
                 bias_initializer='zeros',
                 data_format='channels_last',
                 kernel_initializer=RandomNormal(stddev=0.01),
                 input_shape=(200, 200, 3)))
model.add(MaxPooling2D(pool_size=3, strides=2))
model.add(BatchNormalization())

model.add(Conv2D(256, (5, 5),
                 activation='relu',
                 strides=1,
                 use_bias=True,
                 data_format='channels_last',
                 bias_initializer='ones',
                 kernel_initializer=RandomNormal(stddev=0.01)))
model.add(MaxPooling2D(pool_size=3, strides=2))
model.add(BatchNormalization())

model.add(Conv2D(384, (3, 3),
                 strides=1,
                 data_format='channels_last',
                 use_bias=True,
                 bias_initializer='zeros',
                 padding='same',
                 kernel_initializer=RandomNormal(stddev=0.01),
                 activation='relu'))
model.add(MaxPooling2D(pool_size=3, strides=2))

model.add(Flatten())
model.add(Dense(512,
                use_bias=True,
                bias_initializer='ones',
                kernel_initializer=RandomNormal(stddev=0.05),
                activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(512,
                use_bias=True,
                bias_initializer='ones',
                kernel_initializer=RandomNormal(stddev=0.05),
                activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(10,
                use_bias=True,
                kernel_initializer=RandomNormal(stddev=0.01),
                bias_initializer='zeros',
                activation='softmax'))

model.compile(loss='categorical_crossentropy',
              metrics=['accuracy'],
              optimizer=SGD(learning_rate=1e-4, decay=1e-7, nesterov=False))
model.summary()
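One common cause of a diverging loss with raw images is feeding unnormalized 0–255 pixel values into a network initialized with small random weights; the first-layer pre-activations can then be large enough for SGD updates to overshoot and eventually produce nan. A minimal sketch of scaling inputs to [0, 1] (the `images` array here is a random stand-in for the real batch):

```python
import numpy as np

# toy stand-in for the loaded image batch (uint8, values 0-255)
images = np.random.randint(0, 256, size=(4, 200, 200, 3), dtype=np.uint8)

# scale to [0, 1] so early-layer activations stay in a range SGD can handle
X = images.astype('float32') / 255.0
```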

Inputs to the model were shuffled:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(images, labels, test_size=0.2, shuffle=True, random_state=42)

You can see my training results here. I have used what I believe are the correct optimizer and initializers, along with biases, to prevent vanishing gradients.

  • You have taken care of everything but failed to notice that the `loss` had become `nan` from the second epoch itself. Solving this might also help you get better accuracy. – learner Apr 26 '20 at 11:54
  • I was going to suggest the same. Keep track of your loss and of the reasons why you get nan on epochs 2+. This thread might be a good place to start searching: https://stackoverflow.com/questions/61416197/pretraining-a-language-model-on-a-small-custom-corpus – inverted_index Apr 26 '20 at 19:50
  • Is your target one-hot encoded? Can you show all 10 of your labels? –  Apr 27 '20 at 11:23
  • My labels are of the format: y = ['Male','Female','0 – 2', '4 – 6', '8 – 12', '15 – 20', '25 – 32', '38 – 43','48 – 53', '60 – 100']. It is in the form 0/1. – Aditya Gupta Apr 28 '20 at 06:15
  • I tried using the Adam optimizer and the tanh activation, but no progress. I can't figure out why my loss is nan. – Aditya Gupta Apr 28 '20 at 06:31
  • However, when I trained a multi-output network, with one output layer for age and another for gender, the gender output was learning fine, but the age validation accuracy was stuck at 50%. – Aditya Gupta Apr 28 '20 at 06:35

1 Answer


I would suggest the following approach to improve the accuracy of the model:

  • Build two different models, one for gender prediction and another for age prediction.
  • Use a label encoder or one-hot encoder on the target variables.
  • For the gender prediction model, use binary crossentropy as the loss function.
  • For the age prediction model, use categorical crossentropy (if you one-hot encoded the target variable) or sparse categorical crossentropy (if you label-encoded the target variable to integer classes).
  • Before building the model, normalize all the numerical data.
  • Use softmax as the activation function in the final layer and relu in the remaining layers.
  • Also, instead of 2 hidden dense layers, keep just 1 (more dense layers mean more weights to learn; you can experiment with the number of layers and filters).
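The loss pairing above can be checked numerically: categorical crossentropy expects one-hot targets, while sparse categorical crossentropy expects integer class indices, and the two agree when the labels match. A minimal NumPy sketch (the probability and label values here are made up for illustration):

```python
import numpy as np

def categorical_crossentropy(y_onehot, probs):
    # expects one-hot targets: loss = -sum(y * log(p)) per sample
    return -np.sum(y_onehot * np.log(probs), axis=-1)

def sparse_categorical_crossentropy(y_int, probs):
    # expects integer class indices: loss = -log(p[true class]) per sample
    return -np.log(probs[np.arange(len(y_int)), y_int])

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_int = np.array([0, 1])        # label-encoded targets
y_onehot = np.eye(3)[y_int]     # one-hot-encoded targets

# both formulations give the same per-sample loss on matching labels
assert np.allclose(categorical_crossentropy(y_onehot, probs),
                   sparse_categorical_crossentropy(y_int, probs))
```

In Keras these correspond to `loss='categorical_crossentropy'` and `loss='sparse_categorical_crossentropy'` respectively; using the wrong pairing typically raises a shape error or silently trains on garbage targets.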

Hope I have answered your question. Happy Learning!
