First of all, I am confused about why we use the loss to update the model but use the metrics to choose the model we keep.
Maybe not all code, but most of the code I've seen does this: it uses EarlyStopping to monitor a metric on the validation data in order to find the best epoch (and the loss and the metric are different quantities).
Since you have already chosen the loss to update the model, why not also use the loss to select the model? After all, the loss and the metric are not exactly the same. It gives me the impression that you optimize with one objective and then evaluate with a different indicator, which feels very strange to me. Take a regression problem as an example: when someone uses 'mse' as the loss, why do they define
metrics=['mae']
and monitor that metric for early stopping or for reducing the learning rate? I just can't understand it, and I want to know what the advantage of doing this is.

Secondly, when the training data is imbalanced and the problem is a classification problem, some tutorials tell you to use F1 or AUC as your metric, and they claim this mitigates the problems caused by imbalanced data. I don't understand why these metrics help with the problems caused by imbalanced data.
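To make the first question concrete, here is a minimal pure-Python sketch (the numbers are made up for illustration, and this is not Keras code) showing that MSE and MAE can rank two sets of predictions differently. If the loss and the metric disagree about which model is "better", then monitoring the metric with EarlyStopping can select a different epoch than monitoring the loss would:

```python
# Two candidate models' predictions for the same targets.
y_true = [0.0, 0.0, 0.0, 0.0]
pred_a = [0.0, 0.0, 0.0, 4.0]   # mostly perfect, one large error
pred_b = [1.5, 1.5, 1.5, 1.5]   # uniformly moderate errors

def mse(y, p):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)

def mae(y, p):
    return sum(abs(yi - pi) for yi, pi in zip(y, p)) / len(y)

print(mse(y_true, pred_a), mse(y_true, pred_b))  # 4.0 2.25 -> MSE prefers B
print(mae(y_true, pred_a), mae(y_true, pred_b))  # 1.0 1.5  -> MAE prefers A
```

MSE squares the errors, so it punishes the single large mistake of model A; MAE does not, so it prefers A. Which ranking you want depends on how costly large individual errors are in your application, which is one reason people train with one function and select with another.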
Thirdly, I am confused about cases where someone passes more than one metric to the `metrics` parameter of `compile`. I don't understand why multiple metrics rather than one. What is the advantage of defining multiple metrics over a single one? I seem to have too many questions, and they have been bothering me for a long time.
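On the multiple-metrics question, one common reason is that each metric summarizes a different aspect of the same predictions, and a single number can hide a trade-off. A tiny stdlib sketch (hypothetical predictions, not Keras code) of accuracy versus precision and recall:

```python
# Hypothetical predictions on a small validation set (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

accuracy  = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)

print(accuracy, precision, recall)  # 0.7, ~0.667, 0.5
```

An accuracy of 0.7 alone would not tell you that the model misses half of the positives (recall 0.5). Watching several metrics during training lets you see these different failure modes at once; the loss still drives the gradient updates.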
Thank you for your kind answer.
The content above is what I wrote before. Some people thought my questions were too broad, so I want to reorganize my language.
Now suppose there is a binary classification problem with imbalanced data: the ratio of the positive class to the negative class is 500:1. I chose a DNN as my classification model and cross entropy as my loss. The question is whether I should also choose cross entropy as my metric, or whether I should choose something else, and why?
I want to summarize the information I got from other people's answers. When the problem is a regression problem, the usual metrics and losses are all differentiable, so choosing the same function for the metric and the loss, or different ones, depends entirely on your own understanding of the problem. But when the problem is classification, the metrics we actually care about, such as F1 and AUC, are not differentiable, so we choose different functions for the loss and the metric. Why don't we simply choose cross entropy directly as the metric as well?
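A back-of-envelope sketch of why cross entropy can be a misleading model-selection metric under the 500:1 ratio from the question (stdlib Python, assuming a degenerate "model" that ignores its input and always outputs the base rate as its predicted probability):

```python
import math

n_pos, n_neg = 500, 1                  # class ratio from the question
p = n_pos / (n_pos + n_neg)            # base rate, ~0.998

# Constant-output classifier: always predicts probability p for "positive".
ce_pos = -math.log(p)                  # cross entropy on a positive sample
ce_neg = -math.log(1 - p)              # cross entropy on the negative sample
avg_ce = (n_pos * ce_pos + n_neg * ce_neg) / (n_pos + n_neg)

accuracy = n_pos / (n_pos + n_neg)     # thresholding at 0.5 predicts "positive" always
minority_recall = 0.0                  # the single negative sample is never detected

print(avg_ce, accuracy, minority_recall)  # ~0.014, ~0.998, 0.0
```

This useless classifier gets a very low average cross entropy (~0.014) and ~99.8% accuracy, yet it never identifies the minority class: its F1 on that class is 0, and since its score is constant, its AUC is 0.5 (chance level). That is the usual argument for monitoring F1 or AUC instead of cross entropy on heavily imbalanced data: they are sensitive to minority-class performance, while the average cross entropy is dominated by the majority class.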