Measuring performance of the classifiers in imbalanced datasets

Question

I am trying to do classification on an imbalanced dataset (2000 data points from the positive class and 98880 data points from the negative class). I use Precision, Recall, F-Score and AUC to report the models' performance, but the way these metrics behave surprised me. The results of the models are as follows:

TP:1982, TN:87920, FP:10960, FN:18 | PR:0.153, RE:0.991, F1:0.265, AUC:0.972
TP:22, TN:98877, FP:3, FN:1978 | PR:0.880, RE:0.011, F1:0.022, AUC:0.810
TP:148, TN:98271, FP:609, FN:1852 | PR:0.196, RE:0.074, F1:0.107, AUC:0.700
TP:1611, TN:98847, FP:33, FN:389 | PR:0.980, RE:0.805, F1:0.884, AUC:0.998
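
For reference, the precision, recall and F1 values above follow directly from the confusion counts. A minimal sketch that recomputes them, along with the false positive rate that the ROC curve is built on (the function name and dict layout are illustrative and not part of the original post):

```python
# Confusion counts copied from the four result lines above.
models = {
    "model 1": dict(tp=1982, tn=87920, fp=10960, fn=18),
    "model 2": dict(tp=22, tn=98877, fp=3, fn=1978),
    "model 3": dict(tp=148, tn=98271, fp=609, fn=1852),
    "model 4": dict(tp=1611, tn=98847, fp=33, fn=389),
}

def summarize(tp, tn, fp, fn):
    """Return precision, recall, F1 and false positive rate for one confusion matrix."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # TN only enters through the false positive rate, which is what ROC/AUC uses.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return precision, recall, f1, fpr

results = {name: summarize(**counts) for name, counts in models.items()}

for name, (p, r, f1, fpr) in results.items():
    print(f"{name}: PR={p:.3f} RE={r:.3f} F1={f1:.3f} FPR={fpr:.4f}")
```

Note that precision, recall and F1 never use TN, while the false positive rate divides FP by the very large negative class. With 98880 negatives, even 10960 false positives give a false positive rate of only about 0.11.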

As you can see,

In the first model, the precision is very low and the recall is very high, which leads to a low F-Score and a high AUC.
In the second model, the precision is high and the recall is low, but the result is similar: a high AUC and a low F-Score.
In the third model, both precision and recall are very low, which results in a low F-Score, but surprisingly the AUC is still fairly high.
In the fourth model, both precision and recall are high, and therefore the F-Score and AUC are high.

So, can I conclude that, for my problem, F-Score is a better performance metric than AUC?

Not a *coding/programming* question, hence arguably off-topic here; better suited for [Cross Validated](https://stats.stackexchange.com/help/on-topic). — desertnaut, Feb 25 '19 at 10:51
Keep in mind that, in contrast to your other metrics, AUC is actually *not* measuring the performance of a single classifier, but the "average" performance of your model over all possible thresholds; see [High AUC but bad predictions with imbalanced data](https://stackoverflow.com/questions/51190809/high-auc-but-bad-predictions-with-imbalanced-data/51192702#51192702) and [Getting a low ROC AUC score but a high accuracy](https://stackoverflow.com/questions/47104129/getting-a-low-roc-auc-score-but-a-high-accuracy/47111246#47111246) for more... — desertnaut, Feb 25 '19 at 10:54
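
To illustrate the threshold point made in the comment above, here is a minimal sketch on made-up toy scores (not the data from the question; the helper names are hypothetical). The AUC is computed once from the ranking of the scores, whereas F1 has to be computed at a chosen threshold and changes with it:

```python
# Toy data invented purely for illustration (NOT the data from the question):
# 2 positives and 8 negatives, each with a model score.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05]

def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_at(labels, scores, threshold):
    """F1 of the classifier obtained by predicting positive when score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

auc = roc_auc(labels, scores)  # one number for the whole score ranking
f1_by_threshold = {t: f1_at(labels, scores, t) for t in (0.05, 0.5, 0.7)}

print(f"AUC = {auc:.4f}")
for t, f1 in f1_by_threshold.items():
    print(f"threshold={t}: F1={f1:.4f}")
```

The same scores give a single AUC but a different F1 for each threshold, so a reported F1 describes one operating point while the AUC describes the ranking as a whole.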

