Which threshold does h2o.predict() use on a new test set?

Question

I have read several threads here about the differences between h2o.predict() and h2o.performance() (see the linked question below).

How to interpret the probabilities (p0, p1) of the result of h2o.predict()

Can someone tell me which threshold h2o.predict() uses? Is it max F1? If so, is it the threshold from the training data, the validation data, or cross-validation?

I tried applying the validation thresholds for max f1 and max f0point5 to the test set (completely separate from the training and validation data), but the predicted class from h2o.predict() and the class obtained from the threshold don't match completely.

The closest match I got came from taking the max f0point5 threshold from training and applying it to the test set.

There is not much documentation on h2o.predict(). Also, is there a best practice for choosing the threshold, e.g. the mean of the validation and training thresholds?

Thanks in advance!

score 8 · Accepted Answer · answered Dec 14 '18 at 23:40

Here are the specifics of how the prediction threshold is selected when a user runs h2o.predict() or .predict():

1) If you train a model with only training data, the Max F1 threshold from the training data model metrics is used.

2) If you train a model with training and validation data, the Max F1 threshold from the validation data model metrics is used.

3) If you train a model with training data and set the nfolds parameter, the Max F1 threshold from the training data model metrics is used.

4) If you train a model with training data, validation data, and the nfolds parameter set, the Max F1 threshold from the validation data model metrics is used.
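To make the four cases concrete, here is a minimal pure-Python sketch. It is not H2O's code and does not call the H2O API. The helper names (threshold_source, max_f1_threshold, apply_threshold) are hypothetical, and the `p1 >= threshold` comparison is an assumption about how the label is derived from p1. H2O's own threshold search may differ in detail, so treat this only as a way to reason about which metrics supply the threshold and how it is then applied to new data.

```python
def threshold_source(has_validation, uses_nfolds):
    """Which model metrics supply the Max F1 threshold, per the four cases above.

    uses_nfolds is accepted only to show that it does not change the outcome.
    """
    return "validation" if has_validation else "training"


def max_f1_threshold(labels, p1):
    """Return (threshold, f1) maximizing F1 over the distinct scores in p1.

    Illustrative brute-force search; not H2O's implementation.
    """
    best_t, best_f1 = None, -1.0
    for t in sorted(set(p1)):
        # Assumption: a row is predicted positive when p1 >= threshold.
        tp = sum(1 for y, p in zip(labels, p1) if p >= t and y == 1)
        fp = sum(1 for y, p in zip(labels, p1) if p >= t and y == 0)
        fn = sum(1 for y, p in zip(labels, p1) if p < t and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


def apply_threshold(p1, threshold):
    """Turn positive-class probabilities into 0/1 labels with a fixed threshold."""
    return [1 if p >= threshold else 0 for p in p1]


# Toy "validation" data: the threshold is chosen here...
valid_labels = [0, 0, 1, 1]
valid_p1 = [0.1, 0.4, 0.35, 0.8]
threshold, f1 = max_f1_threshold(valid_labels, valid_p1)

# ...and then reused unchanged on the p1 column of a separate test set.
test_p1 = [0.2, 0.35, 0.9]
test_labels = apply_threshold(test_p1, threshold)
print(threshold_source(has_validation=True, uses_nfolds=True), threshold, test_labels)
```

The point of the sketch is that the threshold is fixed when the model is trained and is not recomputed on the test set. Applying a threshold taken from a different metric (such as max f0point5) or from a different frame will therefore generally not reproduce the predict column.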

Thanks for the detailed breakdown Lauren :). – Mario Huang, Dec 18 '18 at 17:22
