
I am trying to understand the catboost overfitting detector. It is described here:

https://tech.yandex.com/catboost/doc/dg/concepts/overfitting-detector-docpage/#overfitting-detector

Other gradient boosting packages like lightgbm and xgboost use a parameter called early_stopping_rounds, which is easy to understand (it stops the training once the validation error hasn't decreased in early_stopping_round steps).

However I have a hard time understanding the p_value approach used by catboost. Can anyone explain how this overfitting detector works and when it stops the training?

ftiaronsem

3 Answers


It's not documented on the Yandex website or in the GitHub repository, but if you look carefully through the Python code posted to GitHub (specifically here), you will see that the overfitting detector is activated by setting "od_type" in the parameters. Reviewing the recent commits on GitHub, the catboost developers also recently implemented a mechanism similar to the "early_stopping_rounds" parameter used by lightgbm and xgboost: setting "od_type" to "Iter". To set the number of rounds to wait after the most recent best iteration before stopping, provide a numeric value in the "od_wait" parameter.

For example:

fit_param <- list(
  iterations = 500,
  thread_count = 10,
  loss_function = "Logloss",
  depth = 6,
  learning_rate = 0.03,
  od_type = "Iter",   # iteration-based overfitting detector
  od_wait = 100       # stop 100 iterations after the best validation score
)

I am using the catboost library with R 3.4.1. I have found that setting the "od_type" and "od_wait" parameters in the fit_param list works well for my purposes.

I realize this does not answer your question about the p_value approach the catboost developers also implemented; unfortunately I cannot help you there. Hopefully someone else can explain that setting to both of us.
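For intuition, the "Iter" rule described above can be sketched in plain Python. This is a hypothetical simplification for illustration, not CatBoost's actual implementation:

```python
def iter_overfitting_detector(eval_metrics, od_wait):
    """Return (stop_iter, best_iter) under a sketch of the 'Iter' rule:
    halt once od_wait iterations pass without a new best (lowest)
    validation metric. Hypothetical helper, not CatBoost internals."""
    best_metric = float("inf")
    best_iter = 0
    for i, metric in enumerate(eval_metrics):
        if metric < best_metric:
            best_metric, best_iter = metric, i
        elif i - best_iter >= od_wait:
            return i, best_iter  # stop; model is shrunk to best_iter
    return len(eval_metrics) - 1, best_iter

# Example: validation loss improves until iteration 3, then worsens.
losses = [0.9, 0.8, 0.7, 0.65, 0.66, 0.67, 0.68, 0.69, 0.70]
stop, best = iter_overfitting_detector(losses, od_wait=3)
# training stops at iteration 6, keeping the model from iteration 3
```

With od_wait = 3, the detector tolerates three non-improving iterations after the best score before halting, mirroring what od_wait = 100 does in the R parameter list above.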

babbeuf
  thanks so much for sharing this! I did not know about the od_type and od_wait parameters. This is really appreciated! – ftiaronsem Aug 07 '17 at 21:20
  No problem! The Yandex documentation isn't perfect, so I started poking through the python code over the weekend to see what might be missing. It was a very happy discovery for me, too. – babbeuf Aug 07 '17 at 22:24

Catboost now supports early_stopping_rounds (see the fit method parameters):

Sets the overfitting detector type to Iter and stops the training after the specified number of iterations since the iteration with the optimal metric value.

This works very much like early_stopping_rounds in xgboost.

Here is an example:

from catboost import CatBoostRegressor, Pool

from sklearn.model_selection import train_test_split
import numpy as np

# synthetic regression data with one informative feature
y = np.random.normal(0, 1, 1000)
X = np.random.normal(0, 1, (1000, 1))
X[:, 0] += y * 2

X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.1)

train_pool = Pool(X_train, y_train)
eval_pool = Pool(X_eval, y_eval)

model = CatBoostRegressor(iterations=1000, learning_rate=0.1)

# stop once the eval metric has not improved for 10 iterations
model.fit(train_pool, eval_set=eval_pool, early_stopping_rounds=10)

The result should be something like this:

522:    learn: 0.3994718        test: 0.4294720 best: 0.4292901 (514)   total: 957ms    remaining: 873ms
523:    learn: 0.3994580        test: 0.4294614 best: 0.4292901 (514)   total: 958ms    remaining: 870ms
524:    learn: 0.3994495        test: 0.4294806 best: 0.4292901 (514)   total: 959ms    remaining: 867ms
Stopped by overfitting detector  (10 iterations wait)

bestTest = 0.4292900745
bestIteration = 514

Shrink model to first 515 iterations.
Akavall

early_stopping_rounds covers both the od_type='Iter' and od_wait parameters. There is no need to set od_type and od_wait individually; just set the early_stopping_rounds parameter.
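In other words (a sketch using the parameter names from the answers above), these two configurations request the same stopping rule:

```python
# Explicit detector parameters, as in the R answer above:
params_explicit = {"od_type": "Iter", "od_wait": 10}

# Shorthand: pass early_stopping_rounds=10 to fit() instead,
# which implies the iteration-based detector with a 10-round wait.
params_shorthand = {"early_stopping_rounds": 10}
```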

Mainak Sen