I noticed that there are two ways to use XGBoost in Python, the low-level API and the scikit-learn wrapper, as discussed here.
When I ran the same dataset through both, the results were different.
Using the low-level API - xgboost.train(..)
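import xgboost
# Low-level API: the data is wrapped in a DMatrix; 0.0 is treated as a missing value
dtrain = xgboost.DMatrix(X, label=Y, missing=0.0)
param = {'max_depth' : 3, 'objective' : 'reg:squarederror', 'booster' : 'gbtree'}
# Evaluate on the training data itself, reported under two names
evallist = [(dtrain, 'eval'), (dtrain, 'train')]
num_round = 10
# Despite its name, xgb_dMatrix holds the trained Booster returned by train()
xgb_dMatrix = xgboost.train(param, dtrain, num_round, evallist)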
Output
[0] eval-rmse:7115.31 train-rmse:7115.31
[1] eval-rmse:5335.37 train-rmse:5335.37
[2] eval-rmse:4054.77 train-rmse:4054.77
[3] eval-rmse:3140.91 train-rmse:3140.91
[4] eval-rmse:2510.33 train-rmse:2510.33
[5] eval-rmse:2080.62 train-rmse:2080.62
[6] eval-rmse:1785.53 train-rmse:1785.53
[7] eval-rmse:1571.92 train-rmse:1571.92
[8] eval-rmse:1399.57 train-rmse:1399.57
[9] eval-rmse:1301.64 train-rmse:1301.64
Using the scikit-learn wrapper - xgboost.XGBRegressor(..)
xgb_reg = xgboost.XGBRegressor(max_depth=3, n_estimators=10)
xgb_reg.fit(X_train, Y_train, eval_set = [(X_train, Y_train), (X_train, Y_train)], eval_metric = 'rmse', verbose=True)
Output
[0] validation_0-rmse:8827.63 validation_1-rmse:8827.63
[1] validation_0-rmse:8048.16 validation_1-rmse:8048.16
[2] validation_0-rmse:7349.83 validation_1-rmse:7349.83
[3] validation_0-rmse:6720.69 validation_1-rmse:6720.69
[4] validation_0-rmse:6154.82 validation_1-rmse:6154.82
[5] validation_0-rmse:5637.49 validation_1-rmse:5637.49
[6] validation_0-rmse:5173.9 validation_1-rmse:5173.9
[7] validation_0-rmse:4759.14 validation_1-rmse:4759.14
[8] validation_0-rmse:4386.29 validation_1-rmse:4386.29
[9] validation_0-rmse:4051.63 validation_1-rmse:4051.63
I thought the parameters were the cause of the difference, so I fetched the parameters from the scikit-learn wrapper and passed them to the low-level API, but the results were still different. Code for fetching the parameters:
xgb_reg.get_params()
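To make that step concrete, here is a minimal sketch of how the get_params() output could be split up before handing it to the low-level API. The helper name split_sklearn_params and the example dictionary are hypothetical and not part of xgboost. The sketch assumes that n_estimators corresponds to the number of boosting rounds and that missing is an argument of DMatrix rather than a booster parameter.

```python
def split_sklearn_params(sk_params):
    """Split a get_params()-style dict into pieces for the low-level API.

    Returns (booster_params, num_boost_round, missing).
    Hypothetical helper for illustration only.
    """
    # Drop unset (None) entries so the native defaults apply to them.
    params = {k: v for k, v in sk_params.items() if v is not None}
    # n_estimators is the number of boosting rounds, not a booster parameter.
    num_boost_round = params.pop("n_estimators", None)
    # missing is passed when building the DMatrix, not to train().
    missing = params.pop("missing", None)
    return params, num_boost_round, missing


# Illustrative subset of a get_params() result (values are made up).
example = {
    "max_depth": 3,
    "n_estimators": 10,
    "learning_rate": 0.1,
    "missing": None,
    "gamma": None,
}
booster_params, num_round, missing = split_sklearn_params(example)
print(booster_params, num_round, missing)
```

The three pieces would then go to xgboost.DMatrix(..., missing=...) and xgboost.train(booster_params, dtrain, num_round, evallist). For the comparison to be meaningful the inputs would also have to match: above, the low-level run builds dtrain from X and Y with missing=0.0, while the wrapper is fitted on X_train and Y_train without it.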
What could explain why the results do not match between the two versions, which should be similar internally?