Despite setting a value for the random_state and/or seed parameter, the performance is not reproducible with the XGBoost scikit-learn API wrapper.
Here is the code:
import pickle
from sklearn.metrics import roc_auc_score
from xgboost.sklearn import XGBClassifier

# Load the pre-split training and evaluation sets
with open('xxxx.pkl', 'rb') as f:
    (X_train, y_train), (X_eval, y_eval) = pickle.load(f)
# Fixed random_state, yet results still vary between runs
hyperparams = {'eval_metric': 'auc', 'colsample_bylevel': 0.7, 'learning_rate': 0.125, 'random_state': 0}
GBM = XGBClassifier(**hyperparams)
GBM.fit(X_train, y_train, eval_metric="auc", verbose=True, eval_set=[(X_eval, y_eval)], early_stopping_rounds=2)
# AUC on the evaluation set, using the positive-class probabilities
print(roc_auc_score(y_eval, GBM.predict_proba(X_eval)[:, 1]))
Each time I run the above snippet, the concordance value differs:
78.9451246
79.001542
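To make the comparison explicit, here is a minimal sketch of a harness that repeats a run several times in one process and checks whether the scores match. The names `collect_scores`, `is_reproducible`, `seeded_run`, and `unseeded_run` are hypothetical and not part of XGBoost or scikit-learn; the two stand-in functions only imitate a seeded and an unseeded training run using the standard library.

```python
import random
from typing import Callable, List


def collect_scores(train_and_score: Callable[[], float], runs: int = 5) -> List[float]:
    """Call the same training routine several times and return every score."""
    return [train_and_score() for _ in range(runs)]


def is_reproducible(scores: List[float], tol: float = 0.0) -> bool:
    """Return True when every score lies within `tol` of the first one."""
    return all(abs(s - scores[0]) <= tol for s in scores)


# Stand-ins for the real fit-and-score call (hypothetical, for illustration only).
def seeded_run() -> float:
    rng = random.Random(0)  # fresh generator with a fixed seed on every call
    return rng.random()


_shared_rng = random.Random(0)  # seeded once, then shared across calls


def unseeded_run() -> float:
    return _shared_rng.random()  # generator state carries over between calls


print(is_reproducible(collect_scores(seeded_run)))
print(is_reproducible(collect_scores(unseeded_run)))
```

In practice the stand-in would be replaced by a function that builds the classifier, fits it, and returns the AUC, so that repeated runs can be compared without restarting the interpreter.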
Some referenced issues: