I am using the Learning API version of xgboost. I want to get the coefficients of a linear model built this way, but accessing them raises `AttributeError: 'Booster' object has no attribute 'coef_'`. The Learning API documentation doesn't appear to address how to retrieve coefficients.
import xgboost as xgb

# xtrain, ytrain, xtest, ytest are numpy arrays
dtrain = xgb.DMatrix(xtrain, label=ytrain)
dtest = xgb.DMatrix(xtest, label=ytest)
param = {'eta': 0.3125, 'objective': 'binary:logistic', 'nthread': 8,
         'eval_metric': 'auc', 'booster': 'gblinear', 'max_depth': 12}
model = xgb.train(param, dtrain, 60, [(dtrain, 'train'), (dtest, 'eval')],
                  verbose_eval=5, early_stopping_rounds=12)
print(model.coef_)  # raises AttributeError
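One workaround I am experimenting with: `Booster.get_dump()` also works for `gblinear`, and, if I understand the dump format correctly, it returns a text block with a `bias:` section followed by a `weight:` section holding one value per feature. A minimal parser sketch (the dump string below uses made-up values shaped like `model.get_dump()[0]`):

```python
def parse_gblinear_dump(dump_text):
    """Parse a gblinear booster's text dump into (bias, weights).

    Assumes the dump format is a 'bias:' line followed by one value,
    then a 'weight:' line followed by one value per feature.
    """
    lines = [ln.strip() for ln in dump_text.strip().splitlines() if ln.strip()]
    bias_idx = lines.index("bias:")
    weight_idx = lines.index("weight:")
    bias = float(lines[bias_idx + 1])
    weights = [float(v) for v in lines[weight_idx + 1:]]
    return bias, weights

# Illustrative dump text with made-up coefficient values:
sample_dump = """bias:
0.5
weight:
0.25
-0.75
1.5
"""
bias, coefs = parse_gblinear_dump(sample_dump)
print(bias, coefs)  # 0.5 [0.25, -0.75, 1.5]
```

This only tells me the learned weights, though; it doesn't explain the prediction mismatch below.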
I tried building an equivalent model using `XGBRegressor`, since it does have a `coef_` attribute, but that model returns very different predictions. I looked at previous answers on this topic (1, 2), which seem to imply that `n_estimators` is effectively the same as `num_boost_round` and should therefore yield the same predictions. But even after accounting for this, the predictions from the parameters below are very different, and the `XGBRegressor` model turns out to be extremely conservative. Also, per the documentation, `nthread` is the same as `n_jobs`. I don't see any other differences between the parameters of the two.
from xgboost import XGBRegressor

model = XGBRegressor(n_estimators=60, learning_rate=0.3125, max_depth=12,
                     objective='binary:logistic', booster='gblinear', n_jobs=8)
model = model.fit(xtrain, ytrain, eval_metric='auc',
                  early_stopping_rounds=12, eval_set=[(xtest, ytest)])
# ntree_limit=0 works around a bug in early_stopping_rounds with gblinear
predictions = model.predict(xtrain, ntree_limit=0)
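To rule out a simple parameter mismatch, this is the name mapping I am assuming between the two APIs (the helper function and alias table are my own for checking, not part of xgboost):

```python
# My assumed aliases between xgb.train param names and the
# sklearn-wrapper (XGBRegressor) constructor argument names.
PARAM_ALIASES = {
    "eta": "learning_rate",
    "nthread": "n_jobs",
}

def to_sklearn_params(train_params, num_boost_round):
    """Translate an xgb.train param dict into XGBRegressor kwargs."""
    out = {PARAM_ALIASES.get(k, k): v for k, v in train_params.items()}
    out["n_estimators"] = num_boost_round  # num_boost_round -> n_estimators
    return out

train_params = {"eta": 0.3125, "objective": "binary:logistic",
                "nthread": 8, "eval_metric": "auc",
                "booster": "gblinear", "max_depth": 12}
print(to_sklearn_params(train_params, 60))
```

Under this mapping the two configurations above look identical to me, which is why the prediction gap is confusing.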
My questions are:

- Is there a way to get coefficients for a linear model built using `xgb.train`, and if so, how?
- If not, why does `XGBRegressor` give me different results?