I am trying to create model using XGBoost.
It seems like I manage to train the model, however, when I try to predict my test data and to see the actual prediction, I get the following error:
ValueError: Data must be 1-dimensional
This is how I tried to predict my data:
from dask_ml.model_selection import train_test_split
import dask
import xgboost
import dask_xgboost
from dask.distributed import Client
import dask_ml.model_selection as dcv
#split the data
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.33,random_state=42)
client = Client(n_workers=10, threads_per_worker=1)
#Trying to do hyperparamter running
model_xgb = xgb.XGBRegressor(seed=42,verbose=True)
params={
'learning_rate':[0.1,0.01,0.05],
'max_depth':[1,5,8],
'gamma':[0,0.5,1],
'scale_pos_weight':[1,3,5]
}
grid_search = GridSearchCV(model_xgb, params, cv=3, scoring='neg_mean_squared_error')
grid_search.fit(x_train, y_train)
#train data with best paraeters
bst = dask_xgboost.train(client, grid_search.best_params_, x_train, y_train, num_boost_round=10)
#predict data
dask_xgboost.predict(client, bst, x_test).persist()
The last line with the predict works, but when I addl compute to the endd in order to see the actual array I get the dimensional error:
dask_xgboost.predict(client, bst, x_test).persist().compute()
>>>ValueError: Data must be 1-dimensional
How can I get predictions with .predict
?