
Situation: I am trying to use the XGBoost classifier, but I get this error: "ValueError: Invalid classes inferred from unique values of y. Expected: [0 1 2 ... 1387 1388 1389], got [0 1 2 ... 18609 24127 41850]".

This differs from the already-solved question "Invalid classes inferred from unique values of `y`. Expected: [0 1 2 3 4 5], got [1 2 3 4 5 6]", where the labels simply did not start from 0. My labels do start from 0, but they are not consecutive.

Code:

from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X = data_concat
y = data_concat[['forward_count','comment_count','like_count']]

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=72)

# Check the sizes after the split
print('Train set:', X_train.shape, y_train.shape)
print('Test set:', X_test.shape, y_test.shape)

xgb = XGBClassifier()
clf = xgb.fit(X_train, y_train, eval_metric='auc')  # this is where the error is raised

The DataFrame and its info look like this: [image: DataFrame], [image: DataFrame info].

I have tried different choices of y: when y has fewer or more columns, the expected list "[0 1 2 ... 1387 1388 1389]" shrinks or expands accordingly.

If you need further info, please let me know. Appreciate your help :)

1 Answer


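You need to transform y_train so that it fits what XGBoost expects: class labels must be consecutive integers starting from 0 (0, 1, 2, ...), not arbitrary values. Note that LabelEncoder works on a single 1-D target, so apply it to one target column at a time. Here is the code: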

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y_train = le.fit_transform(y_train)
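
To make the remapping concrete, here is a minimal pure-Python sketch of what this kind of encoding does conceptually for a 1-D target: each distinct value is replaced by its rank among the sorted unique values. The helper names (fit_label_mapping, encode, decode) and the sample values are made up for illustration and are not part of scikit-learn or XGBoost.

```python
# Illustrative sketch only: remap arbitrary labels to 0..n_classes-1,
# similar in spirit to what a label encoder does for a 1-D target.
# All function names and sample values here are hypothetical.

def fit_label_mapping(labels):
    """Map each distinct label to its position among the sorted unique labels."""
    return {value: index for index, value in enumerate(sorted(set(labels)))}

def encode(labels, mapping):
    """Replace every label by its consecutive integer code."""
    return [mapping[value] for value in labels]

def decode(codes, mapping):
    """Invert the mapping to recover the original labels."""
    inverse = {index: value for value, index in mapping.items()}
    return [inverse[code] for code in codes]

# Made-up, non-consecutive labels like the ones in the error message
y_train_raw = [18609, 0, 2, 41850, 2, 24127]

mapping = fit_label_mapping(y_train_raw)
y_train_encoded = encode(y_train_raw, mapping)
print(y_train_encoded)                   # [2, 0, 1, 4, 1, 3]
print(decode(y_train_encoded, mapping))  # [18609, 0, 2, 41850, 2, 24127]
```

With scikit-learn, le.inverse_transform plays the role of decode here, which is useful for turning predicted class indices back into the original values. Labels that were not seen when the mapping was fitted cannot be encoded, so the test labels should be transformed with the same fitted encoder rather than a new one.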
Julia Meshcheryakova
xue leon