I have a dataset called NFL. I tried to run XGBoost on it, but I got this error:
Error in xgb.DMatrix(X_Train, label = labels) : 'data' has class 'character' and length 64617. 'data' accepts either a numeric matrix or a single filename.
The raw dataset is called NFL. I'm trying to use "outcome" as the variable to predict, and I want to make it numeric. The "outcome" variable takes the values "Win", "Tie", and "Loss", and I'm trying to encode them in the dataset as 1, 2, 3.
Here is the code:
library(dplyr)
library(xgboost)
NFL <- NFL %>% mutate(id = row_number())
# Divide into two groups: trainSet and validate
trainSet <- NFL %>% sample_frac(0.7)
validate <- NFL %>% anti_join(trainSet, by = "id")
# XGBoost
set.seed(112321)
X_Train <- trainSet %>% select(-outcome) %>% as.matrix()
X_Test <- validate %>% select(-outcome) %>% as.matrix()
labels <- trainSet$outcome %>% as.matrix()
# This is the line that throws the error
Train <- xgb.DMatrix(X_Train, label = labels)
xgbModel <- xgboost(data = trainSet, objective = "classification",
                    nrounds = 50, subsample = 1, colsample_bytree = 1, max_depth = 10,
                    eta = 0.2, verbose = FALSE)
xgbPred <- predict(xgbModel, validate)
xgbROC <- evaluate(xgbPred, validate$outcome)
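For reference, the error message says that X_Train is a character matrix. Here is a minimal sketch of how that can happen, using a made-up toy data frame (the columns team and points are hypothetical, not taken from the NFL data). as.matrix() on a data frame with any character column returns a character matrix, whereas data.matrix() converts every column to numbers first. This is base R only and is meant as an illustration, not a definitive fix.

```r
# Toy stand-in for trainSet; 'team' and 'points' are hypothetical columns.
toy <- data.frame(
  team    = c("A", "B", "A", "C"),
  points  = c(21, 14, 30, 7),
  outcome = c("Win", "Loss", "Win", "Tie"),
  stringsAsFactors = FALSE
)

# Drop the label column, as in the question.
features <- toy[, setdiff(names(toy), "outcome")]

# One character column is enough to make the whole matrix character.
m_chr <- as.matrix(features)
is.numeric(m_chr)   # FALSE

# data.matrix() converts character columns to factors, then to integer codes.
m_num <- data.matrix(features)
is.numeric(m_num)   # TRUE
```

A numeric matrix like m_num is the kind of input the error message says xgb.DMatrix() accepts. Integer codes impose an arbitrary order on the categories, so one-hot encoding with model.matrix() may be a better fit for some columns.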
Can anybody tell me how to fix this? Thank you very much!
Update: I tried to use:
NFL%>% mutate(outcome = ifelse(outcome, c("Win", "Tie", "Loss",1,2,3)))
But outcome comes back as all NAs.
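A likely reason for the NAs: ifelse() expects a logical test as its first argument, and strings such as "Win" become NA when coerced to logical, so every element comes back NA. Below is a minimal sketch of one way to do the recoding with base R's match(), using a made-up vector in place of NFL$outcome. The 1/2/3 codes follow the mapping described above. The 0-based variant assumes a multiclass xgboost objective such as "multi:softmax" is used, which expects class labels starting at 0.

```r
# Made-up example values standing in for NFL$outcome.
outcome <- c("Win", "Loss", "Tie", "Win")

# Coercing these strings to logical gives NA, which is why ifelse() returned NAs.
as.logical(outcome)            # NA NA NA NA

# match() returns the position of each value in the lookup vector.
outcome_num <- match(outcome, c("Win", "Tie", "Loss"))
outcome_num                    # 1 3 2 1

# 0-based labels (assumption: needed for objective = "multi:softmax").
labels0 <- outcome_num - 1L
```

With dplyr, the same lookup could go inside mutate(), for example mutate(outcome = match(outcome, c("Win", "Tie", "Loss"))).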