I have an imbalanced dataset for training a random forest model. The response variable is data$TA, a factor with labels "NT" and "T" (underlying integer codes 1 and 2). When I run SMOTE:
    train_sm <- SMOTE(TA ~ ., data = data)
I get this error:
Error in factor(NewCases[, a], levels = 1:nlevels(data[, a]), labels = levels(data[, :
invalid 'labels'; length 0 should be 1 or 2
Following advice from previous Stack Overflow posts, I tried converting data$TA into:
- a numeric vector
- a factor with numeric labels (tried using both factor and as.factor)
Neither conversion solved the issue: the same error occurred in every case. What else can I try to make this work?