Does anyone know why I am getting this error? The attribute 'Kön' really is called 'Kön' in the training file, and I don't have a column named 'training' in the training file.
ctrl <- trainControl(method = "repeatedcv", number = 10, savePredictions = TRUE)
mod_fit <- train(as.factor(training$Riskdrickare)~., data=training, method="glm",family="binomial", trControl = ctrl, tuneLength = 5)
Error: Unknown columns 'training', '<U+FEFF>Kön'
In addition: There were 11 warnings (use warnings() to see them)
I am supposed to run this code afterwards:
pred = predict(mod_fit, newdata=test)
confusionMatrix(data=pred, as.factor(test$Riskdrickare))
The warnings state:
Warning messages:
1: In train.default(x, y, weights = w, ...) :
You are trying to do regression and your outcome only has two possible values Are you trying to do classification? If so, use a 2 level factor as your outcome column.
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
3: glm.fit: fitted probabilities numerically 0 or 1 occurred
4: glm.fit: fitted probabilities numerically 0 or 1 occurred
5: glm.fit: fitted probabilities numerically 0 or 1 occurred
6: glm.fit: fitted probabilities numerically 0 or 1 occurred
7: glm.fit: fitted probabilities numerically 0 or 1 occurred
8: glm.fit: fitted probabilities numerically 0 or 1 occurred
9: glm.fit: fitted probabilities numerically 0 or 1 occurred
10: glm.fit: fitted probabilities numerically 0 or 1 occurred
11: glm.fit: fitted probabilities numerically 0 or 1 occurred
12: glm.fit: fitted probabilities numerically 0 or 1 occurred
The column names are:
Kön Ålder Nationalitet Fakultet Riskdrickare Terminer Omtenta Sektionsaktiv studieTim Träning Frysmat Frukost Vegan Sömn Relation dator dAlc wAlc FamRel fHealth mhealth Smoke fSize fTog pStudEvent schemUnd drickStudenEven fritidVän FritidTid
They consist of both numerical and categorical values.
A reproducible example would require a dataset equivalent to the one used for this assignment.
If I apply as.factor(Riskdrickare), the first error goes away but the rest remain.
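One plausible reading of the 'training' entry in the error is that the formula as.factor(training$Riskdrickare) ~ . mentions training as if it were a variable, so it may end up being looked up as a column. The sketch below uses a small made-up data frame (the values are invented, and only two of the real columns are included) to show which names the formula refers to, and how converting the outcome once in the data allows a plain formula instead. It is an illustration under those assumptions, not a confirmed diagnosis.

```r
# Made-up stand-in for the real training data (values are illustrative only)
training <- data.frame(
  Riskdrickare = c(0, 1, 0, 1, 1, 0),
  Terminer     = c(2, 4, 6, 1, 3, 5)
)

# The original formula refers to 'training' as if it were a variable
vars_before <- all.vars(as.factor(training$Riskdrickare) ~ .)
print(vars_before)

# Convert the outcome to a factor in the data itself, then use a plain formula
training$Riskdrickare <- as.factor(training$Riskdrickare)
vars_after <- all.vars(Riskdrickare ~ .)
print(vars_after)

# With caret loaded and ctrl defined as above, the call would then read
# (not run here):
# mod_fit <- train(Riskdrickare ~ ., data = training, method = "glm",
#                  family = "binomial", trControl = ctrl, tuneLength = 5)
```

Storing the outcome as a two-level factor is also what the first warning asks for, which fits the observation that as.factor(Riskdrickare) makes that message disappear.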
Solved: Setting the encoding to ANSI and removing the letters 'å', 'ä' and 'ö' from the column names made the error message go away.
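The same renaming can also be done in R instead of editing the file by hand. The following is a minimal sketch, assuming the '<U+FEFF>' prefix is a byte-order mark left over from a UTF-8 file; clean_names is a hypothetical helper written for this example, not a caret or base R function, and raw_names is a hand-typed subset of the column names above.

```r
# A few of the column names as they might look after a BOM-affected import
raw_names <- c("\ufeffKön", "Ålder", "Riskdrickare", "Träning", "Sömn", "fritidVän")

# Hypothetical helper: strip a leading byte-order mark and replace the
# Swedish letters with plain ASCII ones
clean_names <- function(x) {
  x <- sub("^\ufeff", "", x)        # drop a leading BOM, if present
  chartr("åäöÅÄÖ", "aaoAAO", x)     # å/ä -> a, ö -> o (upper and lower case)
}

cleaned <- clean_names(raw_names)
print(cleaned)
```

On a real data frame this would be applied as names(training) <- clean_names(names(training)), and likewise for test so that both have matching columns. Reading the file with fileEncoding = "UTF-8-BOM" in read.csv may also prevent the stray prefix from appearing in the first place, assuming the file really is UTF-8 with a BOM.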