I am trying to predict with a simple KNN model using the caret package in R. It always gives the same error, even in this very simple reproducible example:
library(caret)
set.seed(1)
#generate training dataset "a"
n = 10000
a = matrix(rnorm(n*8,sd=1000000),nrow = n)
y = round(runif(n))
a = cbind(y,a)
a = as.data.frame(a)
a[,1] = as.factor(a[,1])
colnames(a) = c("y",paste0("V",1:8))
#estimate simple KNN model
ctrl <- trainControl(method="none",repeats = 1)
knnFit <- train(y ~ ., data = a, method = "knn", trControl = ctrl, preProcess = c("center","scale"), tuneGrid = data.frame(k = 10))
#predict on the training dataset (=useless, but should work)
knnPredict <- predict(knnFit,newdata = a, type="prob")
This gives:
Error in `[.data.frame`(out, , obsLevels, drop = FALSE) :
  undefined columns selected
Defining a more realistic test dataset "b" without the target variable y...
#generate test dataset
b = matrix(rnorm(n*8,sd=1000000),nrow = n)
b = as.data.frame(b)
colnames(b) = c(paste0("V",1:8))
#predict on the test dataset
knnPredict <- predict(knnFit,newdata = b, type="prob")
gives the same error:
Error in `[.data.frame`(out, , obsLevels, drop = FALSE) :
  undefined columns selected
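Since the error message mentions obsLevels, one thing I can check is what the levels of y look like. The sketch below uses base R only and rebuilds y on its own. It assumes that the level names matter when class probabilities are returned as data frame columns, which I have not confirmed to be the cause of the error. The labels "no" and "yes" are hypothetical, chosen only for illustration.

```r
# Rebuild the response the same way as in the example above
set.seed(1)
n <- 10000
y <- as.factor(round(runif(n)))

# The factor levels are the character strings "0" and "1"
lv <- levels(y)
print(lv)

# make.names() shows what R turns these strings into
# when they have to serve as syntactically valid column names
print(make.names(lv))

# TRUE only where a level is already a valid name
print(lv == make.names(lv))

# Hypothetical relabelling with valid names (labels are my own choice)
y2 <- factor(y, levels = c("0", "1"), labels = c("no", "yes"))
print(levels(y2))
```

If the levels and their make.names() versions differ, that mismatch is a candidate explanation for "undefined columns selected", but I would still need to refit the model with the relabelled factor to see whether the error goes away.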
I know that the column names are important, but here they are identical. What is wrong here? Thanks!