I keep getting the "'train' and 'class' have different lengths" error when calling knn() from the class package on my dataset.
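# Move column 14 (the class label) to the front
newDF <- newDF[c(14, 1:13)]
newDF
str(newDF)
# Keep only the numeric columns
newDF1 <- newDF[c(2:11, 14)]
newDF1
# Split the rows into training and test sets
df_train <- newDF1[1:47385, ]
dim(df_train)
df_test <- newDF1[47386:59231, ]
dim(df_test)
# Labels come from the first column of newDF, same row ranges
train_lbl <- newDF[1:47385, 1]
test_lbl <- newDF[47386:59231, 1]
dim(train_lbl)
install.packages("class")  # only needed once
library(class)
newDF_pred <- knn(train = df_train, test = df_test, cl = train_lbl, k = 245)
library(gmodels)  # provides CrossTable()
CrossTable(x = test_lbl, y = newDF_pred, prop.chisq = FALSE)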
newDF is my entire dataset, while newDF1 contains only the numeric ("num") columns.
Where is the issue and how can I fix it?
This is the data:
-10lgP Mass Length ppm m/z RT start end Intensity Sample 9 Precursor Id range
1 0.543 0.234 0.245 0.348 0.0310 0.543 0.234 0.245 0.348 0.0310 0.0254
2 0.198 0.476 0.499 0.348 0.588 0.198 0.476 0.499 0.348 0.588 0.0256
3 0.234 0.245 0.348 0.0310 0.543 0.234 0.245 0.348 0.0310 0.543 0.0255
4 0.476 0.499 0.348 0.588 0.198 0.476 0.499 0.348 0.588 0.198 0.0254
5 0.245 0.348 0.0310 0.543 0.234 0.245 0.348 0.0310 0.543 0.234 0.0254
6 0.499 0.348 0.588 0.198 0.476 0.499 0.348 0.588 0.198 0.476 0.0256
7 0.348 0.0310 0.543 0.234 0.245 0.348 0.0310 0.543 0.234 0.245 0.0255
8 0.348 0.588 0.198 0.476 0.499 0.348 0.588 0.198 0.476 0.499 0.0254
9 0.0310 0.543 0.234 0.245 0.348 0.0310 0.543 0.234 0.245 0.348 0.0254
10 0.588 0.198 0.476 0.499 0.348 0.588 0.198 0.476 0.499 0.348 0.0256
... with 59,221 more rows
The dimensions of the class labels and the training set are as follows:
dim(train_lbl)
[1] 47385 1
dim(df_train)
[1] 47385 11
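One possible reading of these dimensions, sketched under the assumption that newDF is a tibble: indexing a tibble with [rows, 1] keeps a one-column table rather than returning a vector, and length() of such a table counts its columns (1), not its rows. knn() compares the length of cl with the number of training rows, so the two would then disagree. The toy data frame and the names toy, lbl_table and lbl_vector below are hypothetical; drop = FALSE is used on a base data.frame to mimic the tibble behaviour without needing the tibble package.

```r
library(class)

# Hypothetical toy data standing in for newDF (not the real dataset).
toy <- data.frame(label = c("a", "b", "a", "b"),
                  x1 = c(0.1, 0.2, 0.3, 0.4),
                  x2 = c(0.5, 0.6, 0.7, 0.8))

# drop = FALSE mimics how a tibble behaves with [rows, 1]:
# the result is still a one-column table, not a vector.
lbl_table <- toy[1:3, 1, drop = FALSE]
dim(lbl_table)      # has two dimensions, like dim(train_lbl) in the question
length(lbl_table)   # counts columns, not rows

# [[ ]] extracts the column itself, which can then be subset by row.
lbl_vector <- toy[[1]][1:3]
length(lbl_vector)  # counts elements, one per training row

# With a vector (or factor) of labels, the lengths line up for knn().
pred <- knn(train = toy[1:3, 2:3],
            test = toy[4, 2:3],
            cl = factor(lbl_vector),
            k = 1)
pred
```

Applied to the code in the question, the equivalent change would be something like train_lbl <- newDF[[1]][1:47385], and likewise for test_lbl, though that is untested against the real data.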