I have a data frame of about 500 rows and 170 columns. I am attempting to run a classification model with svm from the e1071 package. The classification variable is called 'SEGMENT', a factor variable with 6 levels. There are three other factor variables in the data frame, and the rest are numeric.
data <- my.data.frame
# Split into training and testing sets, training.data and testing.data
.
.
.
fit <- svm(SEGMENT ~ ., data = training.data, cost = 1, kernel = 'linear',
           probability = TRUE, type = 'C-classification')
The model runs fine.
Parameters:
SVM-Type: C-classification
SVM-Kernel: linear
cost: 1
gamma: 0.0016
Number of Support Vectors: 77
( 43 2 19 2 2 9 )
Number of Classes: 6
Levels:
EE JJ LL RR SS WW
The problem arises when I try to test the model on testing.data, which is structured exactly like the training set:
x <- predict(fit, testing.data, decision.values = T, probability = T)
And then things blow up rather spectacularly:
Error in predict.svm(fit, newdata = testing, decision.values = T, probability = T) :
test data does not match model !
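One way to check the claim that the two sets are structured alike is to compare their columns, column classes, and factor levels directly. The sketch below is only a diagnostic, not a confirmed explanation of the error: `compare_structure` is a hypothetical helper written for illustration, the toy data frames and the `region` column are made up, and the idea that differing factor levels lead to a different number of model columns at prediction time is an assumption about this particular case.

```r
# Hypothetical helper: report structural differences between two data frames.
compare_structure <- function(train, test) {
  # Columns present in one set but not the other
  missing_in_test  <- setdiff(names(train), names(test))
  missing_in_train <- setdiff(names(test), names(train))

  shared <- intersect(names(train), names(test))

  # Shared columns whose class differs (e.g. factor vs. character)
  class_mismatch <- shared[!vapply(shared, function(col) {
    identical(class(train[[col]]), class(test[[col]]))
  }, logical(1))]

  # Shared factor columns whose levels differ
  factor_cols <- shared[vapply(shared, function(col) {
    is.factor(train[[col]]) && is.factor(test[[col]])
  }, logical(1))]
  level_mismatch <- factor_cols[!vapply(factor_cols, function(col) {
    identical(levels(train[[col]]), levels(test[[col]]))
  }, logical(1))]

  list(missing_in_test  = missing_in_test,
       missing_in_train = missing_in_train,
       class_mismatch   = class_mismatch,
       level_mismatch   = level_mismatch)
}

# Toy data (made up): the test set lacks one level in each factor column.
train <- data.frame(SEGMENT = factor(c("EE", "JJ", "LL")),
                    region  = factor(c("a", "b", "c")),
                    x1      = c(1.0, 2.0, 3.0))
test  <- data.frame(SEGMENT = factor(c("EE", "JJ")),
                    region  = factor(c("a", "b")),
                    x1      = c(4.0, 5.0))

result <- compare_structure(train, test)
print(result$level_mismatch)
```

On the data in this question, the equivalent call would be `compare_structure(training.data, testing.data)`. If any element of the returned list is non-empty, the two sets are not in fact structured identically, and that would be the first thing to look at.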
Ideas are most welcome.