I have a project in my ML course about anomaly/novelty detection and decided to study the One-class SVM algorithm as described in this paper: http://research.microsoft.com/pubs/69731/tr-99-87.pdf. In the package e1071
in R there is an svm
function that seems to support one-class classification. However, when I try to use it the predictor always returns false (even on the training set, which is the weirdest thing). Here is my code :
library(e1071) # for svm classifier
library(IMIFA) # for USPS dataset
library(caret) # for confusion matrices
data(USPSdigits)
digits.train <- USPSdigits$train
digits.train <- digits.train[order(digits.train$V1), ]
digits.train$is.zero[digits.train$V1 == 0] <- "TRUE"
digits.train$is.zero[digits.train$V1 != 0] <- "FALSE"
digits.test <- USPSdigits$test
digits.test <- digits.test[order(digits.test$V1), ]
digits.test$is.zero[digits.test$V1 == 0] <- "TRUE"
digits.test$is.zero[digits.test$V1 != 0] <- "FALSE"
digits.train.features <- digits.train[digits.train$V1 == 0, -c(1, 258)]
digits.train.labels <- digits.train[digits.train$V1 == 0, 258]
digits.train.nu <- 0.5
digits.train.bandwith <- 0.5*256
digits.train.model <- svm(x = digits.train.features, type = 'one-classification', kernel = 'radial', nu = digits.train.nu, gamma = digits.train.bandwith)
digits.train.fitted <- predict(digits.train.model, digits.train.features)
digits.train.confusionMatrix <- table(Predicted = digits.train.fitted, Reference = digits.train.labels)
print(digits.train.confusionMatrix)
digits.test.features <- subset(digits.test, select = -c(is.zero, V1))
digits.test.labels <- digits.test$is.zero
digits.test.fitted <- predict(digits.train.model, digits.test.features)
digits.test.confusionMatrix <- table(Predicted = digits.test.fitted, Reference = digits.test.labels)
print(digits.test.confusionMatrix)
and my output is :
> print(digits.train.confusionMatrix)
Reference
Predicted TRUE
FALSE 1194
> print(digits.test.confusionMatrix)
Reference
Predicted FALSE TRUE
FALSE 1648 359
What am I doing wrong?