One class SVM always returns FALSE

Question

I have a project in my ML course about anomaly/novelty detection and decided to study the One-class SVM algorithm as described in this paper: http://research.microsoft.com/pubs/69731/tr-99-87.pdf. In the package e1071 in R there is an svm function that seems to support one-class classification. However, when I try to use it the predictor always returns false (even on the training set, which is the weirdest thing). Here is my code :

library(e1071) # for svm classifier
library(IMIFA) # for USPS dataset
library(caret) # for confusion matrices

data(USPSdigits)

digits.train <- USPSdigits$train
digits.train <- digits.train[order(digits.train$V1), ]
digits.train$is.zero[digits.train$V1 == 0] <- "TRUE"
digits.train$is.zero[digits.train$V1 != 0] <- "FALSE"
digits.test <- USPSdigits$test
digits.test <- digits.test[order(digits.test$V1), ]
digits.test$is.zero[digits.test$V1 == 0] <- "TRUE"
digits.test$is.zero[digits.test$V1 != 0] <- "FALSE"

digits.train.features <- digits.train[digits.train$V1 == 0, -c(1, 258)]
digits.train.labels <- digits.train[digits.train$V1 == 0, 258]
digits.train.nu <- 0.5
digits.train.bandwith <- 0.5*256
digits.train.model <- svm(x = digits.train.features, type = 'one-classification', kernel = 'radial', nu = digits.train.nu, gamma = digits.train.bandwith)
digits.train.fitted <- predict(digits.train.model, digits.train.features)
digits.train.confusionMatrix <- table(Predicted = digits.train.fitted, Reference = digits.train.labels)
print(digits.train.confusionMatrix)

digits.test.features <- subset(digits.test, select = -c(is.zero, V1))
digits.test.labels <- digits.test$is.zero
digits.test.fitted <- predict(digits.train.model, digits.test.features)
digits.test.confusionMatrix <- table(Predicted = digits.test.fitted, Reference = digits.test.labels)
print(digits.test.confusionMatrix)

and my output is :

> print(digits.train.confusionMatrix)
         Reference
Predicted TRUE
    FALSE 1194
> print(digits.test.confusionMatrix)
         Reference
Predicted FALSE TRUE
    FALSE  1648  359

What am I doing wrong?

Does [this post](https://stackoverflow.com/questions/27375517/one-class-classification-with-svm-in-r) help you? — phiver, Nov 27 '21 at 13:43
I have already seen this post and my code is inspired by it, but on this more complicated dataset it doesn't seem to work — tudoroancea, Nov 27 '21 at 15:45

score 0 · Answer 1 · answered Nov 27 '21 at 16:42

I created a working example. Adjust naming conventions to your own. I have used a very expressive naming convention to show exactly what I did.

Training on only 1 number results in a lot of columns with the same values. These return as errors in svm and should be removed. The function nearZeroVar from caret is an ideal function for this. If you ever use the recipes package, it is called step_nzv.

library(e1071)
# library(caret) # caret used for nearZeroVar function.

# data steps like OP's
digits.train <- USPSdigits$train
digits.test <- USPSdigits$test

digits.train$is.zero[digits.train$V1 == 0] <- "TRUE"
digits.train$is.zero[digits.train$V1 != 0] <- "FALSE"


digits.test$is.zero[digits.test$V1 == 0] <- "TRUE"
digits.test$is.zero[digits.test$V1 != 0] <- "FALSE"


train_the_positives <- subset(digits.train, is.zero == "TRUE")

# get the columns with 99% of the same values
cols_to_remove <- caret::nearZeroVar(train_the_positives, freqCut = 99/1)

svm.model <- svm(train_the_positives[, -cols_to_remove],
                 type = 'one-classification',
                 nu = 0.10,
                 kernel = "radial")

# predictions on train_the_positives set
svm_predictions_on_train_the_positives <- predict(svm.model, train_the_positives[, -cols_to_remove])
table(Predicted = svm_predictions_on_train_the_positives,
      Reference = train_the_positives$is.zero)

         Reference
Predicted TRUE
    FALSE  121
    TRUE  1073

# predictions on full train set
svm_prediction_on_full_train_set <- predict(svm.model, digits.train[, -cols_to_remove])
table(Predicted = svm_prediction_on_full_train_set,
      Reference = digits.train$is.zero)

         Reference
Predicted FALSE TRUE
    FALSE  6069  121
    TRUE     28 1073


# predictions on test set
svm_prediction_on_test_set <- predict(svm.model, digits.test[, -cols_to_remove])
table(Predicted = svm_prediction_on_test_set,
      Reference = digits.test$is.zero)

         Reference
Predicted FALSE TRUE
    FALSE  1638   68
    TRUE     10  291

My god, it works as wonder !! Thank you very much !! Do you have an idea where the error could come from though? Rather from a mathematical aspect or rather a computation aspect? — tudoroancea, Nov 28 '21 at 08:48
@tudoroancea, I noticed when I ran your svm, it generated an error in the calculations when running, but still created a model. Problably due to some stationary variables. Also, please accept the answer as that will mark the question as closed. — phiver, Nov 28 '21 at 09:28

One class SVM always returns FALSE

1 Answers1