I'm trying to fit a binary SVM in R, and I get the following error message:

Error in if (any(co)) { : missing value where TRUE/FALSE needed

For the following code:

library(e1071)
dataset <- read.csv("C:/Users/Backup/Desktop/pos.csv")
# Subset the dataset to only 2 labels and 2 features
dataset.part = subset(dataset, label != 1)
dataset.part$label = factor(dataset.part$label)


# Fit svm model
fit = svm(label ~ ., data=dataset.part, type='C-classification', kernel='linear')

I'm getting the error on this line of code:

# Fit svm model
fit = svm(label ~ ., data=dataset.part, type='C-classification', kernel='linear')

I'm a beginner in R and I don't know how to solve this. Could anyone help me?

sh ze
  • Which line are you getting the error on? You should create a *minimal* [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can help troubleshoot. This means you need to include sample data, otherwise we have no idea what you are feeding into these functions. – MrFlick Apr 21 '15 at 16:57
  • I'm getting the error on this line of code: `fit = svm(label ~ ., data=dataset.part, type='C-classification', kernel='linear')`. As for the data, I'm feeding these functions positive and negative examples; the data is numerical. – sh ze Apr 21 '15 at 17:04
  • Well, then get rid of all the extra lines of code. They are irrelevant to the question. Also, follow the suggestions on the provided link to provide a data set similar to your own that will reproduce the error. – MrFlick Apr 21 '15 at 17:05

1 Answer

I ran into the same problem and narrowed it down to an infinite value in one of my rows. My suggestion is to preprocess the data first. In my case I can afford to simply omit NAs and infinite values:

# First cast Inf (and -Inf) to NA
is.na(df) <- sapply(df, is.infinite)
# Now omit rows containing NA; na.omit() returns a new data frame,
# so assign the result back
df <- na.omit(df)
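
Before dropping rows, it can help to see where the bad values actually are. The following is a minimal sketch, not part of the original answer, assuming `df` is a data frame whose feature columns are numeric; the toy values are made up for illustration.

```r
# Toy data frame (hypothetical values) with one Inf and one NA
df <- data.frame(x = c(1, 2, Inf, 4),
                 y = c(0.5, NA, 1.5, 2.5),
                 label = factor(c(0, 2, 0, 2)))

# Only numeric columns can hold Inf/NaN, so restrict the check to them
num_cols <- sapply(df, is.numeric)

# Logical matrix: TRUE where a value is NA, NaN, Inf or -Inf
non_finite <- !sapply(df[num_cols], is.finite)

# Number of non-finite entries per numeric column
bad_counts <- colSums(non_finite)
print(bad_counts)

# Indices of rows containing at least one non-finite value
bad_rows <- which(rowSums(non_finite) > 0)
print(bad_rows)
```

If `bad_rows` is short, inspecting those rows directly may be more informative than removing them blindly.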

However, for general problem solving, @MrFlick's suggestion is a good one. If all else fails, it is probably easiest to do a binary search on the data set to see which data points are causing trouble.
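
The binary search could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: `find_bad_row` and `has_nonfinite` are hypothetical helper names, and the approach assumes that a single row is enough to trigger the failure on its own.

```r
# Bisect the rows of `data` to find one row for which `fails()` is TRUE.
# `fails` is a hypothetical predicate: it takes a data frame and returns
# TRUE when that subset triggers the problem.
find_bad_row <- function(data, fails) {
  if (!fails(data)) return(NA_integer_)    # nothing to find
  rows <- seq_len(nrow(data))
  while (length(rows) > 1) {
    half <- rows[seq_len(length(rows) %/% 2)]
    if (fails(data[half, , drop = FALSE])) {
      rows <- half                          # culprit is in the first half
    } else {
      rows <- setdiff(rows, half)           # otherwise search the second half
    }
  }
  rows
}

# Stand-in predicate for illustration only: flags any non-finite value
has_nonfinite <- function(d) any(!is.finite(unlist(d)))

# Toy data (made-up values) with an Inf in row 4
toy <- data.frame(x = c(1, 2, 3, Inf, 5), y = c(5, 4, 3, 2, 1))
bad <- find_bad_row(toy, has_nonfinite)
print(bad)
```

For the question's case, the predicate could instead wrap the `svm()` call in `try(..., silent = TRUE)` and check the result with `inherits(result, "try-error")`. Keep in mind that a subset containing only one label may fail for unrelated reasons, so the halves may need to be chosen with both classes present.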

Greg