Following up on the post "Filtering out columns in R": the columns consisting entirely of 1's or entirely of 0's were successfully eliminated from the training_data. However, the classification algorithm still complains about columns where almost all of the values are 0, with only one or two nonzero entries.
I am using the penalizedSVM R package to perform feature selection. Looking more closely at the data set, the function svm.fs complains about the columns where all but one or two of the values are 0.
How can I modify (or add something to) the following code so that these columns are filtered out as well?
lambda1.scad<-c(seq(0.01, 0.05, .01), seq(0.1, 0.5, 0.2), 1)
lambda1.scad<-lambda1.scad[2:3]
seed <- 123
# keep numeric columns that are neither all 1's nor all 0's
f0 <- function(x) is.numeric(x) && any(x != 1) && any(x != 0)
trainingdata <- lapply(trainingdata, function(data) cbind(label=data$label,
                       colwise(identity, f0)(data)))
datax <- trainingdata[[1]]
levels(datax$label) <- c(-1, 1)
train_x<-datax[, -1]
train_x<-data.matrix(train_x)
trainy<-datax[, 1]
idx <- is.na(train_x) | is.infinite(train_x)
train_x[idx] <- 0
# assign the result of tryCatch so scad.fix holds either the fit or the error object
scad.fix <- tryCatch(svm.fs(train_x, y=trainy, fs.method="scad",
                            cross.outer=0, grid.search="discrete",
                            lambda1.set=lambda1.scad, parms.coding="none",
                            show="none", maxIter=1000, inner.val.method="cv",
                            cross.inner=5, seed=seed, verbose=FALSE),
                     error=function(e) e)
Alternatively, I am open to an entirely different solution.
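To make the kind of filter I have in mind concrete, here is a minimal base R sketch. It assumes a column should be dropped whenever fewer than 3 of its values differ from its most frequent value; that threshold, the helper name has_enough_variation, and the toy data frame are all made up for illustration and are not part of penalizedSVM or plyr.

```r
# Hypothetical helper: TRUE when a numeric column has at least `min_minority`
# finite entries that differ from its most frequent value.
has_enough_variation <- function(x, min_minority = 3) {
  if (!is.numeric(x)) return(FALSE)
  x <- x[is.finite(x)]                 # ignore NA / Inf when counting
  if (length(x) == 0) return(FALSE)
  counts <- table(x)                   # frequency of each distinct value
  (length(x) - max(counts)) >= min_minority
}

# Toy data standing in for one element of trainingdata (illustrative only)
toy <- data.frame(
  label       = factor(c("a", "b", "a", "b", "a", "b")),
  all_zero    = c(0, 0, 0, 0, 0, 0),
  almost_zero = c(0, 0, 1, 0, 0, 0),
  two_ones    = c(0, 1, 0, 0, 1, 0),
  mixed       = c(0, 1, 1, 0, 1, 0)
)

# Apply the predicate to every feature column (everything except the label)
keep <- vapply(toy[-1], has_enough_variation, logical(1))

# Rebuild the data frame with the label plus the surviving columns
filtered <- cbind(label = toy$label, toy[-1][, keep, drop = FALSE])
kept_names <- names(filtered)
```

On the toy data only the mixed column survives next to label. If this is a sensible direction, I assume the same predicate could be passed to colwise in place of f0, in the same way f0 is used above, but I have not verified that this is enough to stop svm.fs from complaining.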