
I am posting this because the post "feature selection in caret" hasn't helped with my issue, and I have two questions about the feature selection function in the caret package.

When I run the code below on my gene expression matrix allsamplecombat, with 5 classes defined in y:

    control <- rfeControl(functions = rfFuncs, method = "cv", number = 10)
    results <- rfe(t(allsamplecombat[filter, ]), y = factor(info$clust),
                   sizes = c(300, 400, 500, 600, 700, 800, 1000, 1200),
                   rfeControl = control)

I get output like this:

[Image: output of rf]

So, I want to know whether I can extract the top features for each class, because predictors(results) only gives me the selected features without indicating their importance for each class.

My second problem is that when I change the rfeControl functions to treebagFuncs and run the 'parRF' method:

    control <- rfeControl(functions = treebagFuncs, method = "cv", number = 5)
    results <- rfe(t(allsamplecombat[filter, ]), y = factor(info$clust),
                   sizes = c(400, 500, 600, 700, 800),
                   rfeControl = control, method = "parRF")

I get this error:

    Error in { : task 1 failed - "subscript out of bounds"

What is wrong with my code?

jmuhlenkamp
Seymoo

1 Answer


For the importances, the object returned by rfe has a sub-object called variables (i.e. results$variables) that contains this information for each step of the elimination.
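As a rough illustration of reading that sub-object, here is a minimal sketch on the built-in iris data rather than the gene expression matrix from the question. It assumes the caret and randomForest packages are installed, and it assumes that with rfFuncs the variables data frame carries one importance column per class level; that is worth confirming with colnames() on your own result, since the layout can differ between function sets. The names res, vars, class_cols and top_per_class are made up for this example.

```r
library(caret)
library(randomForest)

set.seed(1)
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
res  <- rfe(iris[, 1:4], iris$Species, sizes = c(2, 3), rfeControl = ctrl)

# One row per predictor, per subset size, per resample
head(res$variables)
colnames(res$variables)

# Keep only the rows that belong to the subset size rfe selected
vars <- res$variables[res$variables$Variables == res$optsize, ]

# Assumption: per-class importance columns are named after the class levels
class_cols <- intersect(levels(iris$Species), colnames(vars))

# Average each predictor's importance over resamples, separately per class
top_per_class <- lapply(class_cols, function(cls) {
  sort(tapply(vars[[cls]], vars$var, mean), decreasing = TRUE)
})
names(top_per_class) <- class_cols
top_per_class
```

If class_cols comes back empty, the function set in use only reports an overall importance, and only the Overall column can be ranked.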

treebagFuncs is designed to work with ipred's bagging function and isn't related to random forest.

You would probably use caretFuncs and pass method to that. However, if you are going to parallelize something, parallelize the resampling loop and not the model function; this is generally more efficient. Note that if you do both with M workers, you might actually get M^3 workers (one level for rfe, one for train, and one for parRF). There are options in rfe and train to turn their parallelism off.
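A minimal sketch of that setup, again on iris and not on the data from the question: it assumes the doParallel package as the parallel backend (any foreach backend should do), uses the plain "rf" method instead of parRF so the model itself stays sequential, and turns off parallelism inside train through trainControl(allowParallel = FALSE). The worker count and the fold numbers are arbitrary choices for the example.

```r
library(caret)
library(doParallel)

# Register a small cluster; rfe's outer resampling loop will use it
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

set.seed(1)
ctrl <- rfeControl(functions = caretFuncs, method = "cv", number = 5,
                   allowParallel = TRUE)

# Extra arguments (method, trControl) are passed on to train() by caretFuncs
res_par <- rfe(iris[, 1:4], iris$Species, sizes = c(2, 3),
               rfeControl = ctrl,
               method = "rf",
               trControl = trainControl(method = "cv", number = 3,
                                        allowParallel = FALSE))

stopCluster(cl)
registerDoSEQ()  # fall back to sequential execution afterwards

predictors(res_par)
```

Keeping a single level parallel (here the rfe resampling loop) avoids multiplying the number of worker processes.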

topepo