I am trying to use caret's rfe function for feature selection. My code worked just a few days ago, but now I am getting a "subscript out of bounds" error. The odd part is that I can run rfe on built-in data from another package without any problem, which suggests the issue is with my data, but I can't figure out what. Any suggestions?

WORKS

# load the libraries

library(mlbench)
library(caret)
library(randomForest)

# load the data

data(PimaIndiansDiabetes)

# define the control using a random forest selection function

control <- rfeControl(functions = rfFuncs2, method = "cv", number = 10)

# run the RFE algorithm

results <- rfe(PimaIndiansDiabetes[, 1:8], PimaIndiansDiabetes[, 9], sizes = c(1:8), rfeControl = control)

DOESN'T WORK

results <- rfe(stores[, 10:33], stores[, 8], sizes = c(1:24), rfeControl = control)

My data frame "stores" consists of continuous variables (columns 10:33) and a grouping variable (column 8).

Any Thoughts?

Error Message

StupidWolf

2 Answers

I ran into the same problem. You can try as.factor(unlist(stores[,8])).
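One possible reason unlist helps, offered as an assumption since the question does not say how stores was loaded: if stores is a tibble (or is subset with drop = FALSE), stores[,8] stays a one-column table instead of becoming a vector. A minimal sketch in base R, using a made-up stand-in for stores with hypothetical column names:

```r
# Hypothetical stand-in for `stores`; the column names are invented for illustration.
stores <- data.frame(
  group = c("A", "B", "A", "B"),
  x1 = c(1.2, 3.4, 2.2, 0.7),
  stringsAsFactors = FALSE
)

# drop = FALSE mimics tibble-style subsetting: the result is still a data frame.
y_df <- stores[, 1, drop = FALSE]
is.data.frame(y_df)

# unlist() flattens the one-column data frame to a plain vector,
# and as.factor() then converts that vector to a factor.
y <- as.factor(unlist(y_df))
is.factor(y)
length(y) == nrow(stores)
```

Passing y as the second argument to rfe would then give it a factor vector rather than a one-column table.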

Tttori

This error indicates that your 'y' variable (e.g., stores[,8]) is stored as a character vector rather than a factor. Convert it to a factor with the following snippet and rfe will run: stores[,8] <- as.factor(stores[,8])
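A quick way to check this before calling rfe is to inspect the class of the outcome column. The sketch below uses a made-up stand-in for stores (hypothetical column names and random values) and the stock rfFuncs from caret rather than the rfFuncs2 object in the question, whose definition is not shown. The rfe call only runs if caret and randomForest are installed.

```r
# Hypothetical stand-in for `stores`; names and values are invented.
set.seed(1)
stores <- data.frame(
  group = sample(c("A", "B"), 60, replace = TRUE),
  x1 = rnorm(60),
  x2 = rnorm(60),
  x3 = rnorm(60),
  stringsAsFactors = FALSE
)

# Inspect the outcome column: here it is character, not factor.
class(stores[, 1])

# Convert in place, as suggested in the answer.
stores[, 1] <- as.factor(stores[, 1])
class(stores[, 1])

# Run RFE only when the required packages are available.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("randomForest", quietly = TRUE)) {
  library(caret)
  # rfFuncs is caret's built-in random forest helper set (an assumption
  # about what the asker's rfFuncs2 was based on).
  control <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
  results <- rfe(stores[, 2:4], stores[, 1], sizes = 1:3, rfeControl = control)
  print(results)
}
```

If class() already reports "factor" for your outcome column, the cause is likely elsewhere and this conversion will not change anything.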
