I'm having a similar issue to this post when I try to do rfe using lrFuncs. I tried their suggestions but they did not resolve my issue. Let's take the GermanCredit dataset in the caret package as an example. In this dataset all the factors (except for the target variable Class) have already been converted to binary numeric variables, so we don't need to worry about using model.matrix.

> library(caret)
> data(GermanCredit)
> GCrfe <- rfe(GermanCredit[,c(1:9,11:62)], GermanCredit[,10], sizes=(1:50), rfeControl=rfeControl(functions=lrFuncs))
Error in { : 
  task 1 failed - "rfe is expecting 61 importance values but only has 48"

Next, I look for predictors with no variance (i.e. only one unique value), leaving out the target variable Class, and remove them.

> variableVariance <- sapply(GermanCredit[-10], function(x) length(unique(x)))
> which(variableVariance==1)
      Purpose.Vacation Personal.Female.Single 
                    26                     44 
> GermanCredit <- GermanCredit[-grep('Purpose.Vacation', names(GermanCredit))]
> GermanCredit <- GermanCredit[-grep('Personal.Female.Single', names(GermanCredit))]

Now I look at correlated variables and get rid of 'duplicates'.

> Cor <- abs(cor(GermanCredit[-10]))
> diag(Cor) <- 0
> which(Cor > 0.8, arr.ind=T)
                           row col
OtherInstallmentPlans.None  52  50
OtherInstallmentPlans.Bank  50  52
> GermanCredit <- GermanCredit[-grep('OtherInstallmentPlans.Bank', names(GermanCredit))]

If I try rfe now, I still get the same error.

> GCrfe <- rfe(GermanCredit[,c(1:9,11:59)], GermanCredit[,10], sizes=(1:50), rfeControl=rfeControl(functions=lrFuncs))
Error in { : 
  task 1 failed - "rfe is expecting 58 importance values but only has 48"

Passing explicit resampling folds through rfeControl gives the same error.

> set.seed(12213)
> index <- createFolds(GermanCredit$Class, k=10, returnTrain=T)
> lrCtrl <- rfeControl(functions=lrFuncs, method='repeatedcv', index=index)
> GCrfe <- rfe(GermanCredit[,c(1:9,11:59)], GermanCredit[,10], sizes=(1:50), rfeControl=lrCtrl)
Error in { : 
  task 1 failed - "rfe is expecting 58 importance values but only has 48"

I'll be grateful for any help resolving this issue and understanding why this error occurs.

Gaurav Bansal

1 Answer

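OK, I think I figured it out. I deleted one 'level' from each set of dummy variables, as well as the two variables that had no variance, and now it works. My understanding of why: with every level present, each set of dummies sums to 1 and is therefore collinear with the intercept, so glm returns NA for the redundant coefficients and fewer importance values come back than rfe expects.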
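To illustrate the collinearity, here is a minimal sketch on made-up data in base R only. The data frame `toy` and its columns `x`, `grp.a`, `grp.b`, `grp.c` and `y` are hypothetical and only mimic the way GermanCredit encodes every level of a factor as its own dummy. I am assuming the importance values are ultimately read from the glm coefficient table, which would explain the mismatch; I have not traced the caret internals.

```r
# Toy data: a three-level factor expanded into a FULL set of dummies
# (all names here are made up for illustration).
set.seed(1)
n   <- 200
grp <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
toy <- data.frame(
  x     = rnorm(n),
  grp.a = as.numeric(grp == "a"),
  grp.b = as.numeric(grp == "b"),
  grp.c = as.numeric(grp == "c")  # redundant: grp.a + grp.b + grp.c == 1
)
toy$y <- factor(rbinom(n, 1, plogis(toy$x)))

# Same kind of model that a logistic-regression fit would use.
fit <- glm(y ~ ., data = toy, family = binomial)

# The redundant dummy is aliased with the intercept and gets an NA coefficient.
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# summary() omits aliased coefficients, so the coefficient table has
# fewer rows than there are predictors.
n_reported  <- nrow(coef(summary(fit))) - 1  # minus the intercept row
n_supplied  <- ncol(toy) - 1                 # minus the outcome column
print(c(reported = n_reported, supplied = n_supplied))

# Dropping the aliased column(s) leaves a full-rank design.
toy2 <- toy[, !(names(toy) %in% aliased), drop = FALSE]
fit2 <- glm(y ~ ., data = toy2, family = binomial)
print(anyNA(coef(fit2)))
```

The same check (`is.na(coef(fit))` on a glm fitted to all predictors) should point at the columns to drop in a real data set. caret also provides findLinearCombos() for flagging linearly dependent columns, although I have not checked how it treats the intercept here. Note that a pairwise correlation filter cannot catch this kind of dependence, because it involves more than two columns at once.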

Gaurav Bansal