I would like to perform wrapper-based feature selection on the iris data set using the mlr package, but I only want to consider groups of features associated with Petal and/or Sepal. So instead of searching over combinations of 4 individual features, the wrapper routine would search over combinations of two feature groups.

The mlr documentation states this can be done using two arguments, bit.names and bits.to.features:

bit.names [character] Names of bits encoding the solutions. Also defines the total number of bits in the encoding. Per default these are the feature names of the task.

bits.to.features [function(x, task)] Function which transforms an integer-0-1 vector into a character vector of selected features. Per default a value of 1 in the ith bit selects the ith feature to be in the candidate solution.

I could not find any examples of usage of these two arguments in mlr tutorials or elsewhere.

I will use the example provided in ?mlr::selectFeatures.

First, operating on all the features:

library(mlr)
rdesc <- makeResampleDesc("Holdout")
ctrl <- makeFeatSelControlSequential(method = "sfs",
                                    maxit = NA)
res <- selectFeatures("classif.rpart",
                     iris.task,
                     rdesc,
                     control = ctrl)
analyzeFeatSelResult(res)

This works as expected.

In order to run over groups of features I design a 0/1 matrix to map features to groups (I am not sure if this is the way to go, it just seemed logical):

mati <- rbind(
  c(0,0,1,1),
  c(1,1,0,0))

rownames(mati) <- c("Petal", "Sepal")
colnames(mati) <- getTaskFeatureNames(iris.task)

The matrix looks like:

      Sepal.Length Sepal.Width Petal.Length Petal.Width
Petal            0           0            1           1
Sepal            1           1            0           0

and now I run:

res <- selectFeatures("classif.rpart",
                     iris.task,
                     rdesc,
                     control = ctrl,
                     bit.names = c("Petal", "Sepal"),
                     bits.to.features = function(x = mati, task) mlr:::binaryToFeatures(x, getTaskFeatureNames(task)))

analyzeFeatSelResult(res)
#output
Features         : 1
Performance      : mmce.test.mean=0.0200000
Sepal

Path to optimum:
- Features:    0  Init   :                       Perf = 0.66  Diff: NA  *
- Features:    1  Add    : Sepal                 Perf = 0.02  Diff: 0.64  *

Stopped, because no improving feature was found.

This appears to do what I need, but I am not sure I defined the bits.to.features argument correctly.

But when I try to use the same approach in a wrapper:

outer <- makeResampleDesc("CV", iters = 2L)
inner <- makeResampleDesc("Holdout")
ctrl <- makeFeatSelControlSequential(method = "sfs",
                                     maxit = NA)


lrn <- makeFeatSelWrapper("classif.rpart",
                          resampling = inner,
                          control = ctrl,
                          bit.names = c("Petal", "Sepal"),
                          bits.to.features = function(x = mati, task) mlr:::binaryToFeatures(x, getTaskFeatureNames(task)))


r <- resample(lrn, iris.task, outer, extract = getFeatSelResult)

I receive an error:

Resampling: cross-validation
Measures:             mmce      
[FeatSel] Started selecting features for learner 'classif.rpart'
With control class: FeatSelControlSequential
Imputation value: 1
[FeatSel-x] 1: 00 (0 bits)
[FeatSel-y] 1: mmce.test.mean=0.7200000; time: 0.0 min
[FeatSel-x] 2: 10 (1 bits)
[FeatSel-y] 2: mmce.test.mean=0.0800000; time: 0.0 min
[FeatSel-x] 2: 01 (1 bits)
[FeatSel-y] 2: mmce.test.mean=0.0000000; time: 0.0 min
[FeatSel-x] 3: 11 (2 bits)
[FeatSel-y] 3: mmce.test.mean=0.0800000; time: 0.0 min
[FeatSel] Result: Sepal (1 bits)
Error in `[.data.frame`(df, , j, drop = drop) : 
  undefined columns selected

What am I doing wrong, and what is the correct usage of the bit.names and bits.to.features arguments?

Thanks

EDIT: I posted an issue on mlr github: https://github.com/mlr-org/mlr/issues/2468

missuse
  • This issue is now resolved and resampling the FeatSelWrapper should work now with custom bits as well. – jakob-r Nov 09 '18 at 10:43

1 Answer

I guess you found two bugs. The first is that your code runs at all, and the second is that this does not work with nested resampling.

Bug 1: Your code should not run

First of all, mati does not have any effect: it is only the default value of the argument x, and x is passed explicitly in every internal call of bits.to.features, so the default is never used.
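A minimal base R sketch of why the default never takes effect. The function f below is a hypothetical stand-in for bits.to.features that simply returns its x argument:

```r
# Same matrix as in the question
mati <- rbind(c(0, 0, 1, 1),
              c(1, 1, 0, 0))

# Hypothetical stand-in for bits.to.features: just hands back x
f <- function(x = mati, task) x

# The caller supplies x, so the default (mati) is ignored
with_x <- f(c(1, 0), task = NULL)

# Only when x is omitted does the default get used
without_x <- f(task = NULL)

print(with_x)
print(without_x)
```

Since the bit vector is always supplied as x, the second case never occurs during feature selection.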

When you defined the bit.names "Petal" and "Sepal", you basically just told mlr to use two bits, so the feature selection works with the vectors 00, 01, 10 and 11. Unfortunately, R automatically recycles these vectors to length 4, so 10 becomes 1010:

mlr:::binaryToFeatures(c(1,0), getTaskFeatureNames(iris.task))
# [1] "Sepal.Length" "Petal.Length"

That is the first bug: mlr should not allow this vector recycling.
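The recycling can be reproduced in plain R without mlr. This sketch assumes the internal helper boils down to logical indexing of the feature names, which matches the output above but is an assumption about mlr's internals. strict_select is a hypothetical guard, not an mlr function:

```r
feats <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
bits <- c(1, 0)

# A length-2 logical index is recycled over the 4 names: TRUE FALSE TRUE FALSE
recycled <- feats[as.logical(bits)]
print(recycled)

# Hypothetical guard: refuse bit vectors whose length differs from the names
strict_select <- function(x, all.vars) {
  stopifnot(length(x) == length(all.vars))
  all.vars[as.logical(x)]
}

# With the guard, the mismatch becomes an error instead of a silent recycle
guarded <- try(strict_select(bits, feats), silent = TRUE)
print(inherits(guarded, "try-error"))
```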

To make the code run as intended, you could define the function bits.to.features like this:

bitnames = c("Sepal", "Petal")
btf = function(x, task) {
  sets = list(
    c("Sepal.Length", "Sepal.Width"), 
    c("Petal.Length", "Petal.Width")
  )
  res = unlist(sets[as.logical(x)])
  if (is.null(res)) {
    return(character(0L))
  } else {
    return(res)  
  }
}

res <- selectFeatures("classif.rpart", iris.task, rdesc, 
  control = ctrl, bits.to.features = btf, bit.names = bitnames)

Explanation of btf

Quoting the help page of selectFeatures:

[function(x, task)] Function which transforms an integer-0-1 vector into a character vector of selected features. Per default a value of 1 in the ith bit selects the ith feature to be in the candidate solution.

So x is a vector containing 0s and 1s (e.g. c(0,0,1,0)). If you don't change that function, it returns the name of the third feature for this x ("Petal.Length" for iris). The vector x always has the same length as the defined bit.names. The resulting character vector, however, can be of any length; it just has to contain valid feature names for the task.
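To make the mapping concrete, here is a small base R sketch that applies a group mapping to all four possible bit vectors. group_bits is a hypothetical helper that mirrors the logic of the function above but drops the task argument so it runs without mlr:

```r
# One entry per bit, in the same order as the bit names
sets <- list(
  Sepal = c("Sepal.Length", "Sepal.Width"),
  Petal = c("Petal.Length", "Petal.Width")
)

# Hypothetical helper: turn a 0/1 vector over groups into feature names
group_bits <- function(x) {
  res <- unlist(sets[as.logical(x)], use.names = FALSE)
  # unlist() of an empty list is NULL, so return an empty character vector
  if (is.null(res)) character(0L) else res
}

# All candidate solutions for two bits
candidates <- list(c(0, 0), c(1, 0), c(0, 1), c(1, 1))
for (x in candidates) {
  cat(paste(x, collapse = ""), "->", paste(group_bits(x), collapse = ", "), "\n")
}
```

Each input has length 2 (the number of bits), while the output has length 0, 2 or 4.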

In the example I hardcoded the feature names into the function btf. This is bad practice if you want to apply the function to many different tasks. That is why mlr gives you access to the task object, and therefore to the feature names through getTaskFeatureNames(task), so you can generate the feature names programmatically instead of hard-coding them.
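One possible way to avoid the hard-coded names is to derive each group from a name prefix. This is only a sketch: bits_to_group_features is a hypothetical helper, and grouping by prefix is an assumption that happens to fit the iris column names:

```r
# Hypothetical helper: select all features whose name starts with the
# prefix of any group whose bit is set to 1
bits_to_group_features <- function(x, feature_names, prefixes) {
  stopifnot(length(x) == length(prefixes))
  selected <- prefixes[as.logical(x)]
  keep <- vapply(
    feature_names,
    function(f) any(startsWith(f, selected)),
    logical(1)
  )
  unname(feature_names[keep])
}

bitnames <- c("Sepal", "Petal")

# Wrapper with the signature bits.to.features expects; the feature names
# come from the task rather than being hard-coded
btf_generic <- function(x, task) {
  bits_to_group_features(x, mlr::getTaskFeatureNames(task), bitnames)
}

# The core logic can be checked without mlr
iris_features <- setdiff(names(iris), "Species")
print(bits_to_group_features(c(0, 1), iris_features, bitnames))
```

btf_generic would then be passed as bits.to.features together with bit.names = bitnames, in the same way as the hard-coded version.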

Bug 2: The bit.names have to be feature names

The feature selection returns the bit names as its result. mlr then tries to select columns with these names in the data set, but they are not present because (in your case) the bit names are unrelated to the feature names. This bug is now resolved in the GitHub version of mlr.
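The error itself is ordinary data frame behaviour and can be reproduced in base R. This sketch only illustrates the failure mode and is not mlr's actual code path:

```r
# The feature selection result contains a bit name, not a column name
bit_result <- "Sepal"

# Selecting a column that does not exist fails in `[.data.frame`
msg <- tryCatch({
  iris[, bit_result, drop = FALSE]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Translating the bit name into real feature names first works
real_features <- c("Sepal.Length", "Sepal.Width")
subset_ok <- iris[, real_features, drop = FALSE]
print(ncol(subset_ok))
```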

jakob-r
  • Thank you. Could you elaborate on the `btf` function: why does it have a `task` argument which is not used (just because of the check later on in the function?), and what is the argument `x` in there? Is `x` in fact the feature vector? So in this case, where there are two bits, the `sets` list is being subset by `c(0,0)`, `c(1,0)` and so on? Isn't `bitnames = c("Sepal", "Petal")` contradictory to Bug 2 (should it not be permitted?). The code produces the same error when I attempt it with `makeFeatSelWrapper`. – missuse Nov 06 '18 at 08:41
  • How would one need to define `bitnames` so it runs during nested re-sampling? And still be able to do feature selection by groups of features. – missuse Nov 06 '18 at 08:45
  • 1) I added an explanation which hopefully clarifies how it works 2) This is not an easy fix because like in the example the result might be that "Sepal" and "Petal" together give the best performance. Now the result of the Feature Selection is the vector `c("Sepal", "Petal")`. Then mlr tries to select these features but they can not be found because they don't exist. So it has to be fixed inside mlr. – jakob-r Nov 06 '18 at 09:22