I am creating custom learners; in particular, I am trying to use the h2o machine learning algorithms within the mlr framework. The 'hidden' parameter of the h2o.deeplearning function is an integer vector, which I want to tune. I defined the 'hidden' parameter as follows:

makeRLearner.classif.h2o_dl = function() {
  makeRLearnerClassif(
    cl = "classif.h2o_dl",
    package = "h2o",
    par.set = makeParamSet(
      makeDiscreteLearnerParam(id = "activation",
        values = c("Rectifier", "Tanh", "TanhWithDropout", "RectifierWithDropout", "Maxout", "MaxoutWithDropout")),
      makeNumericLearnerParam(id = "epochs", default = 10, lower = 1),
      makeNumericLearnerParam(id = "rate", default = 0.005, lower = 0, upper = 1),
      makeIntegerVectorLearnerParam(id = "hidden", default = c(100, 100)),
      makeDiscreteLearnerParam(id = "loss",
        values = c("Automatic", "CrossEntropy", "Quadratic", "Absolute", "Huber"))
    ),
    properties = c("twoclass", "multiclass", "numerics", "factors", "prob", "missings"),
    name = "Deep Learning Neural Network with h2o",
    short.name = "h2o_deeplearning_classif",
    note = "tbd"
  )
}

trainLearner.classif.h2o_dl = function(.learner, .task, .subset, .weights = NULL, ...) {
  f = getTaskFormula(.task)
  data = getTaskData(.task, .subset)
  data_h2o <- as.h2o(data,
    destination_frame = paste0("train_", format(Sys.time(), "%m%d%y_%H%M%S")))
  h2o::h2o.deeplearning(
    x = getTaskFeatureNames(.task),
    y = setdiff(names(getTaskData(.task)), getTaskFeatureNames(.task)),
    training_frame = data_h2o, ...)
}

predictLearner.classif.h2o_dl = function(.learner, .model, .newdata, predict.method = "plug-in", ...) {
  data <- as.h2o(.newdata,
    destination_frame = paste0("pred_", format(Sys.time(), "%m%d%y_%H%M%S")))
  p = predict(.model$learner.model, newdata = data, method = predict.method, ...)
  if (.learner$predict.type == "response") {
    return(as.data.frame(p)[, 1])
  } else {
    return(as.matrix(as.numeric(p))[, -1])
  }
}

I tried tuning the parameter 'hidden' via grid search by means of the makeDiscreteParam function:

library(mlr)
library(h2o)
h2o.init()

lrn.h2o <- makeLearner("classif.h2o_dl")
n <- getTaskSize(sonar.task)
train.set = seq(1, n, by = 2)
test.set = seq(2, n, by = 2)
mod.h2o = train(lrn.h2o, sonar.task, subset = train.set)
pred.h2o <- predict(mod.h2o,task= sonar.task, subset = train.set)

ctrl = makeTuneControlGrid()
rdesc = makeResampleDesc("CV", iters = 3L)
ps = makeParamSet(
  makeDiscreteParam("hidden", values = list(c(10, 10), c(100, 100))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

res = tuneParams("classif.h2o_dl", task = sonar.task, resampling = rdesc, par.set = ps, control = ctrl)

which resulted in the warning message

Warning messages:
1: In checkValuesForDiscreteParam(id, values) :
 number of items to replace is not a multiple of replacement length
2: In checkValuesForDiscreteParam(id, values) :
 number of items to replace is not a multiple of replacement length

and ps looks like this:

ps
           Type len Def  Constr Req Tunable Trafo
hidden discrete   -   -  10,100   -    TRUE     -
rate   discrete   -   - 0.1,0.5   -    TRUE     -

This does not tune the hidden parameter as a vector. I also tried other special constructor functions (e.g. makeNumericVectorParam), which did not work either. Does anyone have experience tuning (integer) vectors in mlr and could give me a hint?

ptr_
  • It sounds like you need to use `makeNumericVectorParam` here. Can you share the code you've tried that didn't work please? – Lars Kotthoff Mar 01 '16 at 17:07
  • I just added the complete code – ptr_ Mar 02 '16 at 10:14
  • Hmm, if you want to try just those specific values I would introduce a dummy parameter that's simply an index into the list of values to try and check/convert that in the wrapper for the learner. – Lars Kotthoff Mar 02 '16 at 17:22
  • Yes, that should work in this case. But actually, I am trying to implement h2o algorithms as learners for mlr, hence it is important for me to define the hidden parameter in a proper way (if that's possible). – ptr_ Mar 02 '16 at 17:46

2 Answers

To tune the "hidden" parameter, pass a named list of values in the grid's parameter set:

makeDiscreteParam(id = "hidden", values = list(a = c(10,10), b = c(100,100)))

See also the related mlr issue:

https://github.com/mlr-org/mlr/issues/1305
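
As a rough sketch of how this fits into the question's parameter set: the names `a` and `b` are arbitrary labels, and the `$pars$hidden$values` access below relies on how ParamHelpers stores discrete values, which I am assuming rather than quoting from its documentation.

```r
library(ParamHelpers)

# Named list entries: the names ("a", "b") label each candidate vector,
# so no names have to be derived from the vectors themselves.
ps <- makeParamSet(
  makeDiscreteParam("hidden", values = list(a = c(10, 10), b = c(100, 100))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

# Each candidate should be stored as a whole vector under its label.
hidden_values <- ps$pars$hidden$values
print(hidden_values)
```

Passing this `ps` to `tuneParams()` with `makeTuneControlGrid()`, as in the question, should then evaluate both vectors against each value of `rate`.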

perevales

The warning messages and the malformed ParamSet arise because ParamHelpers tries to add names to the list of values, which fails when the values are vectors. perevales' answer avoids this by supplying the names explicitly, which is why it works.
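
The same warning can be reproduced in base R. This is only an illustration of the kind of assignment that triggers it, not the actual ParamHelpers source; `value_names` and `msgs` are hypothetical names made up for the sketch.

```r
# Hypothetical stand-in for a names vector that is filled one slot at a time.
values <- list(c(10, 10), c(100, 100))
value_names <- rep(NA_character_, length(values))
msgs <- character(0)

for (i in seq_along(values)) {
  withCallingHandlers(
    {
      # as.character(c(10, 10)) has length 2, but value_names[i] is one slot,
      # so R keeps only the first element and emits a warning.
      value_names[i] <- as.character(values[[i]])
    },
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
}

print(value_names)
print(msgs)
```

Only the first element of each vector survives as a name, which would be consistent with the `10,100` shown in the `Constr` column of the question's output.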

However, when you want to tune a vector of integer values, it is probably most advisable to use makeIntegerVectorParam:

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 10, upper = 100),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

This will not only try c(10, 10) and c(100, 100), but also values like c(10, 100).
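
To get a feel for what this search space contains, a few configurations can be drawn at random. This is a minimal sketch using `sampleValues()` from ParamHelpers; the seed and the number of draws are arbitrary choices for illustration.

```r
library(ParamHelpers)

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 10, upper = 100),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

set.seed(1)
# Draw five random configurations; each one is a list with "hidden" and "rate".
samples <- sampleValues(ps, n = 5)

# Keep only the integer vectors proposed for the hidden layers.
hidden_samples <- lapply(samples, function(s) s$hidden)
print(hidden_samples)
```

With a range this wide, a random search such as `makeTuneControlRandom(maxit = 20L)` may be a more practical control than a full grid; the `maxit` value is only an example.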

In fact, this also considers all values between 10 and 100 (e.g. c(30, 80)), so it may be desirable to reduce the search space a little, using transformations. Example:

ps <- makeParamSet(
  makeIntegerVectorParam("hidden", len = 2, lower = 2, upper = 4,
    trafo = function(x) round(10 ^ (x / 2))),
  makeDiscreteParam("rate", values = c(0.1, 0.5))
)

This uses the values 10 (= 10^1), 32 (≈ 10^1.5) and 100 (= 10^2), in any combination, for the hidden layers.
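
The effect of the transformation can be checked in base R without mlr. A small sketch, where `layer_sizes` and `combos` are just illustrative names:

```r
# Untransformed integer values allowed by lower = 2, upper = 4.
x <- 2:4

# Same transformation as in the trafo above.
layer_sizes <- round(10 ^ (x / 2))
print(layer_sizes)

# All two-layer combinations of the transformed sizes.
combos <- expand.grid(layer1 = layer_sizes, layer2 = layer_sizes)
print(nrow(combos))
```

So the two-layer search space shrinks to nine combinations of layer sizes, each of which is then paired with the candidate `rate` values.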

mb706