
I am using this code:

    mtry <- round(sqrt(18), 0)

    gbmGrid <- expand.grid(
        interaction.depth = c(1, 2, 3, 4, 5, 6)
        , n.trees = seq(10, 10000, by = 100)
        , shrinkage = 0.01
        , n.minobsinnode = c(5, 10, 20, 30)
        , distribution = 'gaussian'
        , method = 'gbm'
        , mtry = mtry
    )

    fitControl <- trainControl(
        method = "repeatedcv"
        , number = 2
        , repeats = 3
    )

    gbmFit1 <- train(
        Y ~ X1 + X2
        , data = Train
        , trControl = fitControl
        , tuneGrid = gbmGrid
        , verbose = FALSE
    )

but get:

The tuning parameter grid should have columns mtry

I updated caret to the latest version, as some people suggested, and also tried naming the column `.mtry`; neither helped. Any ideas? (Yes, I googled and had a look at SO.)

jmuhlenkamp
cs0815
  • You called the column `.mtry` not `mtry` in `expand.grid(..., .mtry = mtry)`. Remove the leading dot. – smci Oct 16 '18 at 10:05
  • I tried this before; I only introduced the `.` because of some SO posts ... – cs0815 Oct 16 '18 at 10:23
  • Updated the question: using `mtry` does not work either. I also updated caret, which some people suggested as a solution, but I get the same error )-: – cs0815 Oct 18 '18 at 08:47
  • Did you see [this](https://stackoverflow.com/questions/26878334/issues-with-tunegrid-parameter-in-random-forest)? – Sotos Oct 18 '18 at 11:33
  • yes thanks but what does it tell me? the answer with 12 votes also uses: expand.grid(mtry = 100) like me ... – cs0815 Oct 18 '18 at 11:38
  • It's difficult to help without a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), such as a sample of the training data you're referring to – camille Oct 18 '18 at 17:17
  • This does not solve the problem, but how can you have `mtry = 4` if you have 2 predictors? Isn't it used to randomly select `mtry` predictors in each tree? At least in random forest it is. – RLave Oct 19 '18 at 07:32
  • Still, this is a weird error because the method `gbm` doesn't have an `mtry` parameter, at least not according to https://topepo.github.io/caret/train-models-by-tag.html#boosting – RLave Oct 19 '18 at 07:33
  • @RLave thanks - I must have got confused with all the modeling techniques I tried. Thanks. – cs0815 Oct 19 '18 at 07:44
  • The confusing error message that caused this question has been improved in the most recent version of `caret` (6.0-81). For more information on this update, refer to my answer below (https://stackoverflow.com/a/53571132/6850554). – jmuhlenkamp Dec 01 '18 at 13:18

2 Answers


I took it back to basics (iris). This works; the issue was that `mtry` is not a tuning parameter for gbm:

library(datasets)
library(gbm)
library(caret)

# Only gbm's own tuning parameters go in the grid (no mtry)
grid <- expand.grid(
    n.trees = seq(10, 1000, by = 100)
    , interaction.depth = c(4)
    , shrinkage = c(0.01, 0.1)
    , n.minobsinnode = c(5, 10, 20, 30)
)

train_control <- trainControl(
    method = "repeatedcv"
    , number = 10
    , repeats = 10
)

# method and distribution are passed to train(), not put in the grid
model <- train(
    Petal.Width ~ Petal.Length
    , method = 'gbm'
    , distribution = 'gaussian'
    , data = iris
    , trControl = train_control
    , tuneGrid = grid
    , verbose = FALSE
)

model

Sorry for wasting your time!
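
For anyone landing here with the same error: one way to avoid guessing is to ask caret which tuning parameters a method accepts before building the grid. The sketch below uses `modelLookup()` from caret; the wrapper `tuning_params` is a hypothetical helper name of mine, and the `requireNamespace()` guard is only there so the snippet does nothing harmful if caret is not installed.

```r
# Hypothetical helper: return the tuning parameter names caret expects
# for a given method, or NULL if caret is not available.
tuning_params <- function(method) {
  if (!requireNamespace("caret", quietly = TRUE)) {
    return(NULL)
  }
  # modelLookup() returns one row per tuning parameter of the method
  as.character(caret::modelLookup(method)$parameter)
}

gbm_params <- tuning_params("gbm")
rf_params  <- tuning_params("rf")

# The grid passed as tuneGrid should have exactly these column names
print(gbm_params)
print(rf_params)
```

Comparing the two outputs should make it clear why a grid with an `mtry` column does not fit `gbm`.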

cs0815

In caret >= 6.0-81, the error message for this type of case is clearer. Take, for example, supplying `mtry` in the tuning grid when `mtry` is not a parameter of the given method.

In caret < 6.0-81, the following error will occur:

# Error: The tuning parameter grid should have columns mtry

In caret >= 6.0-81, the following error will occur:

# Error: The tuning parameter grid should not have columns mtry
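
Either message comes down to a mismatch between the grid's column names and the parameter names the method expects. As a rough pre-flight check, the two sets of names can be compared directly in base R. This is only an illustration of the idea, not caret's own validation code; `check_grid` and `gbm_expected` are hypothetical names, and the expected parameters are copied from the working `gbm` grid in the other answer.

```r
# Hypothetical helper: compare a tuning grid's columns with the parameter
# names a method expects, and report both kinds of mismatch.
check_grid <- function(grid, expected) {
  list(
    missing = setdiff(expected, names(grid)),  # expected but absent
    extra   = setdiff(names(grid), expected)   # present but not accepted
  )
}

# Assumed parameter names, taken from the working gbm grid in the other answer
gbm_expected <- c("n.trees", "interaction.depth", "shrinkage", "n.minobsinnode")

# A small grid that repeats the mistake from the question (stray mtry column)
bad_grid <- expand.grid(
  interaction.depth = c(1, 2),
  n.trees = c(10, 110),
  shrinkage = 0.01,
  n.minobsinnode = c(5, 10),
  mtry = 4,
  stringsAsFactors = FALSE
)

result <- check_grid(bad_grid, gbm_expected)
print(result$missing)
print(result$extra)
```

Any name listed under `extra` has to be dropped from the grid, and any name under `missing` has to be added, before the grid is passed as `tuneGrid`.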

Reprex of original confusing error message

Here is a reproducible example for each version, showing the original and the improved error message.

caret < 6.0-81

library(caret)
getNamespaceVersion("caret")
## version 
## "6.0-80"

mtry <- round(sqrt(18), 0)
gbmGrid <- expand.grid(
    interaction.depth = c(1, 2, 3, 4, 5, 6)
    , n.trees = seq(10, 10000, by = 100)
    , shrinkage = 0.01
    , n.minobsinnode = c(5, 10, 20, 30)
    , distribution = 'gaussian'
    , method = 'gbm'
    , mtry = mtry
)
fitControl <- trainControl(
    method = "repeatedcv"
    , number = 2
    , repeats = 3
)
gbmFit1 <- train(
    Species ~ Sepal.Length + Sepal.Width
    , data = iris
    , trControl = fitControl
    , tuneGrid = gbmGrid
    , verbose = FALSE
)
# Error: The tuning parameter grid should have columns mtry

caret >= 6.0-81

library(caret)
getNamespaceVersion("caret")
## version 
## "6.0-81"

mtry <- round(sqrt(18), 0)
gbmGrid <- expand.grid(
    interaction.depth = c(1, 2, 3, 4, 5, 6)
    , n.trees = seq(10, 10000, by = 100)
    , shrinkage = 0.01
    , n.minobsinnode = c(5, 10, 20, 30)
    , distribution = 'gaussian'
    , method = 'gbm'
    , mtry = mtry
)
fitControl <- trainControl(
    method = "repeatedcv"
    , number = 2
    , repeats = 3
)
gbmFit1 <- train(
    Species ~ Sepal.Length + Sepal.Width
    , data = iris
    , trControl = fitControl
    , tuneGrid = gbmGrid
    , verbose = FALSE
)
# Error: The tuning parameter grid should not have columns mtry

For more information, refer to the GitHub issue that described and then fixed this behavior: https://github.com/topepo/caret/issues/955

jmuhlenkamp