
I am using Bayesian optimization to tune the parameters of an SVM for a regression problem. In the following code, what should be passed as init_grid_dt = initial_grid? I have the upper and lower bounds of the sigma and C parameters of the SVM, but I don't know what the initial grid should contain.

In one example on the web, the results of a random search were used as the input to the initial grid. The code is as follows:

library(caret)
library(rBayesianOptimization)

ctrl <- trainControl(method = "repeatedcv", repeats = 5)

## Objective function: fit one (C, sigma) pair and return the negated
## resampled RMSE, because BayesianOptimization maximizes Score.
svm_fit_bayes <- function(logC, logSigma) {
  ## Use the same model code but for a single (C, sigma) pair.
  txt <- capture.output(
    mod <- train(y ~ ., data = train_dat,
                 method = "svmRadial",
                 preProc = c("center", "scale"),
                 metric = "RMSE",
                 trControl = ctrl,
                 tuneGrid = data.frame(C = exp(logC), sigma = exp(logSigma)))
  )
  list(Score = -getTrainPerf(mod)[, "TrainRMSE"], Pred = 0)
}

## Search bounds on the log scale
lower_bounds <- c(logC = -5, logSigma = -9)
upper_bounds <- c(logC = 20, logSigma = -0.75)
bounds <- list(logC = c(lower_bounds[1], upper_bounds[1]),
               logSigma = c(lower_bounds[2], upper_bounds[2]))

## Create a grid of values as the input into the BO code.
## rand_search is the random-search fit from the earlier example (not shown here).
initial_grid <- rand_search$results[, c("C", "sigma", "RMSE")]
initial_grid$C <- log(initial_grid$C)
initial_grid$sigma <- log(initial_grid$sigma)
initial_grid$RMSE <- -initial_grid$RMSE
names(initial_grid) <- c("logC", "logSigma", "Value")

ba_search <- BayesianOptimization(svm_fit_bayes,
                                  bounds = bounds,
                                  init_grid_dt = initial_grid,
                                  init_points = 0,
                                  n_iter = 30,
                                  acq = "ucb",
                                  kappa = 1,
                                  eps = 0.0,
                                  verbose = TRUE)
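
For comparison, here is a minimal sketch of an initial grid built without any prior random search, using base R only. The helper make_initial_grid is hypothetical (it is not part of rBayesianOptimization), and the choice of 10 points is arbitrary. The only requirement it tries to satisfy is that the column names match the names in bounds.

```r
## Hypothetical helper, not part of rBayesianOptimization:
## draw n points uniformly inside each parameter's bounds.
make_initial_grid <- function(bounds, n, seed = 1) {
  set.seed(seed)
  cols <- lapply(bounds, function(b) runif(n, min = b[1], max = b[2]))
  as.data.frame(cols)
}

## Same bounds as in the question, on the log scale
bounds <- list(logC = c(-5, 20), logSigma = c(-9, -0.75))

## 10 starting points; columns are named logC and logSigma.
## There is no "Value" column because the points have not been scored yet.
initial_grid <- make_initial_grid(bounds, n = 10)
str(initial_grid)
```

According to the rBayesianOptimization documentation, init_grid_dt accepts a data frame whose column names are identical to the names in bounds, with an optional trailing Value column when the points have already been scored (as with the random-search results above). Leaving init_grid_dt = NULL and setting init_points to a positive number asks the package to sample the random starting points itself.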
  • Wouldn't that depend on your data, which we don't have? This should likely be moved to [stats.se] since it's about statistical methods more than programming – camille Nov 06 '19 at 14:43
  • No, it's a programming question. I am asking about the details of a parameter of the function. In the example on the web, they built the initial grid as follows, where rand_search is the result of a random search applied in a previous example: initial_grid <- rand_search$results[, c("C", "sigma", "RMSE")]; initial_grid$C <- log(initial_grid$C); initial_grid$sigma <- log(initial_grid$sigma); initial_grid$RMSE <- -initial_grid$RMSE; names(initial_grid) <- c("logC", "logSigma", "Value") – Neha gupta Nov 06 '19 at 14:57
  • You can [edit] the question to include more code where it will be more legible. But again, wouldn't this depend on your data? [See here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on making a reproducible example – camille Nov 06 '19 at 15:19
  • I edited the question and included the code. Thank you – Neha gupta Nov 06 '19 at 15:27

0 Answers