Running the code below gives me this warning followed by an error:

not all required variables have been supplied in newdata!

Error in model.frame.default(ff, data = newdata, na.action = na.act) :

variable lengths differ (found for 'i')

Any insights?

My code:

# choose the best # of nodes
oz_gam1 = gam(ozone ~ ns(radiation,1)+ns(temperature,1)+ns(wind,1),data = train)
gam_train_pred1 = predict(oz_gam1, train)
smallest_train = mean((train$ozone - gam_train_pred1)^2)
smallest_i = 1
smallest_j = 1
smallest_k = 1

for (i in 1:10){
  for (j in 1:10){
    for (k in 1:10){
      oz_gam = gam(ozone ~ ns(radiation,i)+ns(temperature,j)+ns(wind,k), data = train)
      gam_train_pred = predict(oz_gam, train)
      gam_train = mean((train$ozone-gam_train_pred)^2)
      if (gam_train < smallest_train){
        smallest_train = gam_train
        smallest_i = i
        smallest_j = j
        smallest_k = k
      } # if
    } # k
  } # j
} # i
smallest_i
smallest_j
smallest_k
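
One plausible reading of the error, offered as a hedged guess rather than a confirmed diagnosis: the formula keeps `i`, `j` and `k` as symbols, so a `predict` method that rebuilds its model frame from every name in the formula looks for `i` in `newdata`, falls back to the length-1 loop index, and reports that variable lengths differ. The sketch below uses only base R to show the loop indices appearing among the formula's variables, and one way to substitute the numbers before fitting. `make_formula` is a hypothetical helper, not part of any package.

```r
# Loop indices used inside a formula remain there as symbols.
i <- 3; j <- 2; k <- 4
f_symbols <- ozone ~ ns(radiation, i) + ns(temperature, j) + ns(wind, k)
all.vars(f_symbols)   # includes "i", "j" and "k" next to the data columns

# Hypothetical helper: write the numeric df values into the formula text,
# so the resulting formula mentions only columns of the data frame.
make_formula <- function(i, j, k) {
  as.formula(sprintf(
    "ozone ~ ns(radiation, %d) + ns(temperature, %d) + ns(wind, %d)",
    i, j, k
  ))
}

f_values <- make_formula(i, j, k)
all.vars(f_values)    # only ozone, radiation, temperature and wind remain
```

Inside the loop this would be used as `gam(make_formula(i, j, k), data = train)`. Whether that removes the error depends on which `gam` implementation is loaded, so treat it as something to try rather than a guaranteed fix.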
Tianyu Jiang
  • Can you include the beginning of your code including the dataset? Ideally you make it so we can [run your code](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – AidanGawronski Feb 29 '20 at 02:47
  • Just a stab in the dark since you haven't provided any data. Are there any NAs in your `train` data frame? What happens if you include `na.rm=TRUE` in your `mean(...)` calls? – Edward Feb 29 '20 at 04:51
  • Thanks! I realized that it is not the best way to approach the problem. As for the dataset, I was using environmental from https://docs.tibco.com/pub/enterprise-runtime-for-R/4.1.0/doc/html/Language_Reference/Sdatasets/trellis.datasets.html – Tianyu Jiang Mar 04 '20 at 02:54
  • I ended up choosing the best # of nodes using cross validation: `oz_gam_cv = train(x = oz_train[,-1], y = oz_train$ozone, method = "gamSpline", trControl = trainControl(method = "cv", number = 10), tuneGrid = expand.grid(df = 1:10))` – Tianyu Jiang Mar 04 '20 at 02:55
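
The cross-validation idea in the last comment can be sketched without `caret`. Training error almost never goes up as the degrees of freedom grow, so the loop in the question tends to pick the largest values, while held-out error does not have that bias. The sketch below makes several assumptions: the data are simulated rather than the real `environmental` set, `lm` stands in for `gam` (with only `ns()` terms and a Gaussian response, the additive fit is an ordinary least-squares fit), one shared `df` is tuned as in the `tuneGrid` above, and `cv_mse` is a hypothetical helper.

```r
library(splines)

set.seed(1)
# Simulated stand-in for the training data (assumption: not the real data set)
n <- 200
oz_train <- data.frame(
  radiation   = runif(n, 0, 300),
  temperature = runif(n, 55, 100),
  wind        = runif(n, 2, 20)
)
oz_train$ozone <- with(
  oz_train,
  0.01 * radiation + 0.002 * (temperature - 55)^2 - 0.15 * wind +
    rnorm(n, sd = 0.5)
)

# Hypothetical helper: mean squared error over the held-out folds for one df
cv_mse <- function(df, data, folds) {
  # Substitute the number into the formula so it names only data columns
  f <- as.formula(sprintf(
    "ozone ~ ns(radiation, %d) + ns(temperature, %d) + ns(wind, %d)",
    df, df, df
  ))
  fold_mse <- sapply(sort(unique(folds)), function(fold) {
    fit  <- lm(f, data = data[folds != fold, ])  # lm stands in for gam here
    test <- data[folds == fold, ]
    mean((test$ozone - predict(fit, newdata = test))^2)
  })
  mean(fold_mse)
}

folds   <- sample(rep(1:10, length.out = n))  # 10 folds, as in the comment
cv_err  <- sapply(1:10, cv_mse, data = oz_train, folds = folds)
best_df <- which.min(cv_err)                  # df with the lowest CV error
```

Tuning a separate `df` per variable would mean looping over a three-column grid such as `expand.grid(i = 1:10, j = 1:10, k = 1:10)` and refitting 1000 models per fold, which is why a single shared value is used here.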
