I'm getting a warning while trying to predict values for my test data frame. Here is my code to build the tree and predict:

library(tree)   # provides tree()
library(pgmm)   # provides the olive data set
data(olive)
olive = olive[,-1]
tree2 <- tree(olive$Area ~ olive$Palmitic + olive$Palmitoleic+olive$Stearic+olive$Oleic+olive$Linoleic+olive$Linolenic+olive$Arachidic+olive$Eicosenoic,data=olive)
newdata = as.data.frame(t(colMeans(olive)))
pred1 <- predict(tree2,newdata)

I read a similar post here, so I replaced this line

newdata = as.data.frame(t(colMeans(olive)))

by

aa<-t(colMeans(olive))
aa[1,1]
newdata <- data.frame(Palmitic=aa[1,1],Palmitoleic=aa[1,2],Stearic=aa[1,3],Oleic=aa[1,4],Linoleic=aa[1,5],Linolenic=aa[1,6],Arachidic=aa[1,7],Eicosenoic=aa[1,8])

to name the columns of my new data frame explicitly, but I'm still getting the same warning and the prediction is wrong :-/


1 Answer

(Upgraded from a comment.)

Try eliminating the $ from your model:

tree2 <- tree(Area ~ Palmitic + Palmitoleic+Stearic+Oleic+
    Linoleic+Linolenic+Arachidic+Eicosenoic,data=olive)

In principle, you can further simplify this to

tree(Area~.-Region,data=olive)

where . specifies "all other variables in the data set", and -Region says you don't want to include the Region variable. (Oops, this doesn't actually work -- although I think it should)
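Since the . shorthand reportedly fails here, one possible workaround (not part of the original answer) is to build the formula from the column names with reformulate() from base R. The sketch below only constructs the formula; the column names are the ones used in the question, plus Region.

```r
# Column names as they appear in the question (plus Region).
cols <- c("Region", "Area", "Palmitic", "Palmitoleic", "Stearic", "Oleic",
          "Linoleic", "Linolenic", "Arachidic", "Eicosenoic")

# Every column except the response and the excluded variable becomes a predictor.
predictors <- setdiff(cols, c("Area", "Region"))

# reformulate() pastes the term labels together into a formula object.
f <- reformulate(predictors, response = "Area")
f
```

The resulting f could then be passed as the first argument, e.g. tree(f, data = olive), assuming tree() accepts a formula stored in a variable the way most model-fitting functions do.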

The basic issue is that predict is trying to look within newdata for the names of the predictor variables specified in the original model: it needs to be looking for predvar, not origdata$predvar.
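A minimal sketch of that behaviour, using lm() and the built-in mtcars data as stand-ins so it runs without extra packages. This is an assumption for illustration: the name lookup in predict() works the same way for most model-fitting functions, but the exact warning text from tree may differ.

```r
# Built-in mtcars stands in for olive; lm() stands in for tree().
bad  <- lm(mtcars$mpg ~ mtcars$wt + mtcars$hp, data = mtcars)  # terms named "mtcars$wt", ...
good <- lm(mpg ~ wt + hp, data = mtcars)                        # terms named "wt", "hp"

# One-row data frame of column means, keeping the column names.
newdata <- as.data.frame(rbind(colMeans(mtcars[c("wt", "hp")])))

# 'bad' finds no column called "mtcars$wt" in newdata, so it evaluates the
# original columns instead: it warns and returns one value per training row.
pred_bad  <- suppressWarnings(predict(bad, newdata))
pred_good <- predict(good, newdata)

length(pred_bad)   # one prediction per row of mtcars
length(pred_good)  # one prediction for the single row of newdata
```

Dropping suppressWarnings() shows the warning itself, which says that newdata had 1 row while the variables found had as many rows as the training data.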

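I would use the following, where tree2 is the model refitted without $ and olive is the original data set, so that olive[-(1:2)] drops its first two columns (Region and Area):

predict(tree2,newdata=as.data.frame(rbind(colMeans(olive[-(1:2)]))))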
Ben Bolker