I'd like to build decision trees for multiple data sets, so to make life easier I'd like to write a user-defined function (UDF).
I'm using the cu.summary data set from the rpart package for my MWE. Mileage is predicted by Price, Country, Reliability, and Type.
I've borrowed code (from https://www.statmethods.net/advstats/cart.html) that works well when modelling a single data frame like cu.summary.
Continuing with cu.summary, I split it into two data frames, cu.summaryADC and cu.summaryNJD. I then tried to pass cu.summaryADC and cu.summaryNJD to my UDF to build the decision trees.
The problem is the title of each plot. I'd like the title to include the name of the data frame used to generate the tree, e.g. "Regression Tree for Mileage cu.summaryADC" and "Regression Tree for Mileage cu.summaryNJD".
My attempt at a general expression, main = paste("Regression Tree for Mileage ", names(DTList)), doesn't work.
Thanks for any help!
Attempt at code for multiple data frames:
# LOAD PACKAGE
library(rpart)
# PREPARE MWE DATA BY SPLITTING IT INTO TWO DATA FRAMES
dflist <- split(cu.summary, rep(1:2, length.out = nrow(cu.summary), each = ceiling(nrow(cu.summary)/2)))
cu.summaryADC <- dflist[[1]]
cu.summaryNJD <- dflist[[2]]
# MAKE FUNCTION
DTMakerFun <- function(x){
  fit <- rpart(Mileage ~ Price + Country + Reliability + Type,
               method = "anova", data = x)
  fitplot <- plot(fit, uniform = TRUE,
                  main = paste("Regression Tree for Mileage ", names(DTList)))
  text(fit, use.n = TRUE, all = TRUE, cex = .8)
  return(fitplot)
}
# APPLY FUNCTION
DTList <- lapply(list(cu.summaryADC,cu.summaryNJD), DTMakerFun)
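
For reference, here is a minimal sketch of one direction that might work; it is an assumption on my part, not a confirmed fix. Inside DTMakerFun, x is only the data frame itself, so its original name is not available, and DTList does not exist until lapply() has finished. The sketch therefore puts the data frames in a named list and passes each name alongside its data frame using Map(). The names dfs and DTMakerFun2, and the choice to return the fitted model rather than the plot value, are hypothetical and only for illustration.

```r
library(rpart)

# Same split as in the attempt above
dflist <- split(cu.summary,
                rep(1:2, length.out = nrow(cu.summary),
                    each = ceiling(nrow(cu.summary) / 2)))

# Name the list elements so the names can be passed to the function
# ("dfs" is a hypothetical name used only in this sketch)
dfs <- list(cu.summaryADC = dflist[[1]], cu.summaryNJD = dflist[[2]])

# Hypothetical two-argument variant of DTMakerFun:
# x is the data frame, nm is the label to show in the plot title
DTMakerFun2 <- function(x, nm) {
  fit <- rpart(Mileage ~ Price + Country + Reliability + Type,
               method = "anova", data = x)
  plot(fit, uniform = TRUE,
       main = paste("Regression Tree for Mileage", nm))
  text(fit, use.n = TRUE, all = TRUE, cex = 0.8)
  fit  # return the fitted tree so it can be inspected or re-plotted later
}

# Map() walks over the data frames and their names in parallel
DTList <- Map(DTMakerFun2, dfs, names(dfs))
```

Because the first argument given to Map() is a named list, DTList keeps those names, so an individual tree can be retrieved with DTList[["cu.summaryADC"]].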