I have a data set (let's call it "x") that looks like this...
x <- structure(list(top_bracket_style = c("29L", "23R", "29L", "29R",
"29R", "29L", "29R", "29R", "29R", "29R", "29R", "29R", "29R",
"29L", "90"), column_finish = c("PT", "AW", "PT", "PT", "ML",
"PT", "ML", "PT", "ML", "PT", "PT", "ML", "PT", "PT", "PT"),
foot_finish = c("PT", "AW", "PT", "PT", "ML", "PT", "ML",
"PT", "ML", "PT", "PT", "ML", "PT", "PT", "PT"), glide_style = c("S",
"S", "S", "S", "S", "S", "S", "S", "L", "S", "S", "S", "S",
"S", "S"), cycle_time = c(73L, 148L, 137L, 132L, 139L, 129L,
198L, 110L, 116L, 138L, 130L, 138L, 97L, 132L, 170L)), .Names = c("top_bracket_style",
"column_finish", "foot_finish", "glide_style", "cycle_time"), row.names = c(NA,
15L), class = "data.frame")
The data describe a furniture product that is made in a plant on a certain machine. Top bracket style, column finish, foot finish, and glide style are the four characteristics that describe the distinct configuration of options for the product build. Cycle time is the amount of time it takes to build the product, from start to finish.
I use
library(rpart)
fit <- rpart(cycle_time ~ top_bracket_style + column_finish +
             foot_finish + glide_style, method = "anova", data = x)
to partition the data so that I can identify groups/clusters that have similar mean cycle times. When I "print(fit)" I get the following results...
1) root 16933 21747274.710800 134.1567944251
2) top_bracket_style=23L,23R,29L,29R 15591 18965219.863130 132.0181514977 *
3) top_bracket_style=120,35L,35R,90 1342 1882283.988077 159.0029806259 *
What I want to accomplish seems simple, but I cannot find a way to do it, even after extensive searching on CRAN and Stack Overflow. I would like to transform the rpart results into a data frame that looks like this...
This data frame would serve as a lookup table for entering cycle times into the production database that schedules the machines in our plants. We are not yet at the point where we would want to use the decision tree to predict cycle times (with the predict() function). For now, the process will be more controlled than that. We are content to collect data off the machines that are already making these products, and improve our cycle time calculations as we go.
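To make the goal concrete, here is a rough sketch of one way such a lookup table might be built. It assumes the table should have one row per distinct option combination seen in the data, paired with the node number and mean cycle time of the leaf that combination falls into. The helper name rpart_lookup and the output column names are hypothetical, not part of rpart. It relies on the where and frame components of the fitted object and assumes no rows were dropped for missing values.

```r
library(rpart)

# Hypothetical helper (not part of rpart): one row per distinct combination of
# the option columns in `vars`, with the leaf node and its mean response.
rpart_lookup <- function(fit, data, vars) {
  # fit$where holds, for each training row, the row of fit$frame for its leaf.
  # Assumption: no rows were dropped by na.action, so lengths must match.
  stopifnot(length(fit$where) == nrow(data))
  leaf <- fit$where

  out <- data[, vars, drop = FALSE]
  # Row names of fit$frame are the node numbers shown by print(fit).
  out$node <- as.integer(rownames(fit$frame))[leaf]
  # For method = "anova", yval is the mean response within the node.
  out$cycle_time <- fit$frame$yval[leaf]

  out <- unique(out)      # collapse repeated configurations
  rownames(out) <- NULL
  out
}

# Usage, with `x` and `fit` as defined in the question:
# vars <- c("top_bracket_style", "column_finish", "foot_finish", "glide_style")
# lookup <- rpart_lookup(fit, x, vars)
```

Note that on the 15-row sample the default rpart settings (minsplit = 20) leave only the root node, so every configuration would get the overall mean. Splits like the ones printed above need the full data set.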
Any help would be appreciated.