
I have a data set (let's call it "x") that looks like this...

x <- structure(list(top_bracket_style = c("29L", "23R", "29L", "29R", 
"29R", "29L", "29R", "29R", "29R", "29R", "29R", "29R", "29R", 
"29L", "90"), column_finish = c("PT", "AW", "PT", "PT", "ML", 
"PT", "ML", "PT", "ML", "PT", "PT", "ML", "PT", "PT", "PT"), 
    foot_finish = c("PT", "AW", "PT", "PT", "ML", "PT", "ML", 
    "PT", "ML", "PT", "PT", "ML", "PT", "PT", "PT"), glide_style = c("S", 
    "S", "S", "S", "S", "S", "S", "S", "L", "S", "S", "S", "S", 
    "S", "S"), cycle_time = c(73L, 148L, 137L, 132L, 139L, 129L, 
    198L, 110L, 116L, 138L, 130L, 138L, 97L, 132L, 170L)), .Names = c("top_bracket_style", 
"column_finish", "foot_finish", "glide_style", "cycle_time"), row.names = c(NA, 
15L), class = "data.frame")

The data describe a furniture product that is made in a plant on a certain machine. Top bracket style, column finish, foot finish, and glide style are the four characteristics that describe the distinct configuration of options for the product build. Cycle time is the amount of time it takes to build the product, from start to finish.

I use

library(rpart)

fit <- rpart(cycle_time ~ top_bracket_style + column_finish + 
   foot_finish + glide_style, method = "anova", data = x)

to partition the data so that I can identify groups/clusters that have similar mean cycle times. When I "print(fit)" I get the following results...

1) root 16933 21747274.710800 134.1567944251
  2) top_bracket_style=23L,23R,29L,29R 15591 18965219.863130 132.0181514977 *
  3) top_bracket_style=120,35L,35R,90 1342  1882283.988077 159.0029806259 *

What I want to accomplish seems simple, but I cannot find a way to do it, even after extensive searching on CRAN and Stack Overflow. I would like to transform the rpart results into a data frame that looks like this...

(Image of the desired data frame; not reproduced here.)

This data frame would serve as a lookup table for entering cycle time into our production database responsible for scheduling our machines in our plants. We are not at the point, yet, where we would want to use the decision tree to predict cycle times (with the predict() function). For now, it will be more controlled than that. We are content to collect data off the machines that are already making these products, and improve our cycle time calculations as we go.
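One possible way to build such a lookup table, offered as a sketch under stated assumptions rather than a confirmed solution: an rpart object stores the leaf each training row fell into (fit$where, an index into the rows of fit$frame), and fit$frame holds each node's number, size, and mean response. The helper name leaf_lookup and the output column names node, n_in_node, and mean_cycle_time are hypothetical, chosen only for illustration. The sketch also assumes no rows were dropped for missing values, so that fit$where lines up row for row with x.

```r
library(rpart)

# Hypothetical helper (not part of rpart): returns one row per distinct
# configuration seen in the training data, with the leaf node it falls into,
# the number of training rows in that leaf, and the leaf's mean response.
# Assumption: `data` has exactly the rows used to fit the tree (no NA rows
# were dropped), so fit$where aligns with the rows of `data`.
leaf_lookup <- function(fit, data, vars) {
  # fit$where: for each training row, the row of fit$frame holding its leaf
  leaf_row <- fit$where

  out <- data[, vars, drop = FALSE]
  # Row names of fit$frame are the node numbers shown by print(fit)
  out$node <- as.integer(rownames(fit$frame))[leaf_row]
  out$n_in_node <- fit$frame$n[leaf_row]
  # For method = "anova", yval is the mean of the response within the node
  out$mean_cycle_time <- fit$frame$yval[leaf_row]

  # Collapse to one row per distinct configuration, ordered by node
  out <- unique(out)
  out <- out[order(out$node), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Usage, with `x` and `fit` as defined in the question:
# vars <- c("top_bracket_style", "column_finish", "foot_finish", "glide_style")
# lookup <- leaf_lookup(fit, x, vars)
```

The result covers only configurations that actually appear in the data, which matches the stated goal of a controlled lookup table rather than calling predict() on unseen configurations. With the 15-row sample above, rpart's default minsplit of 20 leaves only the root node, so every configuration would map to node 1 and the overall mean.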

Any help would be appreciated.

  • Can you put your data into a reproducible form? The image helps but it's easier to help you if you use dput() or some other way to allow others to reproduce what you've done. – BLT Apr 07 '17 at 15:41
  • Don't post pictures of data. See [how to create a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for suggestions on including data in the question itself. – MrFlick Apr 07 '17 at 16:13
  • Apologies for the lack of a reproducible form. The post has been edited to include one. – Eric Apr 07 '17 at 18:00
  • It doesn't seem like the sample data gives the same split. It just returns a root so it's hard to work with. – MrFlick Apr 07 '17 at 18:55
