I am trying to understand how rpart works for a project I am working on. I am relatively new to R, but I have a lot of experience using SAS to build a variety of analytical models.

First, I ran this code:

mtree1 <- rpart(X17 ~ ., data = mydata, method = "class", control = rpart.control(minsplit = 20, minbucket = 7, maxdepth = 10, usesurrogate = 2, xval = 10))

I get a tree with X12 as the top split, X10 as the next split on the left-hand side, X69 as the next split on the right-hand side, and then X68 and X70 on that branch.

Next, I ran the same call using only those five variables:

mtree1 <- rpart(X17 ~ X12 + X10 + X69 + X68 + X70, data = mydata, method = "class", control = rpart.control(minsplit = 20, minbucket = 7, maxdepth = 10, usesurrogate = 2, xval = 10))

I get exactly the same tree.

Finally, I ran it again with X10 removed:

mtree1 <- rpart(X17 ~ X12 + X69 + X68 + X70, data = mydata, method = "class", control = rpart.control(minsplit = 20, minbucket = 7, maxdepth = 10, usesurrogate = 2, xval = 10))

Now I get no splits at all. (BTW, my data set has 234144 observations & 90 independent variables with 210205 goods & 23839 bads.)

Here is an image of the code and output

[Image: rpart code and output]

Why does dropping X10 from the formula leave me with no splits at all, when X12 was the top split in the first two trees? I would appreciate any help. Thanks. KK