
I've successfully fitted a classification tree in rpart on 0-1 outcome data, weighting the observations to deal with a scarce response. When I plot the tree using prp, I want the node labels to show the true (unweighted) proportions rather than the weighted proportions. Is this possible?

A sample data set is below (note that I am working with many more factors than I'm using here!):

require(rpart)
require(rpart.plot)
set.seed(1001)
x <- rnorm(1000)
y <- rbinom(1000, size=1, prob=1/(1+exp(-x)))
z <- 10 + rnorm(1000)
# up-weight the y == 1 cases
weights <- ifelse(y==0, 1, z)

# fit with the case weights defined above
rpartfun <- rpart(y~x,
                  weights=weights, method="class", control=list(cp=0))

# prune at the cp with the lowest cross-validated error
rparttrim <- prune(rpartfun, cp=rpartfun$cptable[which.min(rpartfun$cptable[,"xerror"]),"CP"])
prp(rparttrim, extra=104)

[I would produce the image I get from that here, but I don't have enough reputation]

I would like that first node (and indeed, all the nodes!) to show 0.65 to 0.35 (the true proportions) instead of .28 to .72 (the weighted proportions).
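To make the distinction concrete, here is a rough sketch of the numbers I am after, computed by hand outside of prp. It assumes the same simulated data as above and fits the unpruned tree for brevity. The helper `in_node` and the vector `true_prop` are names I made up for illustration, and the ancestor lookup relies on rpart numbering the children of node n as 2n and 2n + 1. The simulated values will not match the .65/.35 from my real data.

```r
library(rpart)

# Same simulated data as in the example above
set.seed(1001)
x <- rnorm(1000)
y <- rbinom(1000, size = 1, prob = 1 / (1 + exp(-x)))
z <- 10 + rnorm(1000)
weights <- ifelse(y == 0, 1, z)

fit <- rpart(y ~ x, weights = weights, method = "class",
             control = rpart.control(cp = 0))

# Node number of the leaf that each observation falls into
node_ids <- as.numeric(rownames(fit$frame))
leaf_of_obs <- node_ids[fit$where]

# Hypothetical helper: TRUE if `node` is `leaf` itself or one of its ancestors
# (assumes the children of node n are numbered 2n and 2n + 1)
in_node <- function(leaf, node) {
  while (leaf > node) leaf <- leaf %/% 2
  leaf == node
}

# Unweighted proportion of y == 1 in every node, in the row order of fit$frame
true_prop <- sapply(node_ids, function(node) {
  mean(y[sapply(leaf_of_obs, in_node, node = node)])
})
names(true_prop) <- node_ids

# Root node: unweighted vs. weighted proportion of y == 1
root_true <- mean(y)
root_weighted <- sum(weights * y) / sum(weights)
```

Because `true_prop` follows the row order of `fit$frame`, it might be possible to feed these values into the labels through the `node.fun` argument of prp, but I have not worked out whether that is the intended route.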

  • The variable `claim_fponly` is not defined. Please make sure your example is [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick Jun 16 '15 at 15:21
  • You're right, apologies, I've now edited it out. I didn't notice because it was sitting in my workspace (I copied and pasted from the actual code and didn't notice I'd left that in). – Kieran Martin Jun 17 '15 at 09:05

0 Answers