I started with five data frames corresponding to different categories. Let's call them d1, d2, d3, d4 and d5, defined with the code:
d1<-data.frame(runif(1000,0,10000))
d2<-data.frame(runif(1000,0,10000))
d3<-data.frame(runif(1000,0,10000))
d4<-data.frame(runif(1000,0,10000))
d5<-data.frame(runif(1000,0,10000))
I combined these five data frames into one huge data frame:
all_data<-data.frame(d1, d2, d3, d4, d5)
I then converted this big data frame into a vector to use with ecdfPlot, with the code:
all_data_v<-as.vector(t(all_data))
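For reference, here is a small sketch of the order in which as.vector(t(...)) flattens a data frame, since that order determines which data frame each element of the vector came from. The toy data frame and the names toy, values, src and labelled are made up for illustration and are not part of my real data; the column names d1, d2, d3 are also an assumption (the data frame built above gets auto-generated names).

```r
# Tiny stand-in for all_data (assumption: columns named d1..d3 for readability)
toy <- data.frame(d1 = c(0.5, 20), d2 = c(3, 0.2), d3 = c(7, 9))

# t() puts each original column in a row; as.vector() then reads the matrix
# column by column, so the result runs row by row through the data frame
values <- as.vector(t(toy))

# A label vector in the same order: the column names repeated once per row
src <- rep(names(toy), times = nrow(toy))

# Pair every value with the column it came from
labelled <- data.frame(value = values, source = src)

# Values at or below 1, together with their source column
labelled[labelled$value <= 1, ]
```

The same rep(names(...), times = nrow(...)) pattern should line up with all_data_v, provided the columns of all_data have usable names.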
I then created an ecdf plot on a log-log scale:
ecdfPlot(all_data_v,log="yx",xlim=c(0.01,1000),ylim=c(0.001,1))
I am looking at the points less than or equal to 1, specifically trying to determine the percentage of points in that range that come from each data frame. My question is: is there any way I can separate the points less than or equal to 1 and track them back to their original data frame? In other words, can I find the points less than or equal to 1 and determine whether they came from d1, d2, d3, d4 or d5?
I have tried adding the plot.it=FALSE argument to ecdfPlot, which returns the points that it plots, but it doesn't tell me which data frame each point came from.
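To make the goal concrete, this is roughly the kind of result I am after, sketched without ecdfPlot by keeping a source label next to each value. It assumes the columns are given explicit names d1 to d5 when the data frame is built, and share_below is a hypothetical helper name, not an existing function. Note that with values drawn uniformly from 0 to 10000, very few (possibly zero) of the 5000 points will be at or below 1, so the sketch returns zero counts in that case rather than dividing by zero.

```r
set.seed(42)  # assumption: seed fixed only to make the sketch reproducible

# Same data as in the question, but with explicit column names d1..d5
all_data <- data.frame(
  d1 = runif(1000, 0, 10000),
  d2 = runif(1000, 0, 10000),
  d3 = runif(1000, 0, 10000),
  d4 = runif(1000, 0, 10000),
  d5 = runif(1000, 0, 10000)
)

# Long format: 'values' holds the numbers, 'ind' the source column name
long <- stack(all_data)

# Hypothetical helper: percentage of the points <= threshold per source column
share_below <- function(long, threshold) {
  counts <- table(long$ind[long$values <= threshold])
  if (sum(counts) == 0) return(counts)  # no points in range: zero counts
  round(100 * prop.table(counts), 1)
}

share_below(long, 1)    # the range I actually care about
share_below(long, 100)  # a wider range, which is more likely to be non-empty
```

The values column of long contains the same numbers as all_data_v, only in a different order, which should not matter for an ECDF.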
Any help would be greatly appreciated.