
I started with five data frames, one per category. Call them d1, d2, d3, d4, and d5, defined with the code:

d1 <- data.frame(runif(1000, 0, 10000))
d2 <- data.frame(runif(1000, 0, 10000))
d3 <- data.frame(runif(1000, 0, 10000))
d4 <- data.frame(runif(1000, 0, 10000))
d5 <- data.frame(runif(1000, 0, 10000))

I combined these five data frames into one huge data frame:

all_data <- data.frame(d1, d2, d3, d4, d5)

I then flattened this combined data frame into a vector to pass to ecdfPlot:

all_data_v <- as.vector(t(all_data))
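To make the ordering of that flattened vector concrete, here is a small sketch on a stand-in data frame. The names `small`, `flat`, and `src` and the values are hypothetical and chosen only for illustration. It shows that `as.vector(t(...))` reads the data frame row by row, so a matching vector of source labels can be built by repeating the column names once per row.

```r
# Tiny stand-in for all_data: 3 rows, 2 columns (hypothetical values)
small <- data.frame(d1 = c(1, 2, 3), d2 = c(10, 20, 30))

# t() gives a 2 x 3 matrix; as.vector() reads it column by column,
# which corresponds to row by row of the original data frame
flat <- as.vector(t(small))

# Matching source labels: the column names repeated once per original row
src <- rep(names(small), times = nrow(small))

print(flat)  # 1 10 2 20 3 30
print(src)   # "d1" "d2" "d1" "d2" "d1" "d2"
```

Under this ordering, element i of the flattened vector comes from column `((i - 1) %% ncol(small)) + 1`.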

I then created an ecdf plot on a log-log scale:

ecdfPlot(all_data_v, log = "yx", xlim = c(0.01, 1000), ylim = c(0.001, 1))

I am looking at the points less than or equal to 1 and want to determine what percentage of them comes from each data frame. My question is: is there any way to isolate the points less than or equal to 1 and trace them back to their original data frame? In other words, can I find those points and determine whether each came from d1, d2, d3, d4, or d5?

I have tried adding the plot.it=FALSE argument to ecdfPlot, which returns the points that it plots, but the result doesn't tell me where the points come from.
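A minimal sketch of the kind of bookkeeping described above, independent of ecdfPlot and assuming the combined data frame has one named column per source. The helper `share_below` and the example data are hypothetical. It uses base R's `stack()` to produce a long table with a `values` column and an `ind` column holding the source column name, then reports each source's percentage among the values at or below a threshold.

```r
# Hypothetical helper: percentage of values <= threshold contributed by
# each column of df (each column standing for one original data frame)
share_below <- function(df, threshold) {
  long <- stack(df)  # columns: values, ind (factor of source column names)
  low <- long[long$values <= threshold, , drop = FALSE]
  counts <- table(low$ind)  # keeps sources with zero matches
  if (nrow(low) == 0) {
    return(counts)  # nothing at or below the threshold: all counts are 0
  }
  100 * counts / nrow(low)
}

# Illustrative data with named columns (values are made up)
example <- data.frame(
  d1 = c(0.5, 5),
  d2 = c(0.2, 0.9),
  d3 = c(3, 4)
)

res <- share_below(example, threshold = 1)
print(res)
```

With data drawn from runif(1000, 0, 10000), very few or no values are expected at or below 1, so the zero-match branch may well be the one that runs on the data in this question.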

Any help would be greatly appreciated.

sarwoz
  • `data.frame(data.frame(runif(2)),data.frame(runif2))` is ... such a horrible way to make sample data (well, *any* data). How about `set.seed(2); as.data.frame(matrix(runif(1000*5), nc=5))`? – r2evans Jul 26 '19 at 23:18
  • Where is `ecdfPlot` defined? – r2evans Jul 26 '19 at 23:20
  • EnvStats package – kstew Jul 26 '19 at 23:24
  • So are you trying to filter out values <= 1 before making the ecdf plot? It's unclear what you are trying to achieve. – kstew Jul 26 '19 at 23:25
  • No, I'm trying to filter out the x=1 values after making the ecdf plot and finding out where they originally came from – sarwoz Jul 28 '19 at 19:47
