
I am new to R and more used to writing loops in programs such as MATLAB, but given R's strengths and weaknesses I am trying to move away from loops in favor of the apply functions. The only problem is that I am not 100% sure when one should be favored over the other, for instance in the following scenario:

I have a series of data sets that I would like to turn into individual plots. Each plot should zoom in on a region by locating a specific point and then choosing xlim and ylim so that the graph contains 80 or fewer points. This may or may not be the worst way to go about it, but it has worked.

And now for the code (the variable names are German: Weg = distance, Kraft = force, Zeit = time):

#reads the data
mydata <- read.table(file, header = TRUE, skip=52, dec=",")

#establishes what segments of mydata are what

kraft  <- mydata[,2]
weg    <- mydata[,3]
zeit   <- mydata[,1]

#Finds the distance values associated with the maximum force in the plot, which is the area in which I am interested

Weg_Values_at_Fmax <- weg[which(kraft == max(kraft))]


#the next sets of lines initiate the values which will be changed in the while loop
#one to zero, the other to a range of distance values on either side of the distance
#values associated with the maximum force
n <- 0

Weg.index <- which((weg >= Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)] - 
                    Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]*(1/7)  + n) & 
                    weg <= (Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)] + 
                    Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]*(1/7) - n))      

#finally the while loop, which increases n (thereby shrinking Weg.index) until
#Weg.index contains no more than a certain number of points, in this case 80
while(length(Weg.index) > 80){

    n <- n + .0005
    Weg.index <- which((weg >= Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)] - 
                        Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]*(1/6)  + n) & 
                        weg <= (Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)] +
                        Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]*(1/6) - n))

}

Then comes the plotting:

plot(weg, kraft,
     xlim = c(weg[Weg.index[1]], weg[Weg.index[length(Weg.index)]]),
     ylim = range(kraft[Weg.index]),
     main = file)

This code works fine as there is not too much data, but I am hoping that there is a more efficient way to handle such a data request with a while loop for when I need to crunch larger data. I hope that this question was specific enough and that it isn't too off topic or anything else like that. Thank you for your time and help.

joran
  • I should also mention that, yes, I have attempted to answer this question with my own research, but, no, I could not find anything completely satisfactory http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/ http://stackoverflow.com/questions/6342902/for-loops-vs-apply-templates etc – Mason Bowen Oct 17 '14 at 14:37
  • Do you have any appropriate data set which we can use to demonstrate our answer? – Marco Oct 17 '14 at 14:54
  • yes, but I am so new to this site I am not entirely sure what would be the best way to share it, do you have any recommendations? Thank you. also it's saved as a txt file on my computer – Mason Bowen Oct 17 '14 at 15:04
  • Either as R code using for example `matrix`, `data.frame` or `structure` or, if it is too much data, as a weblink. – Marco Oct 17 '14 at 15:08
  • ok will do just gimme a second, tried to cut and paste but uh that went about as well as expected – Mason Bowen Oct 17 '14 at 15:14
  • use `dput()` if its little enough data – Marco Oct 17 '14 at 15:16
  • the problem is that the data is about 1300 rows and 6 columns and when I try to use the insert code function while editing my original comment everything gets distorted (again I am cutting and pasting from txt doc) – Mason Bowen Oct 17 '14 at 15:26
  • Would it be possible to shrink the data for demonstration purposes, e.g. `mydata[1:10, 1:3]`? – Marco Oct 17 '14 at 15:28
  • yes but even when I only have say 300 data points, I'm still too incompetent to upload the data X( I will try Github one more time and see if it works – Mason Bowen Oct 17 '14 at 15:31
  • please try this link: https://github.com/MasonBowen/Data-Example/commit/b1cd70b5832f8127a437b77f71a9b84b55ad0aba I should also add that my original data used commas as decimal indicators and that I converted it back to decimals – Mason Bowen Oct 17 '14 at 15:42
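
For reference, the `dput()` route suggested in the comments could look like the minimal sketch below. The small data frame is made up for illustration, and its column names (`Zeit_s`, `Kraft_N`, `Weg_mm`) are assumptions; with the real data one would call `dput()` on a slice such as `mydata[1:10, 1:3]`.

```r
# Hypothetical stand-in for the first rows of the real data
example <- data.frame(
  Zeit_s  = c(0.0, 0.1, 0.2),
  Kraft_N = c(1.5, 2.5, 2.0),
  Weg_mm  = c(0.00, 0.05, 0.10)
)

# dput() prints R code that rebuilds the object;
# that printed text is what gets pasted into the question
dput(example)

# Round trip: turn the object into text and evaluate the text again
rebuilt <- eval(parse(text = deparse(example)))
all.equal(example, rebuilt)
```

Anyone can then recreate the object by pasting the printed `structure(...)` call into their own session.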

1 Answer


I took the approach of keeping the data together in a data frame and using the quantile() function and its (approximate) inverse, ecdf(), to find the desired range of values. The code below should not depend on the order of the data. It should also behave sensibly if several records share the same "spotlight" Kraft value; note that the spotlight is then the first of the matching rows.

plot.points <- 80

#call spotlight the index of row to highlight
spotlight.kraft <-  max(mydata$Kraft_N)
spotlight <- which(mydata$Kraft_N == spotlight.kraft)[1]

#get the quantiles of the values above/below spotlight.kraft
x.p = c(ecdf(mydata$Weg_mm)(mydata$Weg_mm[spotlight]) - 0.5 *plot.points/nrow(mydata),
        ecdf(mydata$Weg_mm)(mydata$Weg_mm[spotlight]) + 0.5 *plot.points/nrow(mydata))

#adjust for quantiles outside [0,1]
if(x.p[1]<0) x.p <- x.p -x.p[1]
if(x.p[2]>1) x.p <- x.p - x.p[2] + 1

xlims <- quantile(mydata$Weg_mm,x.p)

#possibly not exactly 80 points shown
nrow(subset(mydata, Weg_mm > xlims[1] & Weg_mm<xlims[2]))
#get corresponding kraft values
ylims <- range(subset(mydata, Weg_mm > xlims[1] & Weg_mm<xlims[2])$Kraft_N)

plot(mydata$Weg_mm, mydata$Kraft_N, xlim=xlims, ylim=ylims)
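
To illustrate how ecdf() and quantile() relate, here is a small sketch on made-up distance values (the vector `x` is an assumption, not the real data). With `type = 1`, quantile() inverts the empirical CDF exactly at the data points; the default type interpolates between neighbouring values, which is one reason the window above may not hold exactly 80 points.

```r
# Made-up distance values, purely for illustration
x <- c(0.2, 0.5, 0.9, 1.4, 2.0, 2.7, 3.1, 3.8, 4.4, 5.0)

Fx <- ecdf(x)    # step function: fraction of values <= its argument
p  <- Fx(2.0)    # fraction of x that is <= 2.0

# type = 1 is the inverse of the empirical CDF, so this recovers 2.0
q.inverse <- quantile(x, p, type = 1)

# the default type interpolates, so the result can fall between data points
q.default <- quantile(x, p)
```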

No need to use loops to do it.
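
If the window should hold a fixed number of points, another loop-free option is to rank the rows by their distance from the spotlight and keep the nearest ones. This is a different technique from the quantile approach, shown only as a sketch; the synthetic data frame below stands in for `mydata` and reuses the column names `Weg_mm` and `Kraft_N`.

```r
# Synthetic stand-in for mydata; the real file is not reproduced here
set.seed(1)
mydata <- data.frame(Weg_mm = sort(runif(300, 0, 10)))
mydata$Kraft_N <- 50 - (mydata$Weg_mm - 6)^2 + rnorm(300, sd = 0.1)

plot.points <- 80

# Row with the maximum force (which.max() returns the first match on ties)
spotlight <- which.max(mydata$Kraft_N)

# Rank every row by its distance in Weg_mm from the spotlight row,
# then keep the plot.points nearest rows
dist.to.spot <- abs(mydata$Weg_mm - mydata$Weg_mm[spotlight])
keep <- order(dist.to.spot)[seq_len(min(plot.points, nrow(mydata)))]

# Axis limits covering exactly the kept rows
xlims <- range(mydata$Weg_mm[keep])
ylims <- range(mydata$Kraft_N[keep])
```

The limits can then be passed to plot() as `xlim = xlims, ylim = ylims`. Because order() is vectorised, the cost is a single sort rather than repeated passes over the data.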

As an aside, your original code used this construction quite a bit.

Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]

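max() in R always returns a vector of length one, so as long as only one row attains the maximum force, Weg_Values_at_Fmax is itself a single value and this expression simply returns it. Only when several rows tie for the maximum does it pick out the last of them.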
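
To see what that expression returns when the maximum is tied, here is a quick check on made-up vectors (the values are hypothetical, chosen so that two rows share the maximum):

```r
# Toy force/distance vectors with a tied maximum (made-up values)
kraft <- c(1, 5, 3, 5)
weg   <- c(0.1, 0.2, 0.3, 0.4)

max(kraft)                     # always a single value
which(kraft == max(kraft))     # every position that ties for the maximum
which.max(kraft)               # only the first such position

# The construction from the question: distances at all tied maxima,
# then the last of them
Weg_Values_at_Fmax <- weg[which(kraft == max(kraft))]
last.weg <- Weg_Values_at_Fmax[length(Weg_Values_at_Fmax)]
```

If only one position is needed, `weg[which.max(kraft)]` avoids the indexing step, at the cost of taking the first tie instead of the last.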

vpipkt