
I asked a similar question before, but I have simplified my data format because I'm very new to R and didn't understand what was going on. Here is the link to that question: How to handle more than multiple sets of data in R programming?

I have since changed how my data is laid out and decided to keep it in this format:

X1.0   X X2.0 X.1
 0.9 0.9  0.2 1.2
 1.3 1.4  0.8 1.4

As you can see, I have four columns of data; the real data set has up to 2000 data points. Columns "X1.0" and "X2.0" are time. What I want is the average of "X" and "X.1" every 100 seconds, based on my two time columns "X1.0" and "X2.0". I can do it for one pair using this command:

cuts <- cut(data$X1.0, breaks=seq(0, max(data$X1.0)+400, 400))
by(data$X, cuts, mean)

But this only gives me the average for one data set, namely "X1.0" and "X". How can I get the averages for more than one data set? I would also like to avoid this kind of output:

cuts: (0,400]
[1] 0.7
------------------------------------------------------------ 
cuts: (400,800]
[1] 0.805

Note that this output uses 400 s intervals. What I really want is a list of the averages for the different intervals. Please help. I used data=read.delim("clipboard") to get my data into R.

  • possible duplicate of [R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate vs](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) – mnel Feb 21 '13 at 04:33
  • The biggest problem for you is the structure of your data. Use `dump(head(data, 10), "")` and post it in here. Whenever you place a question, be sure we can reproduce it. Share a bit of your data and we'll deal with it. Afterwards, [this post](http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega) can help you, just as @mnel suggested. – Oscar de León Feb 21 '13 at 12:55

1 Answer


It is not entirely clear what output you want.

First I rename the columns, but this is optional:

colnames(dat) <- c('t1','v1','t2','v2')

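Then I will use ave, which is like by but returns a vector the same length as its input, with each element replaced by its group mean. I use a small trick: a matrix whose columns index each pair of data columns: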

matrix(1:ncol(dat),ncol=2)  ## column 1 indexes data columns 1 and 2, column 2 indexes 3 and 4
     [,1] [,2]
[1,]    1    3
[2,]    2    4
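
To make the indexing trick concrete, here is a minimal sketch. The toy data frame is invented for illustration, and it uses `nrow = 2` instead of `ncol = 2`, which gives the same matrix for four columns but should also keep the pairs together if there are more than two (time, value) pairs. That change is my assumption, not part of the answer.

```r
# Toy data, invented for illustration: two (time, value) pairs side by side.
dat <- data.frame(t1 = c(1, 5, 12), v1 = c(0.9, 1.4, 1.7),
                  t2 = c(2, 11, 14), v2 = c(1.2, 1.4, 1.1))

# Each column of idx holds the positions of one (time, value) pair.
idx <- matrix(seq_len(ncol(dat)), nrow = 2)

# apply(idx, 2, f) calls f once per pair; here f only reports the column names.
pairs <- apply(idx, 2, function(x) names(dat)[x])
print(pairs)
```

Each column of `pairs` names one time column followed by its value column, which is exactly what the function passed to `apply` receives as `x`.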

Then I use this matrix with apply. Here is the entire solution:

cbind(dat,
      apply(matrix(1:ncol(dat),ncol=2),2,
            function(x,by=10){        ## by 10 seconds! You can replace this
                                      ## with 100 or 400 in your real data
              t.col <- dat[,x][,1]    ## time column of the pair
              v.col <- dat[,x][,2]    ## value column of the pair
              ## extend the breaks past max(t.col) so the last rows get a bin
              ave(v.col,
                  cut(t.col, breaks=seq(0, max(t.col)+by, by)),
                  FUN=mean)})
)
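
If what is wanted is one mean per interval rather than a column aligned with the original rows, `tapply` may be a better fit than `ave`. The following is only a sketch under assumptions: `interval_means` and its `width` argument are hypothetical names of mine, the data are the first seven rows of the sample data printed in this answer, and the columns are assumed to alternate time, value, time, value.

```r
# First seven rows of the sample data printed in this answer.
dat <- data.frame(t1 = c(0.9, 1.3, 2.0, 2.6, 9.7, 10.7, 11.6),
                  v1 = c(0.9, 1.4, 1.7, 1.9, 1.0, 0.8, 1.5),
                  t2 = c(0.2, 0.8, 1.6, 2.2, 2.8, 3.5, 4.1),
                  v2 = c(1.2, 1.4, 1.1, 1.6, 1.3, 1.1, 1.8))

# Hypothetical helper: mean of `value` within consecutive `width`-second intervals.
interval_means <- function(time, value, width = 10) {
  breaks <- seq(0, max(time) + width, by = width)  # extend past the last time
  bins <- cut(time, breaks = breaks, include.lowest = TRUE)
  tapply(value, bins, mean)                        # one mean per interval
}

# One named vector of means per (time, value) pair.
idx <- matrix(seq_len(ncol(dat)), nrow = 2)
result <- lapply(seq_len(ncol(idx)), function(j) {
  interval_means(dat[[idx[1, j]]], dat[[idx[2, j]]])
})
names(result) <- names(dat)[idx[2, ]]
print(result)
```

Each element of `result` is a vector named by interval, which avoids the `by()` style printout from the question. Intervals that contain no observations come back as `NA`.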

EDIT: corrected the grouping and simplified the code:

cbind(dat,
      apply(matrix(1:ncol(dat),ncol=2),2,
            function(x,by=10)ave(dat[,x][,2], dat[,x][,1] %/% by)))
   X1.0   X X2.0 X.1    1   2
1   0.9 0.9  0.2 1.2 1.38 1.3
2   1.3 1.4  0.8 1.4 1.38 1.3
3   2.0 1.7  1.6 1.1 1.38 1.3
4   2.6 1.9  2.2 1.6 1.38 1.3
5   9.7 1.0  2.8 1.3 1.38 1.3
6  10.7 0.8  3.5 1.1 1.65 1.3
7  11.6 1.5  4.1 1.8 1.65 1.3
8  12.1 1.4  4.7 1.2 1.65 1.3
9  12.6 1.8  5.4 1.2 1.65 1.3
10 13.2 2.1  6.3 1.3 1.65 1.3
11 13.7 1.6  6.9 1.1 1.65 1.3
12 14.2 2.2  9.4 1.3 1.65 1.3
13 14.6 1.8 10.0 1.5 1.65 1.5
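
One detail worth knowing when choosing between `%/%` and `cut`: they treat interval boundaries differently, which is why row 13 above (second time column equal to 10.0) forms its own group. A small sketch, with time values I picked for illustration:

```r
# Time values chosen for illustration, including the boundaries 10 and 20.
times <- c(0.9, 9.7, 10.0, 10.7, 20.0)

# Integer division: bins are [0,10), [10,20), ... so 10.0 starts the second bin.
by_div <- times %/% 10

# cut(): default bins are (0,10], (10,20], ... so 10.0 still ends the first bin.
by_cut <- cut(times, breaks = seq(0, 20, by = 10))

print(data.frame(times, by_div, by_cut))
```

Either convention works for averaging; the point is only that a time landing exactly on a boundary is assigned to a different interval depending on which one is used.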
agstudy
  • Thanks for the answer, agstudy, but when I type this code, is it all on a single line? – Marco De Niro Feb 21 '13 at 04:42
  • @MarcoDeNiro there was a `browser` line in my solution; I removed it. Copy and paste the block beginning with `cbind`; it should work if your data.frame is called dat. – agstudy Feb 21 '13 at 04:46
  • Thanks again, but could you tell me why I'm getting this error? Error: unexpected symbol in "cbind(data,apply(matrix(1:ncol(data),ncol=2),2,function(x,by=10){t.col <- data[,x][,1]v.col" – Marco De Niro Feb 21 '13 at 05:02
  • @MarcoDeNiro I added my dat. Test the solution with this data. – agstudy Feb 21 '13 at 05:06
  • I'm getting a new error this time, which is weird: Error in cut(t.col, breaks = seq(0, max(t.col), by)) : object 't.col' not found – Marco De Niro Feb 21 '13 at 05:36
  • @agstudy, you should modify the "breaks" in your `cut` argument. Maybe something like `breaks = seq(0, signif(max(t.col), digits = 1) + by`. Notice that rows 6-13 in your current output for the first column are the same as the input because the `cut` that you use doesn't capture them as belonging in the same group. – A5C1D2H2I1M1N2O1R2T1 Feb 21 '13 at 12:19
  • @AnandaMahto thanks. This part of the OP's question is really confusing: he talks about every 100 s and then writes 400. Anyway, I simplified my answer. – agstudy Feb 21 '13 at 13:03