
This is my dataframe SHORT:

ID  IDaxis  Y   Date-Time            Tdiff
1   1       5   2012-06-11 13:10:30  0.00
1   1       10  2012-06-11 15:10:30  2.00
1   1       20  2012-06-11 17:10:30  2.00
1   3       15  2012-06-11 13:20:30  0.00
1   3       30  2012-06-11 14:20:30  1.00
1   3       45  2012-06-11 17:20:30  3.00
1   6       9   2012-06-11 13:35:30  0.00
1   6       15  2012-06-11 15:35:30  2.00
1   6       30  2012-06-11 18:35:30  3.00
3   2       8   2012-06-11 13:50:30  0.00
3   2       14  2012-06-11 14:55:30  1.083
3   2       20  2012-06-11 16:55:30  2.00
3   2       30  2012-06-11 19:00:30  2.083
3   5       10  2012-06-11 13:40:30  0.00
3   5       15  2012-06-11 16:45:30  3.083

ID - plant
IDaxis - leaf of the plant
Y - length of the leaf
Date-Time - date and time of the measurement
Tdiff - time interval (h) since the previous measurement

This is what I want to do (see Example SHORT1):
1) cumulatively sum Tdiff within each IDaxis, in column SHORT$Ttot
2) calculate the difference in Y between consecutive rows within each IDaxis, in column SHORT$Ydiff
3) cumulatively sum Ydiff within each IDaxis, in column SHORT$Ytot

Example SHORT1:

[image: expected SHORT1 table with the added columns Ttot, Ydiff and Ytot]

Ydiff - change in length since the previous measurement
Ytot - running sum of the length changes from measurement to measurement
Ttot - running sum of the time intervals from measurement to measurement

I know how to calculate this for a single IDaxis if I split the data frame. My problem is that I have three data frames, each with 700 IDs, and each ID has 100 IDaxis values. I don't know how to do this automatically for a whole data frame. Thank you in advance.

joran
barley81
  • You are looking for the functions `diff` and `cumsum` and `plyr::ddply` (or one of its contenders). If you had [presented your data in a way I could copy into my R session](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) I would have shown you how to use these functions. – Roland Dec 13 '12 at 11:45
  • Thank you, that would be great. I am starting to edit my data so that you can copy my data frame. – barley81 Dec 13 '12 at 14:39
  • This is a mess. How can I do this better? – barley81 Dec 13 '12 at 14:59
  • Plenty of code for just this problem already: http://stackoverflow.com/questions/7790197/mean-of-one-column-based-of-level-of-other-columns-in-loops http://stackoverflow.com/questions/9382436/calculate-conditional-means-only-based-on-one-column-in-r http://stackoverflow.com/questions/5648763/how-to-calculate-ranking-of-one-column-based-on-groups-defined-by-another-column – Roman Luštrik Dec 13 '12 at 15:12

2 Answers


You can use ave:

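# Group by plant and leaf; diff() is padded with a leading 0 so each group keeps its length
SHORT$Ttot  <- ave(SHORT$Tdiff, SHORT$ID, SHORT$IDaxis, FUN = cumsum)
SHORT$Ydiff <- ave(SHORT$Y,     SHORT$ID, SHORT$IDaxis, FUN = function(y) c(0, diff(y)))
SHORT$Ytot  <- ave(SHORT$Ydiff, SHORT$ID, SHORT$IDaxis, FUN = cumsum)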

(If you don't like the repeated SHORT$, look at functions like transform.)
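As a rough sketch of the transform idea, here is a self-contained version on a small subset of the question's data (two leaves of plant 1 and one leaf of plant 3). The helper name diff0 is made up for this example and is not part of base R. It assumes the first Ydiff of each leaf should be 0, in line with the leading 0.00 in Tdiff.

```r
# Toy subset of the question's data: two leaves of plant 1, one leaf of plant 3.
SHORT <- data.frame(
  ID     = c(1, 1, 1, 1, 1, 1, 3, 3),
  IDaxis = c(1, 1, 1, 3, 3, 3, 5, 5),
  Y      = c(5, 10, 20, 15, 30, 45, 10, 15),
  Tdiff  = c(0, 2, 2, 0, 1, 3, 0, 3.083)
)

# Hypothetical helper: first difference padded with a leading 0,
# so the result has one value per input row.
diff0 <- function(x) c(0, diff(x))

# transform() lets us drop the repeated SHORT$ inside the call.
SHORT <- transform(SHORT,
  Ttot  = ave(Tdiff, ID, IDaxis, FUN = cumsum),
  Ydiff = ave(Y,     ID, IDaxis, FUN = diff0)
)

# Ytot depends on Ydiff, which only exists once transform() has returned,
# so it is added in a separate step.
SHORT$Ytot <- ave(SHORT$Ydiff, SHORT$ID, SHORT$IDaxis, FUN = cumsum)

print(SHORT)
```

Ytot is computed separately because transform evaluates all of its arguments against the original columns, so a column created in the same call is not yet visible.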

You can also use the convenient plyr package:

library(plyr)
# mutate (unlike transform) lets Ytot use the Ydiff column created just before it
ddply(SHORT, c("ID", "IDaxis"), mutate, Ttot  = cumsum(Tdiff),
                                        Ydiff = c(0, diff(Y)),
                                        Ytot  = cumsum(Ydiff))
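If plyr is not available, the same split-apply-combine pattern can be sketched in base R. This is only an approximation of what ddply does, shown on a small subset of the question's data (leaf 6 of plant 1 and leaf 2 of plant 3). The names pieces and SHORT1 are chosen for this example.

```r
# Toy subset of the question's data: leaf 6 of plant 1, leaf 2 of plant 3.
SHORT <- data.frame(
  ID     = c(1, 1, 1, 3, 3, 3, 3),
  IDaxis = c(6, 6, 6, 2, 2, 2, 2),
  Y      = c(9, 15, 30, 8, 14, 20, 30),
  Tdiff  = c(0, 2, 3, 0, 1.083, 2, 2.083)
)

# Split: one data frame per (plant, leaf) pair.
# drop = TRUE skips combinations that do not occur in the data.
pieces <- split(SHORT, list(SHORT$ID, SHORT$IDaxis), drop = TRUE)

# Apply: add the three columns within each piece.
pieces <- lapply(pieces, function(d) {
  d$Ttot  <- cumsum(d$Tdiff)
  d$Ydiff <- c(0, diff(d$Y))   # leading 0 keeps one value per row
  d$Ytot  <- cumsum(d$Ydiff)
  d
})

# Combine: stack the pieces back into a single data frame.
SHORT1 <- do.call(rbind, pieces)
rownames(SHORT1) <- NULL

print(SHORT1)
```

Note that the rows come back ordered by group rather than in their original order, so compare results per ID and IDaxis rather than by row position.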
flodel
  • Great. It is working! Thank you very much. I started working with R around 4 weeks ago, but I have a lot of other things on my mind. I will have time to focus only on R in January and February. – barley81 Dec 13 '12 at 15:29

Use aggregate() and merge().

Here is a link on aggregate: http://www.statmethods.net/management/aggregate.html
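The answer gives no details, so the following is only one possible reading of it, on a small subset of the question's data. The column name Tsum and the result name SHORT2 are made up for this example. Note that aggregate returns one total per leaf, so merging it back gives each row its leaf's overall total, not the running totals (Ttot, Ytot) the question asks for. Those still need cumsum within groups.

```r
# Toy subset of the question's data: two leaves of plant 1, one leaf of plant 3.
SHORT <- data.frame(
  ID     = c(1, 1, 1, 1, 1, 1, 3, 3),
  IDaxis = c(1, 1, 1, 3, 3, 3, 5, 5),
  Y      = c(5, 10, 20, 15, 30, 45, 10, 15),
  Tdiff  = c(0, 2, 2, 0, 1, 3, 0, 3.083)
)

# One row per (plant, leaf) pair with the summed time intervals.
totals <- aggregate(Tdiff ~ ID + IDaxis, data = SHORT, FUN = sum)
names(totals)[names(totals) == "Tdiff"] <- "Tsum"  # hypothetical column name

# Attach each leaf's total to every measurement of that leaf.
SHORT2 <- merge(SHORT, totals, by = c("ID", "IDaxis"))

print(SHORT2)
```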

Dirk N