0

Thanks in advance!

I have time series data that was collected every five minutes, the head() of which looks like…

PumaID CollarID  Type  GMT_Date    GMT_Time  LMT_Date    LMT_Time ActivityX
1    P01     2905 Argos  1/1/2000 12:15:00 AM  1/1/2000 12:00:00 AM         0
2    P01     2905 Argos  1/1/2000 12:25:00 AM  1/1/2000 12:00:00 AM         0
3    P01     2905 Argos  1/1/2000 12:00:00 AM  1/1/2000 12:00:00 AM         0
4    P01     2905 Argos 2/21/2011  9:15:00 PM 2/21/2011  2:15:00 PM         0
5    P01     2905 Argos 2/21/2011  9:20:00 PM 2/21/2011  2:20:00 PM        18
6    P01     2905 Argos 2/21/2011  9:25:00 PM 2/21/2011  2:25:00 PM        14
  ActivityY ActivitySum DayNight Temp
1         0           0    Night   22
2         0           0    Night   22
3         0           0    Night   21
4         0           0      Day   21
5        21          39      Day   20
6        15          29      Day   21

I need to aggregate the ActivitySum column into 12 hour intervals. Using the code below I read in the table, changed the date column to the correct format, and aggregated the data by day.

P01 <- read.csv( "ActDtaP01_ALL_Temp.csv" )
date <- as.Date(P01$GMT_Date, "%m/%d/%Y")
new <- aggregate(P01, by = list(date), mean)

Resulting in this (below). My specific questions are:

     Group.1 PumaID CollarID Type GMT_Date GMT_Time LMT_Date LMT_Time ActivityX
1 2000-01-01     NA     2905   NA       NA       NA       NA       NA  0.000000
2 2011-02-21     NA     2905   NA       NA       NA       NA       NA  8.727273
3 2011-02-22     NA     2905   NA       NA       NA       NA       NA  0.000000
4 2011-02-23     NA     2905   NA       NA       NA       NA       NA  0.000000
5 2011-02-24     NA     2905   NA       NA       NA       NA       NA  0.000000
6 2011-02-25     NA     2905   NA       NA       NA       NA       NA  0.000000
  ActivityY ActivitySum DayNight       Temp
1  0.000000     0.00000       NA 21.6666667
2  9.060606    17.78788       NA 12.6969697
3  0.000000     0.00000       NA -2.8521127
4  0.000000     0.00000       NA -1.4471831
5  0.000000     0.00000       NA  0.3485915
6  0.000000     0.00000       NA  1.3368421

1) How can I further subset this into 12 hr intervals within each day (24 hr period) resulting something like..

Group.1    Group.2  PumaID  CollarID    etc…
2/21/2011   Day      P01       …    
2/21/2011   Night    P01       …    
2/22/2011   Day      P01       …    
2/22/2011   Night    P01      

2) How do I keep all the column values in the data table rather then returning an NA if the FUN argument (mean in this case) could not be computed?

Thanks again!

B. Davis
  • 3,391
  • 5
  • 42
  • 78
  • Try to replace aggregate(P01, by = list(date), mean) with aggregate(P01, by = list(date, DayNight), mean). Also read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example before posting. It makes things harder if we cannot just copy-paste or load your data. – zero323 Sep 14 '13 at 03:10

1 Answers1

0

You don't have a particularly good dataset to test code since you have only am cases for the first data and pm cases for the second, but this will do a two-way classification by date and am/pm indicator in the time column. I also removed all the non-numeric columns from consideration, since it makes no sense to ask for the mean of a factor.

 new2 <- aggregate(dat[unlist(lapply(dat, is.numeric))], 
                by = list(date, gsub("^.+ ", "", dat$GMT_Time) ), mean)
new2
     Group.1 Group.2 CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01      AM     2905   0.00000         0     0.00000 21.66667
2 2011-02-21      PM     2905  10.66667        12    22.66667 20.66667

The gsub call is removing any character between the beginning of the string and the last instance of a space. Your second request might be best accomplished by adding ID variables to the by list.

> new <- aggregate(dat[unlist(lapply(dat, is.numeric))],
                   by = list(Date=date, 
                             AMPM= gsub("^.+ ", "", dat$GMT_Time),
                             Type=dat$Type ), mean)
> new
        Date AMPM  Type CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01   AM Argos     2905   0.00000         0     0.00000 21.66667
2 2011-02-21   PM Argos     2905  10.66667        12    22.66667 20.66667
IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • Good thoughts on adding ID variables to the by list! That worked great. – B. Davis Sep 14 '13 at 20:46
  • As long as you want all of the "ID variables" to be in separate strata then adding them to the by list is the right way to go. – IRTFM Sep 14 '13 at 23:46