Trying to aggregate (and average) time series data (collected every five minutes) into day and night periods for each 24 hr period in R

Question

Thanks in advance!

I have time series data that was collected every five minutes, the head() of which looks like…

PumaID CollarID  Type  GMT_Date    GMT_Time  LMT_Date    LMT_Time ActivityX
1    P01     2905 Argos  1/1/2000 12:15:00 AM  1/1/2000 12:00:00 AM         0
2    P01     2905 Argos  1/1/2000 12:25:00 AM  1/1/2000 12:00:00 AM         0
3    P01     2905 Argos  1/1/2000 12:00:00 AM  1/1/2000 12:00:00 AM         0
4    P01     2905 Argos 2/21/2011  9:15:00 PM 2/21/2011  2:15:00 PM         0
5    P01     2905 Argos 2/21/2011  9:20:00 PM 2/21/2011  2:20:00 PM        18
6    P01     2905 Argos 2/21/2011  9:25:00 PM 2/21/2011  2:25:00 PM        14
  ActivityY ActivitySum DayNight Temp
1         0           0    Night   22
2         0           0    Night   22
3         0           0    Night   21
4         0           0      Day   21
5        21          39      Day   20
6        15          29      Day   21

I need to aggregate the ActivitySum column into 12 hour intervals. Using the code below I read in the table, changed the date column to the correct format, and aggregated the data by day.

P01 <- read.csv( "ActDtaP01_ALL_Temp.csv" )
date <- as.Date(P01$GMT_Date, "%m/%d/%Y")
new <- aggregate(P01, by = list(date), mean)

Resulting in this (below). My specific questions are:

     Group.1 PumaID CollarID Type GMT_Date GMT_Time LMT_Date LMT_Time ActivityX
1 2000-01-01     NA     2905   NA       NA       NA       NA       NA  0.000000
2 2011-02-21     NA     2905   NA       NA       NA       NA       NA  8.727273
3 2011-02-22     NA     2905   NA       NA       NA       NA       NA  0.000000
4 2011-02-23     NA     2905   NA       NA       NA       NA       NA  0.000000
5 2011-02-24     NA     2905   NA       NA       NA       NA       NA  0.000000
6 2011-02-25     NA     2905   NA       NA       NA       NA       NA  0.000000
  ActivityY ActivitySum DayNight       Temp
1  0.000000     0.00000       NA 21.6666667
2  9.060606    17.78788       NA 12.6969697
3  0.000000     0.00000       NA -2.8521127
4  0.000000     0.00000       NA -1.4471831
5  0.000000     0.00000       NA  0.3485915
6  0.000000     0.00000       NA  1.3368421

1) How can I further subset this into 12 hr intervals within each day (24 hr period) resulting something like..

Group.1    Group.2  PumaID  CollarID    etc…
2/21/2011   Day      P01       …    
2/21/2011   Night    P01       …    
2/22/2011   Day      P01       …    
2/22/2011   Night    P01

2) How do I keep all the column values in the data table rather then returning an NA if the FUN argument (mean in this case) could not be computed?

Thanks again!

Try to replace aggregate(P01, by = list(date), mean) with aggregate(P01, by = list(date, DayNight), mean). Also read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example before posting. It makes things harder if we cannot just copy-paste or load your data. — zero323, Sep 14 '13 at 03:10

IRTFM · Accepted Answer · 2013-09-14T17:25:57.803

You don't have a particularly good dataset to test code since you have only am cases for the first data and pm cases for the second, but this will do a two-way classification by date and am/pm indicator in the time column. I also removed all the non-numeric columns from consideration, since it makes no sense to ask for the mean of a factor.

 new2 <- aggregate(dat[unlist(lapply(dat, is.numeric))], 
                by = list(date, gsub("^.+ ", "", dat$GMT_Time) ), mean)
new2
     Group.1 Group.2 CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01      AM     2905   0.00000         0     0.00000 21.66667
2 2011-02-21      PM     2905  10.66667        12    22.66667 20.66667

The gsub call is removing any character between the beginning of the string and the last instance of a space. Your second request might be best accomplished by adding ID variables to the by list.

> new <- aggregate(dat[unlist(lapply(dat, is.numeric))],
                   by = list(Date=date, 
                             AMPM= gsub("^.+ ", "", dat$GMT_Time),
                             Type=dat$Type ), mean)
> new
        Date AMPM  Type CollarID ActivityX ActivityY ActivitySum     Temp
1 2000-01-01   AM Argos     2905   0.00000         0     0.00000 21.66667
2 2011-02-21   PM Argos     2905  10.66667        12    22.66667 20.66667

Good thoughts on adding ID variables to the by list! That worked great. — B. Davis, Sep 14 '13 at 20:46
As long as you want all of the "ID variables" to be in separate strata then adding them to the by list is the right way to go. — IRTFM, Sep 14 '13 at 23:46

Trying to aggregate (and average) time series data (collected every five minutes) into day and night periods for each 24 hr period in R

1 Answers1

Linked