I have a data file that needs to be averaged.

data<-data.frame(
    Data=seq(
        from=as.POSIXct("2014-04-01 00:00:00"), 
        to=as.POSIXct("2014-04-03 00:00:00"), 
        by ="5 min"
    ),
    value=rnorm(577,0,1)
)

I need to find the average of "value" from 05:00:00 to 17:00:00 and then from 17:00:00 to 05:00:00 of the following day. For example, from 2014-04-01 05:00:00 to 2014-04-01 17:00:00, and then from 2014-04-01 17:00:00 to 2014-04-02 05:00:00.

The real data is not continuous and is missing several intervals. I can do it for the same day, but I don't know how to include the time from the following day.

MrFlick
Bhante

2 Answers

Here's one strategy. You can use cut() and seq() (which dispatch to cut.POSIXt and seq.POSIXt for date-times) to create an interval factor with breaks every 12 hours, and then use that factor to take the mean of each interval.

intervals<-cut(
    data$Data, 
    breaks=seq(
        as.POSIXct("2014-03-31 17:00:00"), 
        as.POSIXct("2014-04-03 5:00:00"), 
        by="12 hours"
    )
)

means<-tapply(data$value, intervals, mean)

as.data.frame(means)
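
Since the real data has gaps, it may help to see that the interval factor is assigned per timestamp, so missing rows only reduce the number of readings in an interval. The following is a minimal sketch under assumptions that are not in the question: the data frame `d`, the fixed `tz = "UTC"`, the seed, and the dropped rows 100 to 150 are all made up for illustration.

```r
# Hypothetical data: 5-minute readings with a block removed to mimic gaps
set.seed(1)
d <- data.frame(
    Data = seq(
        from = as.POSIXct("2014-04-01 00:00:00", tz = "UTC"),
        to   = as.POSIXct("2014-04-03 00:00:00", tz = "UTC"),
        by   = "5 min"
    )
)
d$value <- rnorm(nrow(d))
d <- d[-(100:150), ]   # drop 51 readings (08:15 to 12:25 on 2014-04-01)

# Breaks every 12 hours, starting at 17:00 on the day before the first reading
breaks <- seq(
    as.POSIXct("2014-03-31 17:00:00", tz = "UTC"),
    as.POSIXct("2014-04-03 05:00:00", tz = "UTC"),
    by = "12 hours"
)

# Each timestamp falls into a left-closed interval [start, next start)
d$interval <- cut(d$Data, breaks = breaks)

means  <- tapply(d$value, d$interval, mean)
counts <- tapply(d$value, d$interval, length)

# One row per 12-hour interval, labelled by its start time
result <- data.frame(
    start = names(means),
    n     = as.vector(counts),
    mean  = as.vector(means)
)
print(result)
```

Reporting `n` next to each mean makes it easier to spot intervals where only a few readings survived. An interval with no readings at all shows up as `NA`.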
MrFlick

Here is a way:

# %H is the hour of day (00-23); >= keeps the 05:00 and 17:00 hours,
# which strict > comparisons would drop from both subsets
hour <- as.numeric(strftime(data$Data, "%H"))

day <- data[hour >= 5 & hour < 17, ]

night <- data[hour < 5 | hour >= 17, ]

strftime returns a character vector, which is why it is nested inside as.numeric here. From there it is just indexing.
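
Indexing by hour alone pools every day together, whereas the question asks for one average per individual day and night period. One possible way to get that, sketched here under assumptions, is to shift each timestamp back by 5 hours so that the after-midnight part of a night gets the same date as its evening part. The names `d`, `hour`, `period`, and `start_date` are hypothetical, the time zone is fixed to UTC, and the fixed 5-hour shift assumes there is no daylight-saving change in the data.

```r
# Hypothetical data in the same shape as the question's
set.seed(1)
d <- data.frame(
    Data = seq(
        from = as.POSIXct("2014-04-01 00:00:00", tz = "UTC"),
        to   = as.POSIXct("2014-04-03 00:00:00", tz = "UTC"),
        by   = "5 min"
    )
)
d$value <- rnorm(nrow(d))

# Hour of day as a number; format() returns character, hence as.numeric()
hour <- as.numeric(format(d$Data, "%H", tz = "UTC"))

# 05:00 up to (not including) 17:00 is "day", everything else is "night"
d$period <- ifelse(hour >= 5 & hour < 17, "day", "night")

# Subtracting 5 hours maps 05:00 to 00:00, so a night that runs past
# midnight is labelled with the date on which it started
d$start_date <- format(d$Data - 5 * 3600, "%Y-%m-%d", tz = "UTC")

# One mean per (start date, period) combination
res <- aggregate(value ~ start_date + period, data = d, FUN = mean)
print(res)
```

Because the grouping is computed row by row, missing intervals in the real data only reduce how many readings go into each mean.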

Jota