
I have a data frame that looks like this:

            date_time loc_id node  energy   kgco2 
1 2009-02-27 00:11:08     87  103 0.00000 0.00000 
2 2009-02-27 01:05:05     87  103 7.00000 3.75900 
3 2009-02-27 02:05:05     87  103 6.40039 3.43701 
4 2009-02-27 03:05:05     87  103 4.79883 2.57697 
5 2009-02-27 04:05:05     87  103 4.10156 2.20254 
6 2009-02-27 05:05:05     87  103 2.59961 1.39599

Is there any way to subset it by a time range, for example 2am to 5am? The result should look like this:

            date_time loc_id node  energy   kgco2  
3 2009-02-27 02:05:05     87  103 6.40039 3.43701 
4 2009-02-27 03:05:05     87  103 4.79883 2.57697 
5 2009-02-27 04:05:05     87  103 4.10156 2.20254 
lmo
Wet Feet

4 Answers


One way to do it is to use lubridate and define an interval:

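library(lubridate)

# Bounds of the window to keep (%within% includes both ends)
date1 <- as.POSIXct("2009-02-27 02:00:00")
date2 <- as.POSIXct("2009-02-27 05:00:00")
int <- interval(date1, date2)  # interval() supersedes the older new_interval()

# The timestamp column in the question is named date_time
df[df$date_time %within% int,]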
juba
Just to update this great answer: `new_interval()` is now deprecated. Use `interval()` instead. – BSP Apr 06 '17 at 13:19

I'd use the lubridate package and the hour() function to make your life easier...

library(lubridate)

# Keep rows whose hour is 2, 3, or 4
with(df, df[hour(date_time) >= 2 & hour(date_time) < 5, ])

#            date_time loc_id node  energy   kgco2
#3 2009-02-27 02:05:05     87  103 6.40039 3.43701
#4 2009-02-27 03:05:05     87  103 4.79883 2.57697
#5 2009-02-27 04:05:05     87  103 4.10156 2.20254
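
Note that filtering on hour() looks only at the hour component, so it keeps the 2am to 5am window on every date in the data, whereas an interval covers one specific day. The sketch below illustrates this under some assumptions: the second day's rows are hypothetical values made up for illustration, and tz = "UTC" is chosen only so the example behaves the same on any machine.

```r
library(lubridate)

# Hypothetical two-day sample; the 2009-02-28 rows are invented for illustration.
df <- data.frame(
  date_time = as.POSIXct(c("2009-02-27 01:05:05", "2009-02-27 02:05:05",
                           "2009-02-27 04:05:05", "2009-02-28 03:05:05",
                           "2009-02-28 05:05:05"), tz = "UTC"),
  energy = c(7.00000, 6.40039, 4.10156, 4.79883, 2.59961)
)

# Keep rows whose hour component is 2, 3, or 4, whatever the date.
night <- df[hour(df$date_time) >= 2 & hour(df$date_time) < 5, ]
night
```

Here night should hold three rows: two from 2009-02-27 and one from 2009-02-28.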
Simon O'Hanlon

I would suggest using the xts package for time series analysis. It has very convenient subsetting functions.

DF
##             date_time loc_id node  energy   kgco2
## 1 2009-02-27 00:11:08     87  103 0.00000 0.00000
## 2 2009-02-27 01:05:05     87  103 7.00000 3.75900
## 3 2009-02-27 02:05:05     87  103 6.40039 3.43701
## 4 2009-02-27 03:05:05     87  103 4.79883 2.57697
## 5 2009-02-27 04:05:05     87  103 4.10156 2.20254
## 6 2009-02-27 05:05:05     87  103 2.59961 1.39599

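library(xts)
# Index the remaining columns by date_time (must be a time-based class such as POSIXct)
XTSDATA <- xts(DF[, -1], DF[, 1])
# Time-of-day range: rows between 02:00:00 and 05:00:00 on any date
XTSDATA["T02:00:00/T05:00:00"]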
##                     loc_id node  energy   kgco2
## 2009-02-27 02:05:05     87  103 6.40039 3.43701
## 2009-02-27 03:05:05     87  103 4.79883 2.57697
## 2009-02-27 04:05:05     87  103 4.10156 2.20254
CHP
  • Use the lubridate::hour function to extract the hour number (lubridate::hours creates a period instead).
  • Then use the dplyr::filter function to get the result.
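
The two steps above might look like the following minimal sketch. The data frame is rebuilt from the question's sample; the name result and the tz = "UTC" setting are assumptions added here so the example is self-contained.

```r
library(dplyr)
library(lubridate)

# Sample data copied from the question; tz = "UTC" is an assumption.
df <- data.frame(
  date_time = as.POSIXct(c("2009-02-27 00:11:08", "2009-02-27 01:05:05",
                           "2009-02-27 02:05:05", "2009-02-27 03:05:05",
                           "2009-02-27 04:05:05", "2009-02-27 05:05:05"),
                         tz = "UTC"),
  loc_id = 87,
  node   = 103,
  energy = c(0.00000, 7.00000, 6.40039, 4.79883, 4.10156, 2.59961),
  kgco2  = c(0.00000, 3.75900, 3.43701, 2.57697, 2.20254, 1.39599)
)

# Step 1: hour() extracts the hour number; step 2: filter() keeps hours 2 to 4.
result <- df %>%
  filter(hour(date_time) >= 2, hour(date_time) < 5)
result
```

Conditions separated by commas inside filter() are combined with a logical AND, so this keeps the 02:05:05, 03:05:05, and 04:05:05 rows.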
Jiaxiang