A simple question: I know how to subset time series in xts for years, months and days from the help: x['2000-05/2001'] and so on.

But how can I subset my data by hour of the day? I would like to get all data between 07:00 am and 06:00 pm, i.e., I want to extract the data during business hours regardless of the day (I will take care of weekends later on). The help has an example of the form:

.parseISO8601('T08:30/T15:00')

But this does not work in my case. Does anybody have a clue?

Joshua Ulrich
Richi W
  • Can you please give a reproducible example? – agstudy Dec 17 '12 at 11:29
  • If your `xts` object is called `x` then something like `y <- x["T09:30/T11:00"]` works for me to get a slice of the morning session, for example. – SlowLearner Dec 17 '12 at 11:31
  • @agstudy `sample.time = timeDate('2012-01-01 00:00:00') + 15*60*(1:500); data = 1:500; data.ts = xts(data, order.by = sample.time); data.ts["T09:30/T11:00"]` – Richi W Dec 17 '12 at 12:13
  • @SlowLearner You are right, it simply works. I was confused because I had used it on the time index and not on the xts object. Applied to the object, it works: `data.ts["T09:30/T11:00"]` works, but `sample.time["T09:30/T11:00"]` does not. – Richi W Dec 17 '12 at 12:15
  • @SlowLearner ... I would accept, if your comment were an answer. ... – Richi W Dec 17 '12 at 12:16

2 Answers

If your xts object is called x then something like y <- x["T09:30/T11:00"] works for me to get a slice of the morning session, for example.
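As a minimal sketch of this for the business-hours case in the question (assuming the xts package is installed; the object names and the 15-minute sample data are made up for illustration), the time-of-day range goes inside the brackets of the xts object itself, not of its time index:

```r
library(xts)

# Illustrative data: 500 observations at 15-minute steps, in UTC
idx <- as.POSIXct("2012-01-01 00:00:00", tz = "UTC") + 15 * 60 * (1:500)
x <- xts(1:500, order.by = idx)

# All observations between 07:00 and 18:00, on every day in the series
business <- x["T07:00/T18:00"]

# Inspect which hours of the day survive the subset
hours <- as.POSIXlt(index(business))$hour
range(hours)
```

The same string applied to `idx` directly would not work, because a plain POSIXct vector does not understand ISO 8601 range subsetting; that is a feature of the xts `[` method.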

SlowLearner

For some reason, subsetting an xts object by time of day with x["T09:30/T11:00"] is pretty slow. I used the methods from "R: Efficiently subsetting dataframe based on time of day" and "data.table time subset vs xts time subset" to write a faster function with similar syntax:

cut_time_of_day <- function(x, t_str_begin, t_str_end){

    # Convert a time string such as "09:00:00" to seconds since midnight
    tstr_to_sec <- function(t_str){
        as.numeric(as.POSIXct(paste("1970-01-01", t_str), tz = "UTC")) %% (24*60*60)
    }

    # POSIX time ignores leap seconds
    # sec_of_day = as.numeric(index(x)) %% (24*60*60)   # works for GMT only
    # Seconds since midnight in the index's own time zone (handles tzone)
    sec_of_day = {lt = as.POSIXlt(index(x)); lt$hour*60*60 + lt$min*60 + lt$sec}
    sec_begin  = tstr_to_sec(t_str_begin)
    sec_end    = tstr_to_sec(t_str_end)

    # Keep rows whose time of day falls in [begin, end], both ends inclusive
    return(x[sec_of_day >= sec_begin & sec_of_day <= sec_end, ])
}

Test:

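library(xts)

# Hourly series of n observations, displayed in the CET time zone
n = 100000
dtime <- seq(ISOdate(2001,1,1), by = 60*60, length.out = n)
attributes(dtime)$tzone <- "CET"
x = xts((1:n), order.by = dtime)

y2 <- cut_time_of_day(x,"07:00:00", "09:00:00")   # custom function
y1 <- x["T07:00:00/T09:00:00"]                    # built-in xts subsetting

# Check that both approaches return the same result
identical(y1,y2)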
Joshua Ulrich
user3226167
  • Thanks for the demonstration of how much faster xts' time-of-day subsetting can be! I've [created an issue](https://github.com/joshuaulrich/xts/issues/193) to work on improving it. – Joshua Ulrich Jun 10 '17 at 16:26
  • for those who are wondering ... the custom function above is still about 1000x faster than the internal xts function. – ricardo Mar 30 '18 at 06:50
  • @ricardo: I get results that are hundreds of times faster, but not close to 1000x. Can you share your benchmark and `sessionInfo()` output? Feel free to email it to me. – Joshua Ulrich May 29 '18 at 18:23
  • In case anyone is still wondering about the performance of both: I can confirm that `xts` subsetting by time of day is about as fast as the custom function as of `0.13.0`. Running `microbenchmark` 100 times on both (custom function vs xts subsetting) yields very close results: mean 46.288 ms vs 48.521 ms, max 57.142 ms vs 185 ms. – stucash Feb 27 '23 at 16:06
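
The timings discussed in these comments can be checked with a sketch along the following lines. It assumes the xts and microbenchmark packages are installed, restates `cut_time_of_day()` in compact form so the block runs on its own, and uses `times = 10` (an arbitrary choice to keep the run short on older xts versions). Actual numbers will depend on the machine and the xts version.

```r
library(xts)
library(microbenchmark)

# Compact restatement of cut_time_of_day() from the answer above
cut_time_of_day <- function(x, t_str_begin, t_str_end) {
  to_sec <- function(s) {
    as.numeric(as.POSIXct(paste("1970-01-01", s), tz = "UTC")) %% 86400
  }
  lt <- as.POSIXlt(index(x))
  sec_of_day <- lt$hour * 3600 + lt$min * 60 + lt$sec
  x[sec_of_day >= to_sec(t_str_begin) & sec_of_day <= to_sec(t_str_end), ]
}

# Hourly test series, same shape as in the answer's test
n <- 100000
dtime <- seq(ISOdate(2001, 1, 1), by = 60 * 60, length.out = n)
attr(dtime, "tzone") <- "CET"
x <- xts(1:n, order.by = dtime)

# Time both approaches on the same window
bm <- microbenchmark(
  custom = cut_time_of_day(x, "07:00:00", "09:00:00"),
  xts    = x["T07:00:00/T09:00:00"],
  times  = 10
)
print(bm)
```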