3

Is it possible to subset an xts series by both a date range and a time-of-day range in one go? E.g., in the series below I want to pick only the rows between 06:30 and 06:50 that also fall on the 1st through 4th of a month (or better yet, on the first 4 data dates of a month, but that's an unrelated question).

spy[,ohlcv]
                         open    high     low   close  volume
2016-05-19 06:30:00   204.030 204.300 203.900 204.100  537530
2016-05-19 06:35:00   204.100 204.340 204.010 204.240  482436
2016-05-19 06:40:00   204.250 204.540 204.240 204.530  441800
...
2016-05-20 06:30:00   204.960 205.250 204.860 205.170  564441
2016-05-20 06:35:00   205.170 205.410 205.170 205.250  593626
2016-05-20 06:40:00   205.260 205.440 205.240 205.350  342840
...

I saw some answers on range selection here and here, which are very helpful, but they do not show how to set multiple concurrent constraints on the index, which would be far more readable. Currently I am managing this in longhand:

# time of day as "HH:MM" strings, compared lexically
temp1 <- format(index(spy), "%H:%M")
set1 <- temp1 >= "06:30" & temp1 <= "06:50"
# day of month as a zero-padded string
set2 <- format(index(spy), "%d") < "05"
spy[set1 & set2, ]
Dinesh
  • I suspect something like `spy[.indexmon(spy) %in% c(0,1,2,3) & .indexhour(spy) == 6 & .indexmin(spy) %in% (30:50)]` should work for you. – Mike H. Aug 22 '16 at 20:45
  • yep! I just changed .indexmon() to .indexmday() and it works, thanks – Dinesh Aug 22 '16 at 23:59

2 Answers

2

There's no way to combine date-range and time-of-day subsetting in a single xts index string. One thing you can do is use split to break the data into monthly chunks, then use first to get the first 4 days of observations in each month. Then you can use time-of-day subsetting to get a specific time interval.

require(xts)
times <- seq(as.POSIXct("2016-05-19 04:30:00"),
             as.POSIXct("2016-05-20 07:40:00"), by="5 min")
set.seed(21)
x <- xts(rnorm(length(times)), times)
# get the first 4 days' observations for each month
y <- do.call(rbind, lapply(split(x, "months"), first, n="4 days"))
# subset by time interval
y["T06:30:00/T06:50:00"]
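If calendar days 1 through 4 are acceptable in place of the first 4 data dates, the two filters can also be chained in one expression. The following is only a sketch under that assumption: the sample series, the UTC timezone, and the names `x2` and `z` are made up for illustration and are not part of the original answer.

```r
library(xts)

# Hypothetical 5-minute series spanning parts of three months (UTC avoids DST gaps)
times <- seq(as.POSIXct("2016-04-28 00:00:00", tz = "UTC"),
             as.POSIXct("2016-06-10 23:55:00", tz = "UTC"), by = "5 min")
set.seed(21)
x2 <- xts(rnorm(length(times)), times)

# keep calendar days 1-4 of every month, then apply time-of-day subsetting
z <- x2[.indexmday(x2) %in% 1:4]["T06:30/T06:50"]

# distinct dates that survived the filter
unique(format(index(z), "%Y-%m-%d"))
```

Note that this selects by calendar day, so a month whose first trading day is the 3rd contributes only two days, whereas the split/first approach above returns the first 4 days that actually have data.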
Joshua Ulrich
  • this is surely more elegant than the complicated set algebra I got, albeit at the cost of some memory. Is it possible to extend xts indexing? – Dinesh Aug 23 '16 at 23:37
  • @Dinesh: It's certainly possible. xts was designed as an eXtensible Time Series. Create a subclass and an indexing method that does what you want. – Joshua Ulrich Aug 24 '16 at 00:29
1

Per my comment before, you could try something like:

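spy[.indexmday(spy) %in% 1:4 & .indexhour(spy) == 6 & .indexmin(spy) %in% 30:50]

Basically we are just subsetting to the days of the month, hours, and minutes we want. Note that .indexmday() is 1-based (1 to 31), unlike .indexmon(), which is 0-based (0 to 11).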
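For a window that crosses an hour boundary, such as 06:30 to 07:15, separate hour and minute tests get awkward. One possible workaround is to convert each timestamp to minutes since midnight and compare against that. This is a sketch only: the sample series, the UTC timezone, and the names `x3`, `mins`, `in_days`, `in_window`, and `w` are assumptions made for illustration.

```r
library(xts)

# Hypothetical 5-minute series spanning parts of three months (UTC avoids DST gaps)
times <- seq(as.POSIXct("2016-04-28 00:00:00", tz = "UTC"),
             as.POSIXct("2016-06-10 23:55:00", tz = "UTC"), by = "5 min")
set.seed(21)
x3 <- xts(rnorm(length(times)), times)

# minutes since midnight for every observation
mins <- .indexhour(x3) * 60 + .indexmin(x3)

# calendar days 1-4 of each month, 06:30 through 07:15 inclusive
in_days   <- .indexmday(x3) %in% 1:4
in_window <- mins >= 6 * 60 + 30 & mins <= 7 * 60 + 15

w <- x3[in_days & in_window]
head(w)
```

Any window within a single day can be expressed this way by changing only the two minute bounds.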

Mike H.
  • Yes, it worked out (thanks, again!), and it's better than my original longhand, but I am going to hold out for a simpler way of doing this. Why? I fear this will get progressively harder if I had a "complex" time window of, say, 6:30 to 7:15. Could be I am missing something? – Dinesh Aug 23 '16 at 21:40