I'm currently trying to write a function that returns the last date in an ordered list that is less than date X.
Right now I have this: it takes a list of dates, the index of the date we are searching from, and a range saying how far back we want to go.
It then checks whether the resulting date exists (e.g. Feb 30th does not). If it doesn't, it moves the date back by one day and applies the offset again (otherwise it tries to subtract 1 day from NA and fails).
library(lubridate)
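getDate <- function(dates, day, range) {
  # "single" just means the previous entry in the list
  if (range == "single") {
    return(day - 1)
  }
  # Period to step back from dates[day]
  z <- switch(range,
    single = days(1),
    month = days(30),
    month3 = months(3),
    month6 = months(6),
    year = years(1)
  )
  new_day <- dates[day] - z
  # Subtracting months or years can land on a nonexistent date (e.g. Feb 30th),
  # which gives NA; step back one more day at a time until the result is valid
  i <- 1
  while (is.na(new_day)) {
    new_day <- dates[day] - days(i) - z
    i <- i + 1
  }
  # Find the date nearest to new_day; if it falls after new_day, move back one
  diff <- new_day - dates
  ind <- which.min(abs(diff))
  if (diff[ind] < 0) {
    ind <- ind - 1
  }
  return(ind[1])
}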
While this function works, the problem is its speed. I have a feeling that which.min(abs()) is far from the quickest approach, since it scans the whole list on every call, and I'm wondering if there are any better alternatives (other than writing my own function to search lists).
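One base R candidate for this kind of lookup is findInterval(), which binary-searches a sorted vector and returns the index of the last element that is less than or equal to the target (0 if there is none). The following is only a minimal sketch: the helper name lastIndexOnOrBefore is hypothetical, the dates are rebuilt as plain Date objects rather than POSIXlt, and it assumes the target date has already been computed.

```r
# Hypothetical helper (not part of the original code): index of the last entry
# in a sorted date vector that is <= target, or 0 if every entry is later.
lastIndexOnOrBefore <- function(dates, target) {
  # findInterval() uses binary search, so `dates` must be sorted ascending
  findInterval(as.numeric(target), as.numeric(dates))
}

# The 25 trading days from the sample data, rebuilt as base Date objects
dates <- as.Date("2008-01-01") +
  c(1, 2, 3, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17,
    21, 22, 23, 24, 27, 28, 29, 30, 31, 34, 35, 36)

lastIndexOnOrBefore(dates, dates[21] - 30)  # target 2008-01-01, gives 0
lastIndexOnOrBefore(dates, dates[22] - 30)  # target 2008-01-02, gives 1
```

This matches the "on or before" behaviour of getDate. If a strictly earlier date is wanted, passing left.open = TRUE to findInterval() should exclude exact matches.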
stocks <- list(structure(list(sec = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), min = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), hour = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), mday = c(2L, 3L, 4L, 7L, 8L, 9L, 10L, 11L, 14L, 15L, 16L, 17L,
18L, 22L, 23L, 24L, 25L, 28L, 29L, 30L, 31L, 1L, 4L, 5L, 6L), mon = c(0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
1L, 1L, 1L), year = c(108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L,
108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L, 108L,
108L, 108L, 108L), wday = c(3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L,
2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L), yday = c(1L, 2L, 3L, 6L, 7L,
8L, 9L, 10L, 13L, 14L, 15L, 16L, 17L, 21L, 22L, 23L, 24L, 27L, 28L, 29L, 30L,
31L, 34L, 35L, 36L), isdst = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("sec", "min",
"hour", "mday", "mon", "year", "wday", "yday", "isdst"), tzone = "UTC",
class = c("POSIXlt", "POSIXt")))
old_pos <- getDate(stocks[[1]],21,"month") #should return 0
old_pos <- getDate(stocks[[1]],22,"month") #should return 1
Note that getDate returns neither a vector nor a date, only an index. The main question isn't about getting it to work (it does), but about optimizing it.
The value is later used in another function. One possible speed-up is to compute all of the old indexes up front and return them as another list, but I'm not sure whether that would actually be any faster.
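As a rough sketch of that precomputation idea, the old index for every position can be computed in one vectorised call. This assumes a fixed offset in days (like the "month" range), plain Date objects, and a sorted vector; the helper name allOldIndexes is made up for illustration. Month and year offsets would need the shifted dates computed first.

```r
# Same 25 trading days as the sample data, rebuilt as base Date objects
dates <- as.Date("2008-01-01") +
  c(1, 2, 3, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17,
    21, 22, 23, 24, 27, 28, 29, 30, 31, 34, 35, 36)

# Hypothetical helper: for every position in `dates`, the index of the last
# entry on or before (that date minus back_days); 0 means no such entry.
allOldIndexes <- function(dates, back_days) {
  d <- as.numeric(dates)          # days since the epoch
  findInterval(d - back_days, d)  # one binary search per element of `dates`
}

old_pos <- allOldIndexes(dates, 30)
old_pos[21]  # 0, as in the single-lookup example
old_pos[22]  # 1
```

Whether this is faster overall depends on how many of the positions are actually looked up later; it only pays off if most of them are.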