I have some animal-tracking data that I would like to transform so that I can fit a model. The transmitters send a position every hour, but sometimes there is no data for several hours. The model I will use afterwards assumes that the points are equally spaced in time.
I would like to start from the first hour of the data and select the value 4 hours later, and so on. If there is no value at that time: 1) check whether there is a value one (or two) hours earlier or later and take it (to avoid too many NAs, see below). If there is still nothing: 2) insert a row at the required time with NA for the covariate, then restart from that point, repeating until all 5000 rows have been checked.
As I don't know the time step used by the animals, I would like to be able to try different values (for example, every 6 hours ± 2 h).
Here is a sample of my data:
tab <- data.frame(Date = c("2015-04-27 02:28:00","2015-04-27 03:11:00","2015-04-27 05:16:00","2015-04-27 09:22:00","2015-04-27 10:10:00","2015-04-27 16:14:00","2015-04-27 17:29:00"),
ID = c("DD1","DD1","DD1","DD1","DD1","DD1","DD1"),
covar= c(1,2,3,4,5,6,7))
>tab
Date ID covar
2015-04-27 02:28:00 DD1 1
2015-04-27 03:11:00 DD1 2
2015-04-27 05:16:00 DD1 3
2015-04-27 09:22:00 DD1 4
2015-04-27 10:10:00 DD1 5
2015-04-27 16:14:00 DD1 6
2015-04-27 17:29:00 DD1 7
And here is what I would like to obtain :
>regTab4h
Date ID covar
2015-04-27 02:28:00 DD1 1 # Keep this one as it is the first
# Drop this one (less than 4h interval)
2015-04-27 05:16:00 DD1 3 # Target time would be 06:00, but 05:00 is within the ±1 h window
2015-04-27 09:22:00 DD1 4 # 5 + 4 = 9 so this is perfect keep this one
# No longer needed
2015-04-27 13:00:00 NA NA # Create a NA here because there is no value at the time 13:00:00 + or - 1h
# Drop 16:00:00 because 13 + 4 = 17 and there is a value at 17:00
2015-04-27 17:29:00 DD1 7
I tried to follow the method in this post: "Creating regular 15-minute time-series from irregular time-series". However, it does not produce NA values when there is a large gap.
To simplify the data, I round the timestamps to the nearest hour and convert them into an xts object.
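library(xts)
# round() on a date-time returns POSIXlt, so convert back to POSIXct before storing it
tab$Date <- as.POSIXct(round(as.POSIXct(as.character(tab$Date)), units = "hours"))
x <- xts(tab[, -1], order.by = tab$Date)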
And then I unsuccessfully tried :
library(hydroTSM)
x2=to.period(x,period = "hours", k=4, OHLC = FALSE)
x3=as.zoo(x2)
x4=izoo2rzoo(x3, from= start(x), to= end(x), date.fmt= "%Y-%m-%d %H:%M:%S", tstep= "hours",k=4)
Any help would be much appreciated!
I hope I haven't missed an obvious way to do this. I suppose that a loop could do the job, but I have no idea how to formulate it.
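For reference, the rule described above could be written as a loop roughly like the one below. This is only a minimal sketch in base R under several assumptions: `regularise()` and its arguments `step`, `tol` and `tz` are hypothetical names (not from any package), the data hold a single animal sorted by time, matching is done on timestamps rounded to the nearest hour, and the tolerance is smaller than the step.

```r
# Sample data from the question
tab <- data.frame(
  Date  = c("2015-04-27 02:28:00", "2015-04-27 03:11:00", "2015-04-27 05:16:00",
            "2015-04-27 09:22:00", "2015-04-27 10:10:00", "2015-04-27 16:14:00",
            "2015-04-27 17:29:00"),
  ID    = rep("DD1", 7),
  covar = 1:7 + 0
)

# Hypothetical helper: step and tol are in hours; tz is an assumption (UTC).
regularise <- function(df, step = 4, tol = 1, tz = "UTC") {
  stopifnot(tol < step)                        # guarantees the loop moves forward
  obs <- as.POSIXct(as.character(df$Date), tz = tz)
  hrs <- as.POSIXct(round(obs, units = "hours"))  # match on rounded hours
  step_s <- step * 3600
  tol_s  <- tol * 3600

  idx   <- 1L              # rows kept so far (NA marks an inserted gap)
  times <- hrs[1]          # reference time of each output row
  cur   <- hrs[1]          # always keep the first fix
  last  <- max(hrs)

  # Continue while a fix could still fall inside the next window
  while (cur + step_s <= last + tol_s) {
    target <- cur + step_s
    d <- abs(as.numeric(difftime(hrs, target, units = "secs")))
    if (min(d) <= tol_s) {
      i   <- which.min(d)  # closest fix inside target +/- tol
      idx <- c(idx, i)
      cur <- hrs[i]        # restart from the fix that was kept
    } else {
      idx <- c(idx, NA_integer_)
      cur <- target        # restart from the empty slot
    }
    times <- c(times, cur)
  }

  res <- df[idx, , drop = FALSE]   # an NA index yields a row of NAs
  rownames(res) <- NULL
  out_time <- obs[idx]             # original timestamps for kept fixes
  gap <- is.na(idx)
  out_time[gap] <- times[gap]      # target time for inserted gaps
  res$Date <- out_time
  res
}

regTab4h <- regularise(tab, step = 4, tol = 1)
print(regTab4h)
```

On the sample data this keeps the rows with covar 1, 3, 4 and 7 and inserts one NA row at 13:00, as in the desired table. Other settings can be tried with, for example, `regularise(tab, step = 6, tol = 2)`. With several animals the function would have to be applied per ID, and growing vectors inside the loop is acceptable for about 5000 rows but would need preallocation for much larger data.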