I have some animal-tracking data that I would like to transform so that I can fit a model. The transmitters send a position every hour, but sometimes there is no data for several hours. The model I will use afterwards assumes that the points are equally spaced in time.

I would like to find a way to start from the first hour of the data, select the value 4 hours later, and so on. If there is no value at that time: 1) check whether there is a value one (or two) hours earlier or later and take it (to avoid too many NAs, see below). If there is still nothing: 2) insert a row at the required time with NA for the covariate, then continue from that point, repeating until all 5000 rows have been checked.

As I don't know the time step used by the animals, I would like to be able to try different values (for example, every 6 hours plus or minus 2 h).

Here is a sample of my data :

tab <- data.frame(Date = c("2015-04-27 02:28:00","2015-04-27 03:11:00","2015-04-27 05:16:00","2015-04-27 09:22:00","2015-04-27 10:10:00","2015-04-27 16:14:00","2015-04-27 17:29:00"),
                  ID = c("DD1","DD1","DD1","DD1","DD1","DD1","DD1"), 
                  covar= c(1,2,3,4,5,6,7))

>tab
               Date  ID covar
2015-04-27 02:28:00 DD1     1
2015-04-27 03:11:00 DD1     2
2015-04-27 05:16:00 DD1     3
2015-04-27 09:22:00 DD1     4
2015-04-27 10:10:00 DD1     5
2015-04-27 16:14:00 DD1     6
2015-04-27 17:29:00 DD1     7

And here is what I would like to obtain :

>regTab4h
                   Date  ID covar
    2015-04-27 02:28:00 DD1     1      # keep this one as it is the first
                                       # drop this one (less than 4 h after the previous point)
    2015-04-27 05:16:00 DD1     3      # the target time would be 06:00, but 05:00 is within the 1 h window
    2015-04-27 09:22:00 DD1     4      # 5 + 4 = 9, so this is perfect: keep this one
                                       # no longer needed
    2015-04-27 13:00:00 NA     NA      # create an NA row here because there is no value at 13:00:00 plus or minus 1 h
                                       # drop 16:00:00 because 13 + 4 = 17 and we have a value at 17:00
    2015-04-27 17:29:00 DD1     7

I tried to follow the method in this post: "Creating regular 15-minute time-series from irregular time-series". However, it does not allow for NA values when there is a big gap.

To simplify the data, I round the timestamps to the nearest hour and convert them to an xts object.

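library(xts)
# round() on a date-time returns POSIXlt, so convert back to POSIXct before storing it
tab$Date <- as.POSIXct(round(as.POSIXct(as.character(tab$Date)), units = "hours"))
x <- xts(tab[, -1], order.by = tab$Date)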

And then I unsuccessfully tried :

library(hydroTSM)
x2=to.period(x,period = "hours", k=4, OHLC = FALSE)
x3=as.zoo(x2)
x4=izoo2rzoo(x3, from= start(x), to= end(x), date.fmt= "%Y-%m-%d %H:%M:%S", tstep= "hours",k=4)

Any help would be much appreciated!

I hope I didn't miss an obvious way to do it... I suppose a loop could do the job, but I have no idea how to formulate it.

Jeff972
  • What is the `izoo2rzoo` function? –  Nov 09 '15 at 03:18
  • It is supposed to transform an irregular zoo object (with no values for some dates) into a regularly spaced zoo object, filling the missing dates with "NA". It is from the {hydroTSM} package. But I did not find a way to set the "k=4" value, and it does not help with taking the nearest value (plus or minus 1 h). – Jeff972 Nov 09 '15 at 03:28
  • "...it does not allow to have `NA` when there is a big gap." Yes, it does; and there's an example in the answer. In your case it would be: `merge(x, xts(character(), seq(start(x), end(x), by="hours")))`. – Joshua Ulrich Nov 09 '15 at 03:47
  • Sorry, I did not explain it correctly. Following this method, I would have to merge with another table on a 4 h time step to get what I'm looking for. If I stop at that step I have the NAs, but I can't take the nearest value (within a 1 h window around the date), so I end up with a lot of NAs. And if I go on to the next step of this method and use na.locf, I have no NAs left, and I get repeated observations when there is a gap > 8 h. So, in order to keep the NAs but not too many of them, I would like the step to be not strictly 4 hours but 4 h plus or minus 1 h, and then to count 4 hours from the point actually selected. It is not easy to explain, sorry if I am not clear. – Jeff972 Nov 09 '15 at 04:29
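
The selection rule described in the question (jump `step` hours ahead, accept the nearest fix within plus or minus `tol` hours, otherwise insert an NA row at the target time and continue from there) could be sketched as a plain loop in base R. This is only a minimal sketch under some assumptions: `regularise`, `step`, `tol` and `tz` are hypothetical names introduced here, timestamps are matched after rounding to the nearest hour (as in the question), the data hold a single animal, and the loop stops once the search window lies entirely past the last fix.

```r
# Sample data from the question
tab <- data.frame(Date = c("2015-04-27 02:28:00", "2015-04-27 03:11:00",
                           "2015-04-27 05:16:00", "2015-04-27 09:22:00",
                           "2015-04-27 10:10:00", "2015-04-27 16:14:00",
                           "2015-04-27 17:29:00"),
                  ID = rep("DD1", 7),
                  covar = c(1, 2, 3, 4, 5, 6, 7))

# Hypothetical helper (not from any package): greedy thinning of one track.
# step and tol are in hours; tz = "UTC" is an assumption, adjust as needed.
regularise <- function(tab, step = 4, tol = 1, tz = "UTC") {
  t_raw <- as.POSIXct(as.character(tab$Date), tz = tz)
  ord   <- order(t_raw)
  tab   <- tab[ord, , drop = FALSE]
  t_raw <- t_raw[ord]
  # Times used for matching: rounded to the nearest hour, stored as seconds
  th   <- as.numeric(as.POSIXct(round(t_raw, units = "hours")))
  sec  <- 3600
  last <- th[length(th)]

  idx  <- 1L                    # rows to keep (NA marks an inserted gap row)
  when <- as.numeric(t_raw[1])  # timestamp reported for each output row
  cur  <- th[1]                 # time from which the next step is counted

  repeat {
    target <- cur + step * sec
    if (target - tol * sec > last) break   # window is past the last fix
    d    <- abs(th - target)
    cand <- which(d <= tol * sec & th > cur)
    if (length(cand) > 0) {
      j    <- cand[which.min(d[cand])]     # nearest fix inside the window
      idx  <- c(idx, j)
      when <- c(when, as.numeric(t_raw[j]))
      cur  <- th[j]                        # restart from the selected fix
    } else {
      idx  <- c(idx, NA)                   # no fix: NA row at the target time
      when <- c(when, target)
      cur  <- target
    }
  }

  res <- tab[idx, , drop = FALSE]          # NA index yields an all-NA row
  res$Date <- as.POSIXct(when, origin = "1970-01-01", tz = tz)
  rownames(res) <- NULL
  res
}

regTab4h <- regularise(tab, step = 4, tol = 1)
```

On the sample data this keeps the rows with covar 1, 3, 4 and 7 and inserts one NA row at 13:00. Other spacings can be tried by changing the arguments, for example `regularise(tab, step = 6, tol = 2)`. For several animals the function would have to be applied per `ID`, which is not shown here.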

0 Answers