0

I am trying to merge two datasets which have irregular timestamp frequencies. I have followed an example in another post to try make this work but still it wont merge. The example I have tried to follow is adding data from an irregular time series to a timeseries with 5-min timesteps.

Data set 1

Words <- as.character(c("2016-08-30 15:04:51.97", "2016-08-30 15:04:53.70",
                           "2016-08-30 15:04:54.26", "2016-08-30 15:04:56.00",
                           "2016-08-30 15:04:56.55", "2016-08-30 15:04:58.29",
                           "2016-08-30 15:04:58.85", "2016-08-30 15:05:00.59",
                           "2016-08-30 15:05:01.15", "2016-08-30 15:05:02.89",
                           "2016-08-30 15:05:03.45", "2016-08-30 15:05:05.19",
                           "2016-08-30 15:05:05.75", "2016-08-30 15:05:07.49",
                           "2016-08-30 15:05:08.04"))
    op <- options(digits.secs = 2)
    op
    Date <- as.POSIXct(Words)
    TagID <- rep(2297.2, 15)
    Xaxis <- as.numeric(c(13.738267, 13.76611, 13.728986, 13.70624, 13.722799, 
                          13.696131, 13.707635, 13.683349, 13.688462, 13.690102,
                          13.67994, 13.680669, 13.684442, 13.676477, 13.678154))
    Yaxis <- as.numeric(c(14.670887, 14.630401, 14.684383, 14.68586, 14.69338,
                          14.686517, 14.694365, 14.677797, 14.681285, 14.687439, 
                          14.675471, 14.678207, 14.681899, 14.674103, 14.675745))
    Zaxis <- as.numeric(c(10.106183, 10.198599, 10.075378, 10.057535, 10.054841,
                          10.049604, 10.042946, 10.057003, 10.054044, 10.043906, 
                          10.058976, 10.054471, 10.050245, 10.059166, 10.057288))
    Data1 <- data.frame(Date, TagID, Xaxis, Yaxis, Zaxis)

Dataset 2

Words2 <- as.character(c("2016-08-30 15:05:01.55", "2016-08-30 15:10:01.56"))
Date <- as.POSIXct(Words2)
Speed <- c(0.385031168, 0.389179907)
Direction <- c(239.5721794,229.063366)
Data2 <- data.frame(Date, Speed, Direction)

The merged datasets should look like this:

#    Date                     TagID   Xaxis     Yaxis     Zaxis      Speed     Direction
    # 1: 2016-08-30 15:04:51.97   2297.2  13.73827  14.67089  10.10618     NA        NA
    # 2: 2016-08-30 15:04:53.70   2297.2  13.76611  14.63040  10.19860     NA        NA
    # 3: 2016-08-30 15:04:54.25   2297.2  13.72899  14.68438  10.07538     NA        NA
    # 4: 2016-08-30 15:04:56.00   2297.2  13.70624  14.68586  10.05753     NA        NA
    # 5: 2016-08-30 15:04:56.54   2297.2  13.72280  14.69338  10.05484     NA        NA
    # 6: 2016-08-30 15:04:58.28   2297.2  13.69613  14.68652  10.04960     NA        NA
    # 7: 2016-08-30 15:04:58.84   2297.2  13.70763  14.69436  10.04295     NA        NA
    # 8: 2016-08-30 15:05:00.58   2297.2  13.68335  14.67780  10.05700     NA        NA
    # 9: 2016-08-30 15:05:01.15   2297.2  13.68846  14.68129  10.05404  0.385031  239.5722    
    # 10: 2016-08-30 15:05:02.89  2297.2  13.69010  14.68744  10.04391     NA        NA
    # 11: 2016-08-30 15:05:03.45  2297.2  13.67994  14.67547  10.05898     NA        NA
    # 12: 2016-08-30 15:05:05.19  2297.2  13.68067  14.67821  10.05447     NA        NA
    # 13: 2016-08-30 15:05:05.75  2297.2  13.68444  14.68190  10.05025     NA        NA
    # 14: 2016-08-30 15:05:07.49  2297.2  13.67648  14.67410  10.05917     NA        NA
    # 15: 2016-08-30 15:05:08.03  2297.2  13.67815  14.67574  10.05729     NA        NA

Convert the dataframes to datatables and merge them together:

#Merge datasets
library(data.table)
Data1.dt <- data.table(Data1, key="Date")[,Date2:=Date]
Data2.dt <- data.table(Data2)
NewData <- Data1.dt[Data2.dt, list(Date=Date2, Speed, Direction), roll=-Inf][
  Data1.dt, list(Date, TagID, Xaxis, Yaxis, Zaxis, Speed, Direction)]
#does not work - error message but this worked in the other example

#Try again
Data1.dt <- data.table(Data1, key="Date")
Data2.dt <- data.table(Data2)
NewData2 <- Data1.dt[Data2.dt, on="Date", list(Date, Speed, Direction), roll=-Inf][
  Data1.dt, list(Date, TagID, Xaxis, Yaxis, Zaxis, Speed, Direction)]
#Merges but does not carry the data with it

What am I missing to make it merge the two datasets and carry the data with it? Note that the person that posted the original example had a similar problem but was down to a version issue - I am using a later version so it shouldn't be a problem.

R version 3.2.4 Revised (2016-03-16 r70336) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

data.table version 1.10.4

Thanks in advance for your help.

Community
  • 1
  • 1
Zoe
  • 1
  • 2
  • You probably want to review the guidance here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 Your example is not reproducible nor (I guess) minimal. – Frank Apr 05 '17 at 17:04
  • 1
    @Frank - thanks, I have edited the post to improve it & provide data to work with. – Zoe Apr 05 '17 at 18:07
  • Ok, thanks. To hit row 9, you can apply setDT to the tables and look at `Data1[Data2, on=.(Date), roll=Inf, which=TRUE]`. This shows that row 15 would be matched by the second row of Data2, which makes sense since it's rolling down to a lower value. To limit the range of a roll, use some finite value instead of Inf, like `Data1[Data2, on=.(Date), roll=5, \`:=\`(Speed = i.Speed, Direction = i.Direction)][]` Not sure if that answers it. Note that this add cols to Data1 instead of making a new table. – Frank Apr 05 '17 at 18:23
  • 1
    Does your data have any sort of regular interval at which data should have been collected? Matching is going to be hard unless you have a pattern to rely on. (Should 5.5 seconds be matched with 6 seconds or 5 seconds?) – Bishops_Guest Apr 05 '17 at 18:25
  • @Frank - that works. Thank you so much, this has been puzzling me for too long. I will apply it the larger dataset and see that it holds true. – Zoe Apr 05 '17 at 19:00
  • Cool. You can post what you end up using as an answer (though there may be some mandatory delay before doing so). – Frank Apr 05 '17 at 19:01
  • @Bishops_Guest - there are regular intervals but they were hard to specify with the decimal second and rounding issues. The above adjustment seems to work on a nearest value basis and I'm happy with that. Thanks though. – Zoe Apr 05 '17 at 19:04

0 Answers0