-3

This question has already been asked before here. However, the situation presented in the question and the proposed solution only work if the dataframes consist of just two variables, one of which is the timestamp.

I am surprised because this solution is not at all generalizable and it only tackles one very specific case.

So if we had a dataframe df1 that looks like this:

Timestamp       Var1 Var2 ... Var850
01-01-20 10:47  7    8        5
01-01-20 11:50  6    4        3

And a dataframe df2 that looks like this:

Timestamp       Var851 Var852 ... Var2992
01-01-20 10:55  4    1            1
01-01-20 12:08  3    4            6

How would we merge them based on the closest timestamp?

J. Doe
  • 1,544
  • 1
  • 11
  • 26
  • There isn't any reason why the answer you found wouldn't work in your case. – BrianLang Sep 09 '20 at 06:56
  • I think the answer there should work for your case as well. You just don't assign `val2 :=`. Provided your `Timestamp` is of class `POSIXct`. Did you try that? What output did you get? – Ronak Shah Sep 09 '20 at 07:02

1 Answers1

2

Enter the world of data.table's rolling joins

sample data

#or use
#    setDT(df1); setDT(df2)
#to convert existing data.frame df1 and df2 to data.table


library( data.table)
df1 <- data.table::fread("Timestamp       Var1 Var2 
01-01-20T10:47  7    8        
01-01-20T11:50  6    4")        

df2 <- data.table::fread("Timestamp       Var851 Var852
01-01-20T10:55  4    1
01-01-20T12:08  3    4")

#timestamps/dates have to be of posix- or date-class to be able 
#to roll-join them
df1[, Timestamp := as.POSIXct( Timestamp, format = "%d-%m-%yT%H:%M")]
df2[, Timestamp := as.POSIXct( Timestamp, format = "%d-%m-%yT%H:%M")]

code

df2[df1, roll = "nearest", on = .(Timestamp)]

#              Timestamp Var851 Var852 Var1 Var2
# 1: 2020-01-01 10:47:00      4      1    7    8
# 2: 2020-01-01 11:50:00      3      4    6    4
Wimpel
  • 26,031
  • 1
  • 20
  • 37