Suppose I have two data frame, df1 and df2.
df1 <- data.frame(value = 1:5, timestamp = as.POSIXct( c( "2020-03-02 12:20:00", "2020-03-02 12:20:01", "2020-03-02 12:20:03" , "2020-03-02 12:20:05", "2020-03-02 12:20:08")))
df2 <- data.frame(value = 6:10, timestamp = as.POSIXct( c( "2020-03-02 12:20:01", "2020-03-02 12:20:02", "2020-03-02 12:20:03" , "2020-03-02 12:20:04", "2020-03-02 12:20:05")))
df1
value | timestamp |
---|---|
1 | 2020-03-02 12:20:00 |
2 | 2020-03-02 12:20:01 |
3 | 2020-03-02 12:20:03 |
4 | 2020-03-02 12:20:05 |
5 | 2020-03-02 12:20:08 |
df2
value | timestamp |
---|---|
6 | 2020-03-02 12:20:01 |
7 | 2020-03-02 12:20:02 |
8 | 2020-03-02 12:20:03 |
9 | 2020-03-02 12:20:04 |
10 | 2020-03-02 12:20:05 |
Now, I want to keep df1, and left join with df2 by timestamp, since the timestamp is not exactly the same, what I want to do is:
- If there is an exact match, then just left join the value from df2
- If there is not an exact match, then try to match with the latest timestamp, and left join that value
- If there is not a match (no latest timestamp), then return NA
Therefore, my expect output would be like this
data.frame(df1, value.df2 = c(NA, 6, 8, 10, 10))
value | timestamp | value.df2 |
---|---|---|
1 | 2020-03-02 12:20:00 | NA |
2 | 2020-03-02 12:20:01 | 6 |
3 | 2020-03-02 12:20:03 | 8 |
4 | 2020-03-02 12:20:05 | 10 |
5 | 2020-03-02 12:20:08 | 10 |
I hope I could do this by tidyverse or data.table.