Suppose I have two dataframes,
df1
id time1
1 2016-04-07 21:39:10
1 2016-04-05 11:19:17
2 2016-04-03 10:58:25
2 2016-04-02 21:39:10
df2
id time2
1 2016-04-07 21:39:11
1 2016-04-05 11:19:18
1 2016-04-06 21:39:11
1 2016-04-04 11:19:18
2 2016-04-03 10:58:26
2 2016-04-02 21:39:11
2 2016-04-04 10:58:26
2 2016-04-05 21:39:11
For each entry in df1, I want to find the entry in df2 with the same id and the shortest time difference. Take the first entry of df1: it has id 1, so I want to filter df2 for id 1, compute the time difference between that df1 entry and each of the filtered df2 entries, find the shortest difference, and fetch the corresponding df2 entry. My sample output should be
id time1 time2 diff(in secs)
1 2016-04-07 21:39:10 2016-04-07 21:39:11 1
1 2016-04-05 11:19:17 2016-04-05 11:19:18 1
2 2016-04-03 10:58:25 2016-04-03 10:58:26 1
2 2016-04-02 21:39:10 2016-04-02 21:39:11 1
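The following is my attempt:
for (i in unique(df1$id)) {
  temp1 = df1[df1$id == i, ]  # rows of df1 with this id
  temp2 = df2[df2$id == i, ]  # rows of df2 with this id
  for (j in seq_along(temp1$time1)) {
    for (k in seq_along(temp2$time2)) {
      # absolute difference in seconds between one df1 time and one df2 time
      diff = abs(as.numeric(difftime(temp1$time1[j], temp2$time2[k], units = "secs")))
      print(diff)
    }
  }
}
This only prints every pairwise difference, and I am not able to progress after this. Can anybody help me correct it, or suggest a more efficient way to do this? Any help would be appreciated.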
Update:
Reproducible data:
df1 <- data.frame(
id = c(1,1,2,2),
time1 = c('2016-04-07 21:39:10', '2016-04-05 11:19:17', '2016-04-03 10:58:25', '2016-04-02 21:39:10')
)
df2 <- data.frame(
id = c(1,1,1,1,2,2,2,2),
time2 = c('2016-04-07 21:39:11', '2016-04-05 11:19:18','2016-04-07 21:39:11', '2016-04-05 11:19:18', '2016-04-03 10:58:26', '2016-04-02 21:39:11','2016-04-03 10:58:26', '2016-04-02 21:39:11')
)
df1$time1 = as.POSIXct(df1$time1)
df2$time2 = as.POSIXct(df2$time2)
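To make the goal concrete, here is a minimal base R sketch of the lookup described above, not a definitive solution. It rebuilds the reproducible data so that it runs on its own. The names `rows`, `candidates`, `secs`, `result` and the column `diff_secs` are my own choices for illustration, and `tz = "UTC"` is an assumption added so that the differences do not depend on the local time zone. It also assumes every id in df1 occurs at least once in df2.

```r
# Rebuild the sample data from the question (tz = "UTC" is an assumption).
df1 <- data.frame(
  id = c(1, 1, 2, 2),
  time1 = as.POSIXct(c("2016-04-07 21:39:10", "2016-04-05 11:19:17",
                       "2016-04-03 10:58:25", "2016-04-02 21:39:10"), tz = "UTC")
)
df2 <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2),
  time2 = as.POSIXct(c("2016-04-07 21:39:11", "2016-04-05 11:19:18",
                       "2016-04-07 21:39:11", "2016-04-05 11:19:18",
                       "2016-04-03 10:58:26", "2016-04-02 21:39:11",
                       "2016-04-03 10:58:26", "2016-04-02 21:39:11"), tz = "UTC")
)

# For each row of df1, look only at df2 rows with the same id
# and keep the one whose time is closest.
rows <- lapply(seq_len(nrow(df1)), function(i) {
  candidates <- df2$time2[df2$id == df1$id[i]]
  # absolute differences in seconds between this df1 time and each candidate
  secs <- abs(as.numeric(difftime(candidates, df1$time1[i], units = "secs")))
  k <- which.min(secs)  # index of the first smallest difference
  data.frame(id = df1$id[i], time1 = df1$time1[i],
             time2 = candidates[k], diff_secs = secs[k])
})

# Stack the one-row data frames into the final table.
result <- do.call(rbind, rows)
print(result)
```

Iterating over row indices rather than over the time values keeps the POSIXct class intact, and `which.min` replaces the innermost loop. If several df2 times are equally close, this sketch keeps the first one.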