I have two datasets:
df1 and df2, where I would like the columns to merge together from df1, if the datetimes are within 20 seconds of df2
df1
Connect Ended
4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM
3/31/2020 11:00:08 AM 3/31/2020 11:00:10 AM
4/1/2020 10:00:05 PM 4/1/2020 12:00:05 PM
df2
Start End
4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM
3/31/2020 11:00:10 AM 3/31/2020 11:00:14 AM
Desired Output:
df3
Match_Start1 Match_End1 Match_Start2 Match_End2
4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM 4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM
3/31/2020 11:00:08 AM 3/31/2020 11:00:10 AM 3/31/2020 11:00:10 AM 3/31/2020 11:00:14 AM
df4 (unmatched)
Unmatched_Start Unmatched_end
4/1/2020 10:00:05 PM 4/1/2020 12:00:05 PM
Dput:
df1
:
' Connect Ended\n0 4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM\n1
3/31/2020 11:08:08 AM 3/31/2020 11:00:10 AM\n2 4/1/2020 10:00:05 PM 4/1/2020 12:00:05 PM'
df2
Out[117]:
' Start End\n0 4/6/2020 1:15:21 PM 4/6/2020 2:05:18 PM\n1
3/31/2020 11:08:08 AM 3/31/2020 11:00:14 AM\n2 NaN NaN'
What I am thinking:
df2 = pd.merge_asof(df1, df2, on="Connect", by = "Ended",
tolerance=pd.Timedelta('20s'), direction='backward')
However, how do I incorporate the condition of the 20 seconds, as well as show the unmatched dataset?
Any suggestion is appreciated