I have two dataframes, the first one:
KEY DATE VALUE
A 01/01/2020 1
A 05/01/2020 3
B 01/02/2020 1
C 05/02/2020 1
The second dataframe:
KEY START_DATE END_DATE VALUE2
A 01/01/2020 03/01/2020 3
B 01/03/2020 06/03/2020 4
C 01/02/2020 07/02/2020 5
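I'd like to merge these two on KEY, with the additional condition that DATE from the first dataframe falls within the window from START_DATE to END_DATE (inclusive) in the second dataframe. Rows from the first dataframe that have no matching window should be kept, with the columns from the second dataframe left empty.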
Expected result:
KEY DATE START_DATE END_DATE VALUE VALUE2
A 01/01/2020 01/01/2020 03/01/2020 1 3
A 05/01/2020 NaT NaT 3 NaN
B 01/02/2020 NaT NaT 1 NaN
C 05/02/2020 01/02/2020 07/02/2020 1 5
I tried the following, but it can be very memory-intensive when a KEY has multiple entries in the second dataframe, because the merge first creates one row for every pair of rows that share a KEY:
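# Pair every df1 row with every df2 row that has the same KEY
df = df1.merge(df2, on="KEY", how="left")
# Keep only the pairs whose DATE lies inside [START_DATE, END_DATE]
idx = (df.DATE >= df.START_DATE) & (df.DATE <= df.END_DATE)
# Note: this filter also drops df1 rows that have no matching window
df = df[idx]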
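
For reference, here is a minimal sketch of one direction that avoids the large intermediate frame, using `pd.merge_asof`. It rests on assumptions that are not stated above: the dates are day-first (dd/mm/yyyy), and the windows for a given KEY never overlap, so each DATE can fall into at most one window. The function name `merge_in_window` is made up for illustration.

```python
import pandas as pd

# Sample data from the question; day-first parsing is an assumption.
df1 = pd.DataFrame({
    "KEY": ["A", "A", "B", "C"],
    "DATE": pd.to_datetime(
        ["01/01/2020", "05/01/2020", "01/02/2020", "05/02/2020"], dayfirst=True
    ),
    "VALUE": [1, 3, 1, 1],
})
df2 = pd.DataFrame({
    "KEY": ["A", "B", "C"],
    "START_DATE": pd.to_datetime(
        ["01/01/2020", "01/03/2020", "01/02/2020"], dayfirst=True
    ),
    "END_DATE": pd.to_datetime(
        ["03/01/2020", "06/03/2020", "07/02/2020"], dayfirst=True
    ),
    "VALUE2": [3, 4, 5],
})


def merge_in_window(left, right):
    """Hypothetical helper: left-join on KEY where DATE is in [START_DATE, END_DATE].

    Assumes windows for the same KEY do not overlap.
    """
    # merge_asof requires both frames to be sorted by their "on" columns.
    left_sorted = left.sort_values("DATE")
    right_sorted = right.sort_values("START_DATE")

    # For each left row, pick the latest window (same KEY) that started
    # on or before DATE. This yields exactly one output row per left row.
    out = pd.merge_asof(
        left_sorted,
        right_sorted,
        left_on="DATE",
        right_on="START_DATE",
        by="KEY",
        direction="backward",
    )

    # Blank out the right-hand columns where DATE is past END_DATE
    # (rows with no candidate window are already NaT/NaN).
    inside = out["DATE"] <= out["END_DATE"]
    for col in ["START_DATE", "END_DATE", "VALUE2"]:
        out[col] = out[col].where(inside)

    return out.sort_values(["KEY", "DATE"]).reset_index(drop=True)


result = merge_in_window(df1, df2)
print(result)
```

Because `merge_asof` returns one row per row of the first dataframe, the result never grows beyond `len(df1)`. If windows for the same KEY can overlap, this sketch would only consider the most recently started one, so it would not be a drop-in replacement in that case.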