My question is an extension of this question:
Check if value in a dataframe is between two values in another dataframe
df1
  df1_Col  df1_start
0      A1       1200
1      B2       4000
2      B2       2500
df2
  df2_Col  df2_start  df2_end           data
0      A1       1000     2000        DATA_A1
1      A1        900     1500     DATA_A1_A1
2      A1       2000     3000  DATA_A1_A1_A1
2      B1       2000     3000        DATA_B1
3      B2       2000     3000        DATA_B2
output:
  df1_Col  df1_start                data
0      A1       1200  DATA_A1;DATA_A1_A1
1      B2       4000
2      B2       2500             DATA_B2
I want to match the value of df1_Col against df2_Col and check that df1_start falls within the range from df2_start to df2_end, then add the matching values of the data column to df1. If there are multiple matches, the data values can be combined with any delimiter, such as ';'.
The code is as follows:
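results = []
for v, ch in zip(df1.df1_start, df1.df1_Col):
    # rows of df2 with the same key whose range strictly contains v
    df3 = df2[(df2['df2_start'] < v) & (df2['df2_end'] > v) & (df2['df2_Col'] == ch)]
    # combine all matches for this row; an empty match gives ''
    results.append(';'.join(df3['data']))
df1['data'] = results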
A loop is used because the file is huge.
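For comparison, the same matching rule can be sketched without an explicit Python loop: merge the two frames on the key column, keep only the pairs where df1_start lies strictly between df2_start and df2_end, and join the matching data values per df1 row. This is only a sketch under assumptions: the helper name attach_matches and the temporary _row and _pos columns are made up for illustration, and the merge builds every key-matching pair first, so it may need too much memory on a very large file.

```python
import pandas as pd

# Sample frames copied from the tables above.
df1 = pd.DataFrame({
    "df1_Col": ["A1", "B2", "B2"],
    "df1_start": [1200, 4000, 2500],
})
df2 = pd.DataFrame({
    "df2_Col": ["A1", "A1", "A1", "B1", "B2"],
    "df2_start": [1000, 900, 2000, 2000, 2000],
    "df2_end": [2000, 1500, 3000, 3000, 3000],
    "data": ["DATA_A1", "DATA_A1_A1", "DATA_A1_A1_A1", "DATA_B1", "DATA_B2"],
})


def attach_matches(df1, df2, sep=";"):
    """Hypothetical helper: add a 'data' column to a copy of df1."""
    # Positional ids (assumed column names) so results map back to
    # df1 rows and keep df2's original row order within each group.
    left = df1.assign(_row=range(len(df1)))
    right = df2.assign(_pos=range(len(df2)))

    # Every (df1 row, df2 row) pair that shares the same key.
    pairs = left.merge(right, left_on="df1_Col", right_on="df2_Col", how="inner")

    # Same strict inequalities as the loop version.
    in_range = (pairs["df2_start"] < pairs["df1_start"]) & (
        pairs["df1_start"] < pairs["df2_end"]
    )
    hits = pairs[in_range].sort_values(["_row", "_pos"])

    # One delimited string per df1 row that had at least one match.
    joined = hits.groupby("_row")["data"].agg(sep.join)

    out = df1.copy()
    # Rows without any match get an empty string.
    out["data"] = joined.reindex(range(len(df1)), fill_value="").to_numpy()
    return out


result = attach_matches(df1, df2)
print(result)
```

On the sample data this gives DATA_A1;DATA_A1_A1 for the first row, an empty string for the second, and DATA_B2 for the third, which matches the desired output.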
Looking forward to your assistance.