I am trying to search each row of a column that holds a longer string in one dataframe (Reason in df1) for a specific string from another dataframe (InvoiceID in df2), and then merge the two dataframes so that the merged dataframe includes Amount.
Currently I do this in Excel with INDEX/MATCH and the * wildcard, as described here:
https://contexturesblog.com/archives/2012/10/09/find-text-with-index-and-match/
import pandas as pd
data1={'ID':['A','B','A','C'],'Reason':['xxxABC1','yyyBCD2','zzzXY3',"ppp123c"],'Amount':[10,20,30,10]}
df1=pd.DataFrame(data1)
df1
data2={'ID':['A','B','A','C'],'InvoiceID':['ABC1','BCD2','XY3',"123c"]}
df2=pd.DataFrame(data2)
df2
Expected result:
data3={'ID':['A','B','A','C'],'InvoiceID':['ABC1','BCD2','XY3',"123c"],'Amount':[10,20,30,10]}
df3=pd.DataFrame(data3)
df3
I used str.contains with the following code, and I do get a filtered df1:
df4=(df1[df1['Reason'].str.contains('ABC1', regex=False)])
  ID   Reason  Amount
0  A  xxxABC1      10
I am not sure how to pull Amount into df2 to create the new dataframe df3. I think some loop over the rows is necessary: iterate through each row in df2, search for its InvoiceID in each row of the Reason column in df1, and add the amount to df2 if it is found; then repeat for the next row in df2.
I could not get fuzzywuzzy to work for me, possibly because the Reason column can be very long in the actual data.
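The row-by-row idea could be sketched roughly as below. This is only an illustration under a few assumptions: `lookup_amount` is a hypothetical helper name, the match is a literal substring test (`regex=False`), and when several Reason rows contain the same invoice ID only the first one is used.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': ['A', 'B', 'A', 'C'],
    'Reason': ['xxxABC1', 'yyyBCD2', 'zzzXY3', 'ppp123c'],
    'Amount': [10, 20, 30, 10],
})
df2 = pd.DataFrame({
    'ID': ['A', 'B', 'A', 'C'],
    'InvoiceID': ['ABC1', 'BCD2', 'XY3', '123c'],
})

def lookup_amount(invoice_id):
    """Hypothetical helper: Amount of the first df1 row whose Reason contains invoice_id."""
    # Literal substring search over the whole Reason column
    hits = df1.loc[df1['Reason'].str.contains(invoice_id, regex=False), 'Amount']
    # Assumption: take the first hit; return None when nothing matches
    return hits.iloc[0] if not hits.empty else None

# Build df3 by looking up each InvoiceID from df2 in turn
df3 = df2.copy()
df3['Amount'] = [lookup_amount(inv) for inv in df2['InvoiceID']]
print(df3)
```

This scans all of df1 once per row of df2, so it may be slow on large data, and it ignores the ID column when matching.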