I am performing matching (kind of fuzzy matching) between the company names of two data frames. For doing so, first I am performing full merge between all the company names, where the starting alphabet matches. Which means all the companies starting with 'A' would be matched to all the companies starting with 'A' in other data frame. This is done as follows:
df1['df1_Start'] = df1['company1'].astype(str).str.slice(0,2)
df2['df2_Start'] = df2['company2'].astype(str).str.slice(0,2)
Merge = pd.merge(df1,df2, left_on='df1_Start',right_on='df2_Start')
Now I want to have all the rows from FullMerge where company in df1 contains the company in df2. This is because companies in df1 have elongated names.
Merge1=Merge[Merge['company1'].str.contains(Merge['company2'].str)]
This isn't working for me. How do I perform this task? Also, Please suggest what other ways can I use to match the company names. Because companies might be same in two data frames but are not written in the exactly same way.