I'm trying to match two dataframes by the strings they have in common. This is the code for example:
import pandas as pd
import numpy as np
A = ['DF-PI-05', 'DF-PI-09', 'DF-PI-10', 'DF-PI-15', 'DF-PI-16',
'DF-PI-19', 'DF-PI-89', 'DF-PI-92', 'DF-PI-93', 'DF-PI-94',
'DF-PI-95', 'DF-PI-96', 'DF-PI-25', 'DF-PI-29', 'DF-PI-30',
'DF-PI-34', 'DF-PI-84']
B = ['PI-05', 'PI-10', 'PI-89', 'PI-90', 'PI-93', 'PI-94', 'PI-95',
'PI-96', 'PI-09', 'PI-15', 'PI-16', 'PI-19', 'PI-91A', 'PI-91b',
'PI-92', 'PI-25-CU', 'PI-29', 'PI-30', 'PI-34', 'PI-84-CU-S1',
'PI-84-CU-S2']
A = pd.DataFrame(A,columns=['ID'])
B = pd.DataFrame(B,columns=['Name'])
B['param_2'] = np.linspace(20,300,21)
A['param_1'] = np.linspace(0,20,17)
I would like to have ONE pandas dataframe with the shape (21,3) where the columns are ['Name','param_1',param_2]. where if the column 'ID' from pandas dataframe B matches the column 'Name' from pandas dataframe A, I get the values from the other columns in A and B, if the 'Name' doesn't match, I have NaN.
I have tried the code below but doesn't work.
B['param_1'] = np.nan
for date_B, row_B in B.iterrows():
for date_A, row_A in A.iterrows():
if row_B['Name'] in row_A['ID']:
row_B['param_1'] = row_A['param_1']
break
I also would prefer some solution without for loops, because I have a large dataset, perhaps with pandas.isin() function, I have tried but didn't manage to get the right result. Something like in this example:
Join dataframes based on partial string-match between columns