I have two CSV files with unevenly structured data (for example, different numbers of rows). I want to iterate over them and compare the values of the 'STag' and 'E1' columns in both files; where both values match, I want to replace the 'E1_CUI' value in 'file2_data' with the corresponding 'E1_CUI' value from 'file1_data'. My attempt raises ValueError: Can only compare identically-labeled Series objects. Please consider the example below for reference.
CODE
import pandas as pd
file1_data = {
'STag': ['Title_1', 'Title_1', 'Abs_1', 'Abs_3', 'Abs_3', 'Abs_4', 'Abs_4'],
'E1': ['pacnes', 'PI', 'acne', 'pacnes', 'pI', 'kera', 'PPI'],
'E1_CUI': ['C3477', 'C9871', 'C2166', 'C3477', 'C9871', 'C2567', 'C9871']
}
df1 = pd.DataFrame(file1_data)
df1
E1 E1_CUI STag
0 pacnes C3477 Title_1
1 PI C9871 Title_1
2 acne C2166 Abs_1
3 pacnes C3477 Abs_3
4 pI C9871 Abs_3
5 kera C2567 Abs_4
6 PPI C9871 Abs_4
file2_data = {
'STag': ['Title_1', 'Abs_1', 'Abs_3', 'Abs_4'],
'E1': ['pacnes', 'acne', 'pI', 'kera'],
'E1_CUI': [0, 0, 0, 0]
}
df2 = pd.DataFrame(file2_data)
df2
E1 E1_CUI STag
0 pacnes 0 Title_1
1 acne 0 Abs_1
2 pI 0 Abs_3
3 kera 0 Abs_4
for row,col in df1.iterrows():
    if (df1['STag'] == df2['STag'] and df1['E1'] == df2['E1']):
        df2['E1_CUI'] = df1['E1_CUI']
ERROR
ValueError: Can only compare identically-labeled Series objects
Expected Output DataFrame
E1 E1_CUI STag
0 pacnes C3477 Title_1
1 acne C2166 Abs_1
2 pI C9871 Abs_3
3 kera C2567 Abs_4
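For context, the loop above compares whole columns (df1['STag'] == df2['STag']) rather than the values of the current row, and pandas refuses to compare two Series whose indexes differ (7 rows against 4 here). Below is a minimal sketch of two possible ways to reach the expected output, assuming each ('STag', 'E1') pair occurs at most once in df1. The names lookup, new_cui, result, and merged are illustrative only and do not come from the original code.

```python
import pandas as pd

df1 = pd.DataFrame({
    'STag': ['Title_1', 'Title_1', 'Abs_1', 'Abs_3', 'Abs_3', 'Abs_4', 'Abs_4'],
    'E1': ['pacnes', 'PI', 'acne', 'pacnes', 'pI', 'kera', 'PPI'],
    'E1_CUI': ['C3477', 'C9871', 'C2166', 'C3477', 'C9871', 'C2567', 'C9871'],
})
df2 = pd.DataFrame({
    'STag': ['Title_1', 'Abs_1', 'Abs_3', 'Abs_4'],
    'E1': ['pacnes', 'acne', 'pI', 'kera'],
    'E1_CUI': [0, 0, 0, 0],
})

# Option 1 (iteration): build a lookup from (STag, E1) to E1_CUI out of df1,
# then walk df2 row by row. Rows without a match keep their current value.
# Assumption: (STag, E1) pairs are unique in df1; later duplicates would win.
lookup = {(r.STag, r.E1): r.E1_CUI for r in df1.itertuples(index=False)}
new_cui = [lookup.get((r.STag, r.E1), r.E1_CUI)
           for r in df2.itertuples(index=False)]
result = df2.assign(E1_CUI=new_cui)

# Option 2 (no explicit loop): left-merge on both key columns.
# Rows of df2 without a match in df1 would get NaN in E1_CUI.
merged = df2.drop(columns='E1_CUI').merge(df1, on=['STag', 'E1'], how='left')

print(result)
print(merged)
```

Both variants compare scalar values per row (or let merge align the keys), so they avoid comparing Series of different lengths. The merge version is usually the more idiomatic choice when the frames are large.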