I have two CSV files with unevenly structured data (in the example below they have different numbers of rows). I want to iterate over them, compare the 'STag' and 'E1' column values of both files, and, where both match, replace the 'E1_CUI' value in 'file2_data' with the corresponding 'E1_CUI' value from 'file1_data'. My attempt raises ValueError: Can only compare identically-labeled Series objects. Please consider the example below for reference.

CODE

import pandas as pd
file1_data = {
    'STag': ['Title_1', 'Title_1', 'Abs_1', 'Abs_3', 'Abs_3', 'Abs_4', 'Abs_4'],
    'E1': ['pacnes', 'PI', 'acne', 'pacnes', 'pI', 'kera', 'PPI'],
    'E1_CUI': ['C3477', 'C9871', 'C2166', 'C3477', 'C9871', 'C2567', 'C9871']
}
df1 = pd.DataFrame(file1_data)
df1

    E1      E1_CUI  STag
0   pacnes  C3477   Title_1
1   PI      C9871   Title_1
2   acne    C2166   Abs_1
3   pacnes  C3477   Abs_3
4   pI      C9871   Abs_3
5   kera    C2567   Abs_4
6   PPI     C9871   Abs_4

file2_data = {
    'STag': ['Title_1', 'Abs_1', 'Abs_3', 'Abs_4'],
    'E1': ['pacnes', 'acne', 'pI', 'kera'],
    'E1_CUI': [0, 0, 0, 0]
}
df2 = pd.DataFrame(file2_data)
df2

    E1      E1_CUI  STag
0   pacnes  0       Title_1
1   acne    0       Abs_1
2   pI      0       Abs_3
3   kera    0       Abs_4

for row,col in df1.iterrows():
    if (df1['STag'] == df2['STag'] and df1['E1'] == df2['E1']):
        df2['E1_CUI'] = df1['E1_CUI']

ERROR

ValueError: Can only compare identically-labeled Series objects

Expected Output DataFrame

    E1      E1_CUI  STag
0   pacnes  C3477   Title_1
1   acne    C2166   Abs_1
2   pI      C9871   Abs_3
3   kera    C2567   Abs_4
  • Do you need `df = df2.merge(df1, on=['STag','E1'], how='left')`? Or `df = df2.drop('E1_CUI', axis=1).merge(df1, on=['STag','E1'], how='left')`? – jezrael Mar 15 '21 at 07:13
  • @jezrael It doesn't matter in my scenario. So, I have used `df = df2.drop('E1_CUI', axis=1).merge(df1, on=['STag','E1'], how='left')`. Thanks for the solution. – Sachin Sinkar Mar 15 '21 at 10:45
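
For reference, the merge suggested in the comments can be sketched as below. The loop in the question fails because `df1['STag'] == df2['STag']` compares two whole Series whose indexes differ (7 rows against 4), which pandas refuses to do. This is only a minimal sketch: it assumes the two frames from the question and that each ('STag', 'E1') pair occurs at most once in df1. The names `result`, `lookup`, and `iterated` are illustrative and do not come from the original post. The second half shows one possible row-by-row variant built on a dictionary lookup, in case iteration is preferred.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({
    'STag': ['Title_1', 'Title_1', 'Abs_1', 'Abs_3', 'Abs_3', 'Abs_4', 'Abs_4'],
    'E1': ['pacnes', 'PI', 'acne', 'pacnes', 'pI', 'kera', 'PPI'],
    'E1_CUI': ['C3477', 'C9871', 'C2166', 'C3477', 'C9871', 'C2567', 'C9871'],
})
df2 = pd.DataFrame({
    'STag': ['Title_1', 'Abs_1', 'Abs_3', 'Abs_4'],
    'E1': ['pacnes', 'acne', 'pI', 'kera'],
    'E1_CUI': [0, 0, 0, 0],
})

# Approach from the comments: drop the placeholder column, then left-merge
# on both key columns so every row of df2 is kept.
result = df2.drop('E1_CUI', axis=1).merge(df1, on=['STag', 'E1'], how='left')

# Illustrative iteration-based variant (an assumption, not from the post):
# build a dict keyed on (STag, E1) and look up each row of df2 in it,
# keeping the old value when there is no match.
lookup = df1.set_index(['STag', 'E1'])['E1_CUI'].to_dict()
iterated = df2.copy()
iterated['E1_CUI'] = [
    lookup.get((stag, e1), old)
    for stag, e1, old in zip(df2['STag'], df2['E1'], df2['E1_CUI'])
]

print(result)
print(iterated)
```

With the left merge, rows of df2 that have no match in df1 end up with NaN in 'E1_CUI', whereas the lookup variant keeps the old value; which behaviour is preferable depends on the data.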
