I have two DataFrames, df1 (with columns C1, C2, etc.) and df2 (with columns S1, S2, etc.).
I want to iterate over the rows of both DataFrames.
Currently I am doing the following:

df3 = pd.DataFrame([])
for index1, row1 in df1.iterrows():
    for index2, row2 in df2.iterrows():
        if row1['C1'] == row2['S1']:
            # Perform some operations on each matching pair of rows, e.g.:
            df3 = df3.append(pd.DataFrame({'A': row2['S1'], 'B': row2['S2'], 'C': functionCall(row1['C3'], row2['S3'])}, index=[0]), ignore_index=True)

This works, but it takes too much time.
Is there a more efficient way of iterating through two DataFrames?

konkun

1 Answer

I think you need to merge first, then apply the function, and finally select a subset of columns with [[]]:

df3 = pd.merge(df1, df2, left_on='C1', right_on='S1')
df3['C'] = df3.apply(lambda x: functionCall(x['C3'], x['S3']), axis=1)
df3 = df3[['S1', 'S2', 'C']].rename(columns={'S1': 'A','S2': 'B'})
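
To make the three steps above concrete, here is a minimal self-contained sketch on toy data. The sample values and the body of functionCall are assumptions made up for illustration (the question does not show the real function); only the merge / apply / column-subset pattern comes from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the question's functionCall; the real one is not shown.
def functionCall(c3, s3):
    return c3 + s3

# Toy data, invented for illustration only.
df1 = pd.DataFrame({'C1': [1, 2, 3], 'C2': ['a', 'b', 'c'], 'C3': [10, 20, 30]})
df2 = pd.DataFrame({'S1': [2, 3, 4], 'S2': ['x', 'y', 'z'], 'S3': [100, 200, 300]})

# The default inner join keeps only row pairs where C1 == S1,
# which replaces the nested iterrows() loops and the equality check.
df3 = pd.merge(df1, df2, left_on='C1', right_on='S1')

# Apply the function row-wise, on the matched pairs only.
df3['C'] = df3.apply(lambda x: functionCall(x['C3'], x['S3']), axis=1)

# Keep the three columns of interest and rename them to A, B, C.
df3 = df3[['S1', 'S2', 'C']].rename(columns={'S1': 'A', 'S2': 'B'})
print(df3)
```

If functionCall can operate on whole columns (as the toy addition here can), calling it directly on df3['C3'] and df3['S3'] instead of using apply is likely to be faster still.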
timodriaan
jezrael