I need to perform the following operation on a pandas dataframe df inside a for loop with 50 or more iterations:

Column 'X' of df has to be merged with column 'X' of df1,
Column 'Y' of df has to be merged with column 'Y' of df2,
Column 'Z' of df has to be merged with column 'Z' of df3,
Column 'W' of df has to be merged with column 'W' of df4.

The columns common to all 5 dataframes (df, df1, df2, df3 and df4) are A, B, C and D.

EDIT

The dataframes all have different shapes: df is the master dataframe with the most rows, and the other 4 dataframes each have fewer rows than df, with row counts that differ from one another. So when merging columns, I need to make sure that the rows of both dataframes are matched first (on the common columns A, B, C and D).
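To illustrate what "matching rows first" means here, this is a minimal sketch for the X column only, using the sample df and df1 shown in this post. It assumes that the combination (A, B, C, D) identifies each row uniquely; the names keys and matched are only for illustration.

```python
import numpy as np
import pandas as pd

keys = ["A", "B", "C", "D"]  # assumption: these four columns identify a row uniquely

df = pd.DataFrame({
    "A": [1, 2, 5, 4], "B": [2, 3, 9, 8], "C": [3, 4, 7, 6], "D": [4, 5, 8, 3],
    "X": np.nan, "Y": np.nan, "Z": np.nan, "W": np.nan,
})
df1 = pd.DataFrame({
    "A": [2, 4], "B": [3, 8], "C": [4, 6], "D": [5, 3],
    "X": [100.0, 200.0], "Y": np.nan, "Z": np.nan, "W": np.nan,
})

# A left merge on the key columns keeps every row of df, in df's order,
# and fills X only where (A, B, C, D) also appears in df1.
matched = df[keys].merge(df1[keys + ["X"]], on=keys, how="left")
print(matched)
```

Rows of df that have no counterpart in df1 keep nan in X, so the row count of df is preserved even though df1 is shorter.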

Input df
A  B  C  D  X    Y    Z    W
1  2  3  4  nan  nan  nan  nan
2  3  4  5  nan  nan  nan  nan
5  9  7  8  nan  nan  nan  nan
4  8  6  3  nan  nan  nan  nan

df1
A  B  C  D  X    Y    Z    W
2  3  4  5  100  nan  nan  nan
4  8  6  3  200  nan  nan  nan

df2
A  B  C  D  X    Y    Z    W
1  2  3  4  nan  50   nan  nan

df3
A  B  C  D  X    Y    Z     W
1  2  3  4  nan  nan  1000  nan
4  8  6  3  nan  nan  2000  nan

df4
A  B  C  D  X    Y    Z    W
2  3  4  5  nan  nan  nan  25
5  9  7  8  nan  nan  nan  35
4  8  6  3  nan  nan  nan  45

Output df
A  B  C  D  X    Y    Z     W
1  2  3  4  nan  50   1000  nan
2  3  4  5  100  nan  nan   25
5  9  7  8  nan  nan  nan   35
4  8  6  3  200  nan  2000  45

What is the most efficient and fastest way to achieve this? I tried using 4 separate combine_first statements, but that doesn't seem to be the most efficient way. Can this be done in a single line of code instead?
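For reference, here is a rough sketch of how the four combine_first calls could be collapsed into one expression with functools.reduce. It is only a sketch under assumptions: (A, B, C, D) is taken to be unique within every dataframe, and the helper make is hypothetical and exists only to rebuild the sample data from this post.

```python
from functools import reduce

import numpy as np
import pandas as pd

keys = ["A", "B", "C", "D"]  # assumption: unique row identifier in every frame


def make(rows, **values):
    """Hypothetical helper: key columns from `rows`, plus X, Y, Z, W (nan by default)."""
    frame = pd.DataFrame(rows, columns=keys)
    for col in ["X", "Y", "Z", "W"]:
        frame[col] = values.get(col, np.nan)
    return frame


df = make([[1, 2, 3, 4], [2, 3, 4, 5], [5, 9, 7, 8], [4, 8, 6, 3]])
df1 = make([[2, 3, 4, 5], [4, 8, 6, 3]], X=[100.0, 200.0])
df2 = make([[1, 2, 3, 4]], Y=[50.0])
df3 = make([[1, 2, 3, 4], [4, 8, 6, 3]], Z=[1000.0, 2000.0])
df4 = make([[2, 3, 4, 5], [5, 9, 7, 8], [4, 8, 6, 3]], W=[25.0, 35.0, 45.0])

# Index every frame by the key columns so combine_first aligns rows by key,
# not by position.
indexed = [frame.set_index(keys) for frame in (df, df1, df2, df3, df4)]

# The single expression: fold combine_first over all frames, df first.
merged = reduce(lambda left, right: left.combine_first(right), indexed)

# combine_first returns the union of the indexes, so restore df's row and
# column order explicitly.
result = merged.reindex(indexed[0].index).reset_index()[list(df.columns)]
print(result)
```

This still performs four combine_first calls internally, so whether it is actually faster than writing them out separately would have to be measured on the real data.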

Any help will be appreciated. Many thanks in advance.

Prachi