This thread suggested using reduce
to merge multiple DataFrames at once:
from functools import reduce

import numpy as np
import pandas as pd

# Four frames that share the column names 'key' and 'value'
df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': np.random.randn(4)})
df3 = pd.DataFrame({'key': ['A', 'C', 'E', 'F'], 'value': np.random.randn(4)})
df4 = pd.DataFrame({'key': ['A', 'B', 'C', 'F'], 'value': np.random.randn(4)})
df_list = [df1, df2, df3, df4]
# Outer-merge the frames pairwise on 'key', left to right
df_merged = reduce(lambda left, right: pd.merge(left, right, on=['key'], how='outer'), df_list)
The merge
runs, but it emits a warning:
<stdin>:1: FutureWarning: Passing 'suffixes' which cause duplicate columns {'value_x'} in the result is deprecated and will raise a MergeError in a future version.
and df_merged
ends up with columns whose names are exact duplicates.
How do I force distinct names for each column?
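For reference, here is a minimal sketch of one possible workaround, not necessarily the best one. The collision seems to come from the default suffixes: the first merge produces value_x and value_y, and a later merge tries to create value_x again. Renaming each frame's value column before merging avoids the overlap entirely. The value_1, value_2, ... naming scheme and the names frames and renamed are my own assumptions for illustration.

```python
from functools import reduce

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
df2 = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': np.random.randn(4)})
df3 = pd.DataFrame({'key': ['A', 'C', 'E', 'F'], 'value': np.random.randn(4)})
df4 = pd.DataFrame({'key': ['A', 'B', 'C', 'F'], 'value': np.random.randn(4)})
frames = [df1, df2, df3, df4]

# Give every 'value' column a unique name up front (hypothetical scheme:
# value_1, value_2, ...), so merge never has to apply its default suffixes.
renamed = [
    df.rename(columns={'value': f'value_{i}'})
    for i, df in enumerate(frames, start=1)
]

# Same pairwise outer merge as before; only 'key' is shared now.
df_merged = reduce(
    lambda left, right: pd.merge(left, right, on='key', how='outer'),
    renamed,
)

print(df_merged.columns.tolist())
# ['key', 'value_1', 'value_2', 'value_3', 'value_4']
```

With this, no FutureWarning should appear, since the only column the frames have in common is the merge key.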